
Proceedings of the 2017 ACM on Conference on Information and Knowledge Management: Latest Papers

Soft Seeded SSL Graphs for Unsupervised Semantic Similarity-based Retrieval
Avikalp Srivastava, Madhav Datt
Semantic-similarity-based retrieval plays an increasingly important role in many IR systems, such as modern web search, question answering, and similar-document retrieval. Improvements in retrieving semantically similar content matter greatly to applications like Quora, Stack Overflow, and Siri. We propose a novel unsupervised model for semantic-similarity-based content retrieval, in which we construct a semantic flow graph for each query and introduce the concept of "soft seeding" in graph-based semi-supervised learning (SSL) to convert it into an unsupervised model. We demonstrate the effectiveness of our model on an equivalent-question retrieval problem on the Stack Exchange QA dataset, where our unsupervised approach significantly outperforms the state-of-the-art unsupervised models and produces results comparable to the best supervised models. Our research provides a method for tackling semantic-similarity-based retrieval without any training data, and it extends seamlessly to QA communities in different domains as well as to other semantic equivalence tasks.
DOI: https://doi.org/10.1145/3132847.3133162. Published 2017-11-06.
Citations: 3
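The abstract of "Soft Seeded SSL Graphs" does not say how the semantic flow graphs are built or how soft seeding is defined, so the following is only a minimal sketch of the general idea of graph-based score propagation with graded (soft) seeds. The graph, the edge weights, the seed values, and the function name `propagate` are all hypothetical, and the update rule is a plain random walk with restart, not necessarily the authors' method.

```python
def propagate(weights, seed_scores, alpha=0.8, iterations=100):
    """Spread soft seed scores over a weighted graph (random walk with restart).

    weights: dict node -> dict neighbour -> edge weight (assumed similarity)
    seed_scores: dict node -> graded prior in [0, 1]; unseeded nodes start at 0
    alpha: share of the score that flows along edges at each step
    """
    nodes = list(weights)
    out_total = {n: sum(weights[n].values()) for n in nodes}
    scores = {n: seed_scores.get(n, 0.0) for n in nodes}
    for _ in range(iterations):
        # Every node keeps a (1 - alpha) share of its own soft seed ...
        updated = {n: (1 - alpha) * seed_scores.get(n, 0.0) for n in nodes}
        # ... and passes an alpha share of its score to its neighbours,
        # split in proportion to the edge weights.
        for n in nodes:
            if out_total[n] == 0:
                continue
            for m, w in weights[n].items():
                updated[m] += alpha * scores[n] * w / out_total[n]
        scores = updated
    return scores


# Hypothetical query graph: q1 is strongly tied to the query, q2 only weakly.
graph = {
    "query": {"q1": 0.9, "q2": 0.1},
    "q1": {"query": 0.9, "q3": 0.8},
    "q2": {"query": 0.1},
    "q3": {"q1": 0.8},
}
# Soft seeds: graded confidences rather than hard labels.
seeds = {"query": 1.0, "q2": 0.3}

scores = propagate(graph, seeds)
ranking = sorted((n for n in graph if n != "query"), key=scores.get, reverse=True)
print(ranking)
```

In this toy graph the weak soft seed on q2 is outweighed by the strong path from the query through q1, which is the kind of behaviour a graded seed is meant to allow.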
Interactive Social Recommendation
Xin Wang, S. Hoi, Chenghao Liu, M. Ester
Social recommendation has been an active research topic over the last decade, based on the assumption that social information from friendship networks helps improve recommendation accuracy, especially for cold-start users who lack sufficient past behavior information for accurate recommendation. However, using such information is nontrivial, since some of a person's friends may share similar preferences in certain aspects while others may be totally irrelevant for recommendations. Thus one challenge is to explore and exploit the extent to which a user trusts his/her friends when utilizing social information to improve recommendations. On the other hand, most existing social recommendation models are non-interactive in that their algorithmic strategies are based on batch learning, which trains the model offline from a collection of training data accumulated from users' historical interactions with the recommender system. In the real world, new users may leave the system because they are recommended boring items before enough data has been collected to train a good model, which results in poor customer retention. To tackle these challenges, we propose a novel method for interactive social recommendation, which not only simultaneously explores user preferences and exploits the effectiveness of personalization in an interactive way, but also adaptively learns different weights for different friends. In addition, we analyze the complexity and regret of the proposed model. Extensive experiments on three real-world datasets illustrate the improvement of our proposed method over state-of-the-art algorithms.
DOI: https://doi.org/10.1145/3132847.3132880. Published 2017-11-06.
Citations: 27
An Enhanced Topic Modeling Approach to Multiple Stance Identification
Junjie Lin, W. Mao, Yuhao Zhang
People often publish online texts to express their stances, which reflect the essential viewpoints they stand for. Stance identification has been an important research topic in text analysis and facilitates many applications in business, public security and government decision making. Previous work on stance identification focuses solely on classifying a supportive or unsupportive attitude towards a certain topic/entity. The other important type of stance identification, multiple stance identification, has been largely ignored in previous research. In contrast, multiple stance identification focuses on identifying the different standpoints of the multiple parties involved in online texts. In this paper, we address the problem of recognizing distinct standpoints implied in textual data. As people are inclined to discuss the topics favorable to their standpoints, topics can provide information that distinguishes different standpoints. We propose a topic-based method for standpoint identification. To acquire more distinguishable topics, we further enhance the topic model by adding constraints on document-topic distributions. Finally, we conduct experimental studies on two real datasets to verify the effectiveness of our approach to multiple stance identification.
DOI: https://doi.org/10.1145/3132847.3133145. Published 2017-11-06.
Citations: 5
Understanding Database Performance Inefficiencies in Real-world Web Applications
Cong Yan, Alvin Cheung, Junwen Yang, Shan Lu
Many modern database-backed web applications are built upon Object Relational Mapping (ORM) frameworks. While such frameworks ease application development by abstracting persistent data as objects, this convenience comes with a performance cost. In this paper, we studied 27 real-world open-source applications built on top of the popular Ruby on Rails ORM framework, with the goal of understanding the database-related performance inefficiencies in these applications. We discovered a number of inefficiencies ranging from physical design issues to how queries are expressed in the application code. We applied static program analysis to identify and measure how prevalent these issues are, then suggested techniques to alleviate them and measured the potential performance gain. These techniques significantly reduce database query time (by up to 91%) and webpage response time (by up to 98%). Our study provides guidance for the design of future database engines and ORM frameworks that support database applications which are performant without sacrificing programmability.
DOI: https://doi.org/10.1145/3132847.3132954. Published 2017-11-06.
Citations: 39
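The abstract of "Understanding Database Performance Inefficiencies" mentions inefficiencies in "how queries are expressed in the application code" without listing them. One pattern commonly discussed for ORM-backed applications is issuing one query per parent row instead of a single join. The sketch below illustrates that pattern only as an assumed example; the schema, the helper `run`, and the function names are hypothetical, and it uses Python with SQLite rather than Ruby on Rails.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO posts VALUES (1, 1, 'a'), (2, 1, 'b'), (3, 2, 'c');
""")

executed = []  # every SQL string sent through run(), for counting round trips


def run(sql, params=()):
    """Execute one statement and record it."""
    executed.append(sql)
    return conn.execute(sql, params).fetchall()


def titles_per_author_lazy():
    """One query for the authors, then one more query per author."""
    result = {}
    for author_id, name in run("SELECT id, name FROM authors"):
        rows = run("SELECT title FROM posts WHERE author_id = ?", (author_id,))
        result[name] = sorted(title for (title,) in rows)
    return result


def titles_per_author_joined():
    """The same data fetched with a single join."""
    result = {}
    sql = "SELECT a.name, p.title FROM authors a JOIN posts p ON p.author_id = a.id"
    for name, title in run(sql):
        result.setdefault(name, []).append(title)
    return {name: sorted(titles) for name, titles in result.items()}


lazy = titles_per_author_lazy()
lazy_queries = len(executed)
executed.clear()
joined = titles_per_author_joined()
joined_queries = len(executed)
print(lazy_queries, joined_queries, lazy == joined)
```

Both functions return the same mapping, but the first issues a number of statements that grows with the number of authors, while the second always issues one.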
Broad Learning based Multi-Source Collaborative Recommendation
Junxing Zhu, Jiawei Zhang, Lifang He, Quanyuan Wu, Bin Zhou, Chenwei Zhang, Philip S. Yu
Anchor links connect information entities, such as movies or products, across networks from different sources, so information in these networks can be transferred directly via anchor links. Anchor links therefore have great value for many cross-network applications, such as cross-network social link prediction and cross-network recommendation. In this paper, we focus on the recommendation problem of predicting ratings of items or services. To address the problem, we propose a Cross-network Collaborative Matrix Factorization (CCMF) recommendation framework based on the broad learning setting, which can effectively integrate multi-source information and alleviate the sparse information problem in each individual network. Based on item anchor links, CCMF can fuse item similarity information and item latent information across networks from different sources. Unlike most traditional work, CCMF can make multi-source recommendation tasks collaborate via information transfer based on the broad learning setting. During the transfer process, a novel cross-network similarity transfer method is applied to keep item similarities consistent between two different networks, and a domain adaptation matrix is used to overcome the domain difference problem. We conduct experiments to compare the proposed CCMF method with both classic and state-of-the-art recommendation techniques. The experimental results show that CCMF outperforms other methods in different experimental circumstances and has clear advantages in dealing with different data sparsity problems.
DOI: https://doi.org/10.1145/3132847.3132976. Published 2017-11-06.
Citations: 20
Extracting Records from the Web Using a Signal Processing Approach
R. P. Velloso, C. Dorneles
Extracting records from web pages enables a number of important applications and has immense value due to the amount and diversity of available information that can be extracted. This problem, although widely studied, remains open because it is not a trivial one. Due to the scale of the data, a feasible approach must be both automatic and efficient (and, of course, effective). We present a novel approach, fully automatic and computationally efficient, that uses signal processing techniques to detect regularities and patterns in the structure of web pages. Our approach segments the web page, detects the data regions within it, identifies the record boundaries and aligns the records. Results show a high F-score and linearithmic time complexity.
DOI: https://doi.org/10.1145/3132847.3132875. Published 2017-11-06.
Citations: 7
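The abstract of "Extracting Records from the Web Using a Signal Processing Approach" does not name the specific signal processing techniques it uses. As one assumed illustration of how regularity in page structure can be treated as a signal, the sketch below encodes a tag sequence as integers and picks the lag with the highest autocorrelation as the record length. The tag sequence, the encoding, and the function names are hypothetical. This naive version is quadratic in the worst case; an FFT-based autocorrelation is one standard way to reach O(n log n), though whether the paper does so is not stated in the abstract.

```python
def autocorrelation(signal, lag):
    """Normalized autocorrelation of a numeric sequence at a given lag."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    variance = sum(c * c for c in centered)
    if variance == 0:
        return 0.0
    return sum(centered[i] * centered[i + lag] for i in range(n - lag)) / variance


def dominant_period(signal, max_lag=None):
    """Return the lag (in elements) with the strongest self-similarity."""
    max_lag = max_lag or len(signal) // 2
    return max(range(1, max_lag + 1), key=lambda lag: autocorrelation(signal, lag))


# Hypothetical data region: six records, each rendered with the same four tags.
tags = ["div", "h2", "span", "a"] * 6
# Map each distinct tag to an integer code in order of first appearance.
codes = {tag: i for i, tag in enumerate(dict.fromkeys(tags))}
signal = [codes[tag] for tag in tags]

period = dominant_period(signal)
# Cut the sequence into records of the detected length.
records = [tags[i:i + period] for i in range(0, len(tags), period)]
print(period, len(records))
```

Real pages are noisier than this toy sequence, so a practical detector would need to tolerate records of varying length.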
Revealing the Hidden Links in Content Networks: An Application to Event Discovery
Antonia Saravanou, I. Katakis, G. Valkanas, V. Kalogeraki, D. Gunopulos
Social networks have become the de facto online resource for people to share, comment on and be informed about events pertinent to their interests and livelihood, ranging from road traffic or an illness to concerts and earthquakes, to economics and politics. This has been the driving force behind research endeavors that analyse such data. In this paper, we focus on how Content Networks can help us identify events effectively. Content Networks incorporate both structural and content-related information of a social network in a unified way, bringing together two disparate lines of research: graph-based and content-based event discovery in social media. We model the interactions of two types of nodes, users and content, and introduce an algorithm that builds heterogeneous, dynamic graphs and reveals content links in the network's structure. By linking similar content nodes and tracking connected components over time, we can effectively identify different types of events. Our evaluation on social media streaming data suggests that our approach outperforms state-of-the-art techniques, while showcasing the significance of hidden links to the quality of the results.
DOI: https://doi.org/10.1145/3132847.3133148. Published 2017-11-06.
Citations: 2
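The step of "linking similar content nodes and tracking connected components" in the "Revealing the Hidden Links" abstract can be sketched for a single time window as follows. The similarity measure (Jaccard over lowercase tokens), the threshold, the sample posts, and the function names are assumptions made for illustration; the abstract does not specify them, and tracking components across windows is left out.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def content_components(posts, threshold=0.5):
    """Link posts whose token overlap reaches the threshold and
    return the connected components as sets of post ids."""
    parent = {pid: pid for pid in posts}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tokens = {pid: set(text.lower().split()) for pid, text in posts.items()}
    for a, b in combinations(posts, 2):
        if jaccard(tokens[a], tokens[b]) >= threshold:
            parent[find(a)] = find(b)  # add a content link

    groups = {}
    for pid in posts:
        groups.setdefault(find(pid), set()).add(pid)
    return sorted(groups.values(), key=sorted)


# Hypothetical posts from one time window.
posts = {
    "p1": "earthquake hits city centre",
    "p2": "earthquake hits city tonight",
    "p3": "concert tickets on sale",
    "p4": "concert tickets on sale now",
}
components = content_components(posts)
print(components)
```

Each component is a candidate event in this simplified view; comparing components between consecutive windows would be the next step.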
Knowledge-based Question Answering by Jointly Generating, Copying and Paraphrasing
Shuguang Zhu, Xiang Cheng, Sen Su, S. Lang
With the development of large-scale knowledge bases, people are building systems that give simple answers to questions based on consolidated facts. In this paper, we focus on simple questions, which ask about only one subject and one relation in the knowledge base. Observing that certain parts of a question usually overlap with the names of its corresponding subject and relation in the knowledge base, we argue that a question is formed by a mixture of copying and generation. To model this, we propose a sequence-to-sequence (seq2seq) architecture that encodes a candidate subject-relation pair and decodes it into the given question, where the decoding probability is used to select the best candidate. In our decoder, the copying mode points to the subject or relation and duplicates its name, while the generating mode summarizes the meaning of the subject-relation pair and produces a word to smooth the question. Since different names or keywords may be used even when a subject or relation is pointed to, we also incorporate a paraphrasing mode that supplements the copying mode using an automatically mined lexicon. Extensive experiments on the largest dataset show that our method outperforms the state-of-the-art methods.
DOI: https://doi.org/10.1145/3132847.3133064. Published 2017-11-06.
Citations: 8
TaCLe: Learning Constraints in Tabular Data
Sergey Paramonov, Samuel Kolb, Tias Guns, L. D. Raedt
Spreadsheet data is widely used today by many different people and across industries. However, writing, maintaining and identifying good formulae for spreadsheets can be time-consuming and error-prone. To address this issue we have introduced the TaCLe system (Tabular Constraint Learner). The system tackles an inverse learning problem: given a plain comma-separated file, it reconstructs the spreadsheet formulae that hold in the tables. Two important considerations are the number of cells and constraints to check, and how to deal with multiple formulae for the same cell. Our system reasons over entire rows and columns and has an intuitive user interface for interacting with the learned constraints and data. It can be seen as an intelligent assistance tool for discovering formulae from data. As a result, the user obtains a spreadsheet that can automatically recompute dependent cells when data is updated or added.
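To make the inverse learning problem in the TaCLe abstract concrete, here is a toy sketch in Python. It is not the TaCLe algorithm: it only brute-forces one kind of constraint (a column equal to the row-wise sum of other columns) over whole columns. The function names `load_columns` and `find_row_sums`, the tolerance `tol`, and the sample data are all assumptions made for illustration.

```python
import csv
import io
from itertools import combinations


def load_columns(text):
    """Parse CSV text with a header row into {column_name: [float, ...]}."""
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]
    return {name: [float(r[i]) for r in body] for i, name in enumerate(header)}


def find_row_sums(columns, tol=1e-9):
    """Return (target, operands) pairs where target is the row-wise sum of operands.

    Hypothetical brute-force search: every column is tried as a target
    against every combination of two or more of the remaining columns.
    """
    found = []
    names = list(columns)
    for target in names:
        others = [n for n in names if n != target]
        for k in range(2, len(others) + 1):
            for operands in combinations(others, k):
                sums = [sum(vals) for vals in zip(*(columns[n] for n in operands))]
                if all(abs(s - t) <= tol for s, t in zip(sums, columns[target])):
                    found.append((target, operands))
    return found


# Illustrative data only: "total" was produced by a formula that is no longer visible.
SAMPLE = """q1,q2,q3,total
1,2,3,6
4,5,6,15
7,8,10,25
"""

constraints = find_row_sums(load_columns(SAMPLE))
for target, operands in constraints:
    print(f"{target} = SUM({', '.join(operands)})")
```

The search grows exponentially with the number of columns, which illustrates why the abstract names the number of cells and constraints to check as a key consideration.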
DOI: https://doi.org/10.1145/3132847.3133193 (published 2017-11-06)
Citations: 8
When Deep Learning Meets Transfer Learning
Qiang Yang
Deep learning has achieved great success, as evidenced by many practical applications and contests. However, deep learning as developed so far also has some inherent limitations. In particular, it is not yet very adaptable across different related domains and cannot handle small data. In this talk, I will give an overview of how transfer learning can help alleviate these problems. In particular, I will present some recent progress on integrating deep learning with transfer learning and show some interesting applications in sentiment analysis, image processing and urban computing.
DOI: https://doi.org/10.1145/3132847.3137175 (published 2017-11-06)
Citations: 8