
2016 IEEE/WIC/ACM International Conference on Web Intelligence (WI): Latest Papers

Multi-criterion Real Time Tweet Summarization Based upon Adaptive Threshold
Pub Date : 2016-10-01 DOI: 10.1109/WI.2016.0045
Abdelhamid Chellal, M. Boughanem, B. Dousset
Real-time summarization in microblogs aims to provide new, relevant, and non-redundant information about an event as soon as it occurs. In this paper, we introduce a new tweet summarization approach in which the decision to select an incoming tweet is made immediately when the tweet becomes available. Unlike existing approaches, where thresholds are predefined, the proposed method estimates the decision thresholds in real time as soon as a new tweet arrives. Tweet selection is based on three criteria, namely informativeness, novelty, and relevance with regard to the user's interest, which are combined as a conjunctive condition. Only tweets with informativeness and novelty scores above a parameter-free threshold are added to the summary. We evaluated our approach on the TREC MB RTF 2015 data set and compared it with well-known baselines. The results show that our approach produces the most precise summaries among all baselines and official runs of the TREC MB RTF 2015 task.
Pages: 264-271
Citations: 17
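The selection rule described in the tweet summarization abstract above (a conjunctive test against thresholds estimated as tweets arrive) can be illustrated with a minimal Python sketch. The abstract does not say how the thresholds are estimated, so the running mean used here is purely an assumption, and `RunningMean`, `Summarizer`, `offer`, and the score inputs are hypothetical names, not the authors' method.

```python
from dataclasses import dataclass, field


@dataclass
class RunningMean:
    """Running mean used as a stand-in adaptive threshold (an assumption)."""
    count: int = 0
    total: float = 0.0

    def value(self) -> float:
        # Before any tweet has been seen, the threshold defaults to 0.
        return self.total / self.count if self.count else 0.0

    def update(self, x: float) -> None:
        self.count += 1
        self.total += x


@dataclass
class Summarizer:
    informativeness: RunningMean = field(default_factory=RunningMean)
    novelty: RunningMean = field(default_factory=RunningMean)
    summary: list = field(default_factory=list)

    def offer(self, tweet: str, relevant: bool, info: float, nov: float) -> bool:
        """Decide immediately whether an incoming tweet joins the summary."""
        # Conjunctive condition: all three criteria must hold. The thresholds
        # are estimated only from tweets that arrived before this one.
        keep = (
            relevant
            and info > self.informativeness.value()
            and nov > self.novelty.value()
        )
        # Every tweet updates the thresholds, whether or not it was kept.
        self.informativeness.update(info)
        self.novelty.update(nov)
        if keep:
            self.summary.append(tweet)
        return keep
```

In this sketch a rejected tweet still moves the thresholds, which is one possible design choice; a variant could update them only with relevant tweets.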
Towards Accurate Relation Extraction from Wikipedia
Pub Date : 2016-10-01 DOI: 10.1109/WI.2016.0023
Yulong Gu, Jiaxing Song, Weidong Liu, Y. Yao, Lixin Zou
The enormous efforts of human volunteers have made Wikipedia a treasure trove of textual knowledge. Relation extraction, which aims to extract structured knowledge from the unstructured text in Wikipedia, is an appealing but quite challenging problem because it is hard for machines to understand plain text. Existing methods are not effective enough because they interpret relation types at the textual level without exploiting the knowledge behind the plain text. In this paper, we propose a novel framework called Athena 2.0 that solves this problem by leveraging Semantic Patterns, which are patterns that capture relation types at the semantic level. Extensive experiments show that Athena 2.0 significantly outperforms existing methods.
Pages: 89-96
Citations: 1
A Split Smart Swap Clustering for Clutter Problem in Web Mapping System
Pub Date : 2016-10-01 DOI: 10.1109/WI.2016.0070
Qinpei Zhao, Zhenyu A. Liao, Jiangfeng Li, Yang Shi, Qirong Tang
The development of location-based applications raises a new challenge: managing and visualizing the large numbers of geo-tags presented on a web map. Visualizing the geo-tags often leads to a clutter problem, especially in web-mapping systems. We present a new clustering method to reduce the amount of visual clutter. The method employs a split smart swap strategy, which has the advantage that it needs to be applied to a given data set only once for all map scales. We compare the proposed method with several other methods. Because it runs only once, offline, the proposed method is better suited to the clutter problem.
Pages: 439-443
Citations: 1
Knowledge-Driven Approach to Predict Personality Traits by Leveraging Social Media Data
Pub Date : 2016-10-01 DOI: 10.1109/WI.2016.0048
M. Thilakaratne, R. Weerasinghe, Sujan Perera
The day-to-day behavior of individuals reveals their personality traits. With the emergence of social media platforms, some aspects of this behavior are being recorded in their online profiles. This provides the input needed to develop algorithms that can predict the personality traits of individuals. However, these algorithms need to exploit the semantics of the data in order to reveal the personality traits. Current studies on this topic have mainly exploited the syntactic features of the language used by individuals to predict their personality traits. In this work, we demonstrate the value of exploiting the semantics of the messages conveyed in social media posts for predicting personality traits. In other words, we present a study that attempts to simulate the cognitive ability of the human brain to identify the important implicit information in social media posts in order to understand the personality traits of an individual. Our approach shows the value of publicly available knowledge bases in eliciting implicit information from user-generated content and their impact on predicting the personality traits of an individual. We evaluated our approach using the well-known 'myPersonality' dataset and showed that it outperforms the state-of-the-art algorithms that mainly depend on syntactic features.
Pages: 288-295
Citations: 10
Multi-agent Simulation Framework for Large-Scale Coalition Formation
Pub Date : 2016-10-01 DOI: 10.1109/WI.2016.0055
Pavel Janovsky, S. DeLoach
Coalition formation, a key factor in multi-agent cooperation, can be solved optimally for at most a few dozen agents. This paper proposes a general approach to find suboptimal solutions for a large-scale coalition formation problem containing thousands of agents using multi-agent simulation. We model coalition formation as an iterative process in which agents join and leave coalitions, and we propose several valuation functions that assign values to the coalitions. We propose several coalition selection strategies that agents may use to decide whether or not to leave their current coalition and which coalition to join. We also show how these valuation functions and coalition selection strategies represent specific coalition formation applications. Finally, we show almost-optimal performance of our algorithms in small-scale scenarios by comparing our solutions with an optimal solution, and we show stable performance in a large-scale setting in which searching for the optimal solution is not feasible.
Pages: 343-350
Citations: 15
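The iterative join-and-leave process described in the coalition formation abstract above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the per-agent `payoff` function (best at coalition size 3) and the greedy "move only if strictly better" selection strategy are hypothetical choices, not the valuation functions or strategies proposed in the paper.

```python
def payoff(size: int) -> float:
    """Hypothetical per-agent payoff: grows up to size 3, then declines."""
    return float(size) if size <= 3 else float(6 - size)


def simulate(n_agents: int, max_rounds: int = 100) -> list[set[int]]:
    """Greedy simulation: agents leave and join coalitions until stable."""
    # Coalition i initially holds only agent i (singleton start).
    membership = list(range(n_agents))
    sizes = [1] * n_agents
    for _ in range(max_rounds):
        moved = False
        for agent in range(n_agents):
            current = membership[agent]
            best, best_pay = current, payoff(sizes[current])
            # Evaluate joining every other coalition (empty ones included).
            for target in range(n_agents):
                if target == current:
                    continue
                pay = payoff(sizes[target] + 1)
                if pay > best_pay:  # move only for a strict improvement
                    best, best_pay = target, pay
            if best != current:
                sizes[current] -= 1
                sizes[best] += 1
                membership[agent] = best
                moved = True
        if not moved:
            break  # no agent wants to move: a stable partition
    coalitions: dict[int, set[int]] = {}
    for agent, coalition_id in enumerate(membership):
        coalitions.setdefault(coalition_id, set()).add(agent)
    return list(coalitions.values())
```

With six agents and this payoff, the simulation settles into two coalitions of three agents each. Each round costs O(n²) payoff evaluations, so the sketch says nothing about how the paper scales to thousands of agents.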
Connection Minimization in REST API with Random Walks
Pub Date : 2016-10-01 DOI: 10.1109/WI.2016.0059
Li Li, Min Luo
A key constraint of a REST API is that all resources must be reachable through hyperlink paths from an entry point. However, applying this constraint carelessly can produce excessive hyperlinks that provide no new services but increase the dependence between resources. Excessive hyperlinks are difficult to identify because: 1) a REST API can have dynamic and unbounded paths, and 2) the hyperlinks used to navigate a path are not observable and can be ambiguous. To tackle the first challenge, we propose a REST API model and a random walk algorithm that reduce the paths of a REST API to a small set. To address the second challenge, we develop a client model and a connection minimization algorithm that identify excessive hyperlinks based on given paths. By combining the random walk and connection minimization algorithms, our method can minimize the connections of a REST API in polynomial time without involving actual clients. A prototype system has been implemented, and the tests show that the method is correct and can converge 90.6% to 99.9% faster than the baseline approach.
Pages: 375-382
Citations: 0
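As a rough illustration of the two ideas in the REST API abstract above, the Python sketch below samples bounded random walks from an entry point over a hyperlink graph and then flags links that no sampled path traverses. It is not the paper's REST API model, client model, or minimization algorithm; the graph representation, the function names `sample_paths` and `unused_links`, and the walk parameters are all assumptions made for this example.

```python
import random


def sample_paths(links, entry, n_walks=50, max_len=5, seed=0):
    """Sample walks from `entry`; a walk stops at a dead end or at max_len."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    paths = set()
    for _ in range(n_walks):
        node, path = entry, [entry]
        while len(path) < max_len and links.get(node):
            node = rng.choice(links[node])  # follow one outgoing hyperlink
            path.append(node)
        paths.add(tuple(path))  # the set keeps each distinct path once
    return paths


def unused_links(links, paths):
    """Links that no sampled path traverses (candidates only, not proof)."""
    all_links = {(src, dst) for src, dsts in links.items() for dst in dsts}
    used = {(p[i], p[i + 1]) for p in paths for i in range(len(p) - 1)}
    return all_links - used
```

For a graph such as `{"a": ["b"], "b": [], "c": ["a"]}` with entry `"a"`, the only sampled path is `("a", "b")` and the link from `"c"` to `"a"` is flagged, because no walk from the entry point can use it. On larger graphs a flagged link may simply not have been sampled yet, so the result depends on `n_walks` and `max_len`.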
Data Preprocessing for Web Combinatorial Problems
Pub Date : 2016-10-01 DOI: 10.1109/WI.2016.0067
H. Drias, Samir Kechid, Sofia Adamou, Farouk Benyoucef
In the field of data science, we usually consider data independently of the problem to be solved. The originality of this paper lies in handling huge instances of combinatorial problems with data mining technologies in order to reduce the complexity of their treatment. Such a task can be performed on Web combinatorial optimization problems such as internet data packet routing and web clustering. We focus in particular on the satisfiability of Boolean formulae, but the proposed idea could be adopted for any other complex problem. The aim is to explore the satisfiability instance using data mining techniques in order to reduce its size before solving it. An estimated solution for the reduced instance is then computed using a hybrid algorithm based on the DPLL technique and a genetic algorithm. It is then compared with the solution of the initial instance in order to validate the effectiveness of the method. We performed experiments on the well-known BMC datasets and show the benefits of using data mining techniques as a pretreatment before solving the problem.
Pages: 425-428
Citations: 0
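The satisfiability abstract above mentions the DPLL technique without describing it. For readers unfamiliar with it, here is a minimal textbook-style DPLL sketch in Python with unit propagation. It does not include the paper's data mining preprocessing or its genetic algorithm component; the clause encoding (a list of frozensets of signed integers, where `-3` means "variable 3 is false") and the function names are assumptions made for this example.

```python
def simplify(clauses, literal):
    """Assign `literal` true: drop satisfied clauses, shrink the others."""
    result = []
    for clause in clauses:
        if literal in clause:
            continue  # clause already satisfied
        reduced = clause - {-literal}
        if not reduced:
            return None  # empty clause: this assignment is a conflict
        result.append(reduced)
    return result


def dpll(clauses, assignment=()):
    """Return a tuple of true literals satisfying `clauses`, or None."""
    if clauses is None:
        return None  # a conflict was reached on this branch
    if not clauses:
        return assignment  # every clause is satisfied
    # Unit propagation: a one-literal clause forces that literal.
    for clause in clauses:
        if len(clause) == 1:
            (lit,) = clause
            return dpll(simplify(clauses, lit), assignment + (lit,))
    # Otherwise branch on some literal of the first clause.
    lit = next(iter(clauses[0]))
    found = dpll(simplify(clauses, lit), assignment + (lit,))
    if found is not None:
        return found
    return dpll(simplify(clauses, -lit), assignment + (-lit,))
```

For example, the formula (x1 OR x2) AND (NOT x1) AND (NOT x2 OR x3) is written `[frozenset({1, 2}), frozenset({-1}), frozenset({-2, 3})]`; unit propagation alone forces x1 false, then x2 true, then x3 true. Each call to `simplify` also shrinks the instance, which is the same general goal (a smaller formula before search) that the paper pursues by other means.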
An Interactive Circular Visual Analytic Tool for Visualization of Web Data
Pub Date : 2016-10-01 DOI: 10.1109/WI.2016.0127
P. Dubois, Zhao Han, Fan Jiang, C. Leung
Visual analytics on frequent web usage patterns aims to help users (i) analyze the data so as to discover implicit, previously unknown, and potentially useful information in the form of collections of web pages frequently visited in a single session and (ii) visually represent the discovered knowledge so as to gain insight into the data. In this paper, we propose an interactive visual analytics tool (iVAT) for frequent pattern mining. It uses an orientation-free, circular layout to show frequent patterns. Moreover, we provide users with an interactive feature that explicitly shows connections between supersets and subsets of sets of visited web pages. Experimental results show the effectiveness of our iVAT for visual analytics of frequent patterns in web data.
Pages: 709-712
Citations: 17
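The data that a tool like the one in the iVAT abstract above would display (frequent sets of pages visited in a single session, plus subset-to-superset connections) can be computed for small inputs with a brute-force Python sketch. This is an assumption-laden illustration, not the mining algorithm or layout used by iVAT: `frequent_page_sets`, `subset_links`, and `min_support` are hypothetical names, and the enumeration is exponential in session length.

```python
from collections import Counter
from itertools import combinations


def frequent_page_sets(sessions, min_support):
    """Count every page set per session; keep those seen often enough."""
    counts = Counter()
    for pages in sessions:
        unique = sorted(set(pages))  # ignore repeat visits within a session
        for k in range(1, len(unique) + 1):
            counts.update(combinations(unique, k))
    return {frozenset(p): c for p, c in counts.items() if c >= min_support}


def subset_links(patterns):
    """Pairs (subset, superset) of frequent sets differing by one page."""
    return {
        (a, b)
        for a in patterns
        for b in patterns
        if len(b) == len(a) + 1 and a < b
    }
```

With sessions `[["a", "b", "c"], ["a", "b"], ["a", "c"], ["b"]]` and `min_support=2`, the frequent sets are {a}, {b}, {c}, {a, b}, and {a, c}, joined by four subset-to-superset links. For real logs an Apriori-style or FP-growth miner would be needed.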
Supporting News Article Understanding by Detecting Subject-Background Event Relations
Pub Date : 2016-10-01 DOI: 10.1109/WI.2016.0044
Shotaro Tanaka, A. Jatowt, Katsumi Tanaka
Typically, news articles mention not just one but multiple events. These events can be classified into subject or background events. The former are events that the article is written about, while the latter are additional events referred to in order to explain the background of the subject events (e.g., causal relations, circumstances or the consequences of the main event). Background events are considered to play an important role in helping to understand articles. In this paper, we first propose to classify content of news articles into subject or background event descriptions. In the second part of the paper, we demonstrate a novel solution for improving the news article search. Based on the subject and background relationship structure between events and articles, our method outputs news articles that help with understanding of a given target article.
Pages: 256-263
Citations: 2
Discovering Coherent Topics with Entity Topic Models
Pub Date : 2016-10-01 DOI: 10.1109/WI.2016.0015
M. Allahyari, K. Kochut
Probabilistic topic models are powerful techniques that are widely used for discovering topics or semantic content in large collections of documents. However, because topic models are entirely unsupervised, they may produce topics that are not understandable in applications. Recently, several knowledge-based topic models have been proposed; these primarily use word-level domain knowledge to enhance topic coherence and ignore the rich information carried by the entities (e.g., persons, locations, organizations) associated with the documents. Additionally, there exists a vast amount of prior knowledge (background knowledge) represented as ontologies and Linked Open Data (LOD), which can be incorporated into topic models to produce coherent topics. In this paper, we introduce a novel entity-based topic model, called EntLDA, that effectively integrates an ontology with an entity topic model to improve the topic modeling process. Furthermore, to increase the coherence of the identified topics, we introduce a novel ontology-based regularization framework, which is then integrated with the EntLDA model. Our experimental results demonstrate the effectiveness of the proposed model in improving the coherence of the topics.
Pages: 26-33
Citations: 20