2010 Sixth International Conference on Semantics, Knowledge and Grids: Latest Papers

Weaving the Semantic Link Network of Events
Pub Date : 2010-11-01 DOI: 10.1109/SKG.2010.40
Junsheng Zhang, Yunchuan Sun
Event processing is a significant evolution in the field of information technology. The semantic link network (SLN) is a semantic data model for managing resources and their semantic relations. This paper proposes a two-layered SLN model for event processing, consisting of a matter level and an event level. The event SLN aims to record and manage the evolving process of the matter level. We propose a domain-independent schema for the event SLN consisting of a set of primary link types and a set of reasoning rules. The model is useful for data encapsulation, knowledge retrieval, knowledge flow discovery, and intelligent applications for the Internet of Things.
Citations: 4
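To make the idea of a schema built from link types and reasoning rules more concrete, here is a minimal sketch. It is not the authors' schema: the link types `causes` and `precedes`, the rule table `RULES`, and the function `infer_links` are hypothetical names chosen for illustration, since the abstract does not list the actual link types or rules.

```python
# Hypothetical reasoning rules: composing a link of type t1 (a -> b) with a
# link of type t2 (b -> c) yields a link of the mapped type (a -> c).
# These rules are assumptions for illustration, not taken from the paper.
RULES = {
    ("causes", "causes"): "causes",
    ("precedes", "precedes"): "precedes",
    ("causes", "precedes"): "precedes",
}


def infer_links(links, rules=RULES):
    """Return all (source, link_type, target) triples derivable from `links`."""
    known = set(links)
    changed = True
    while changed:  # repeat until no rule produces a new link
        changed = False
        for (a, t1, b) in list(known):
            for (c, t2, d) in list(known):
                if b == c and (t1, t2) in rules:
                    new_link = (a, rules[(t1, t2)], d)
                    if new_link not in known:
                        known.add(new_link)
                        changed = True
    return known


example_links = {
    ("e1", "causes", "e2"),
    ("e2", "causes", "e3"),
    ("e3", "precedes", "e4"),
}
derived = infer_links(example_links) - example_links
```

Here `derived` holds only the links added by rule application, which is one simple way to separate recorded event links from inferred ones.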
Reconsidering Nature of Thesaurus and Its Automatic Construction in Information Network
Pub Date : 2010-11-01 DOI: 10.1109/SKG.2010.73
Wen Zeng, Huilin Wang, Junsheng Zhang
Modern information networks such as digital libraries contain more data than ever before. These data are globally distributed and easily accessible to huge, heterogeneous user groups. On the other hand, the enormous amount of information requires powerful tools to help users find relevant data. One such tool is the thesaurus. The thesaurus, as an ontology, plays an increasingly important role in knowledge management and the Semantic Web. The paper reconsiders the nature of the thesaurus from the viewpoint of ontology, proposes a system framework for thesaurus construction, and describes the approach and ideas behind that construction.
Citations: 3
A Holistic Solution for Duplicate Entity Identification in Deep Web Data Integration
Pub Date : 2010-11-01 DOI: 10.1109/SKG.2010.38
W. Liu, Xiaofeng Meng
The proliferation of the deep Web offers users a great opportunity to search for high-quality information on the Web. As a necessary step in deep Web data integration, duplicate entity identification aims to discover duplicate records in the integrated Web databases for further applications (e.g., price-comparison services). However, most existing work addresses this issue only between two data sources, which is not practical for deep Web data integration systems. That is, a duplicate entity matcher trained over two specific Web databases cannot be applied to other Web databases. In addition, the cost of preparing the training set for n Web databases is C_n^2 times higher than that for two Web databases. In this paper, we propose a holistic solution to the new challenges posed by the deep Web, whose goal is to build one duplicate entity matcher over multiple Web databases. Extensive experiments on two domains show that the proposed solution is highly effective for deep Web data integration.
Citations: 3
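The C_n^2 factor in the abstract above is read here as the binomial coefficient "n choose 2", that is, the number of distinct database pairs that would each need their own training set under a pairwise approach. That reading is an assumption based on the notation. The short sketch below shows how quickly that count grows; `pairwise_matcher_count` is a hypothetical helper name.

```python
from math import comb


def pairwise_matcher_count(n: int) -> int:
    """Number of distinct pairs among n Web databases, i.e. C(n, 2) = n(n-1)/2."""
    return comb(n, 2)


# Growth of the pairwise training burden relative to the two-database case.
for n in (2, 5, 10, 50):
    print(f"{n} databases -> {pairwise_matcher_count(n)} pairs")
```

A single matcher shared across all databases avoids this quadratic growth, which is the motivation the abstract gives for a holistic solution.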
Research of Mobile E-commerce Security Solution Based on External Electronic Device
Pub Date : 2010-11-01 DOI: 10.1109/SKG.2010.58
Dechao Sun, Tiejun Pan, Zhong Wan, Haiyan He
With the development of mobile e-commerce, its security problems have received more and more attention and have become the main factor affecting its development. To address this, we propose a new mobile e-commerce system solution that can be used for mobile payment without changing the mobile device's hardware configuration, by connecting to a special external electronic security device. This device is connected to the mobile device through an adaptable interface to enhance security and store private data. As an application, we implement a mobile payment system consisting of a front-end administration module on the mobile device, a back-end administration module on the server, and the electronic device as the security module. This solution achieves high security and low cost for mobile payment, and has good application value and market prospects.
Citations: 3
Comparison of Wordprocessing Document Format in OOXML and ODF
Pub Date : 2010-11-01 DOI: 10.1109/SKG.2010.44
Xia Hou, Ning Li, Hongbo Yang, Qi Liang
OOXML (Office Open XML) and ODF (Open Document Format) are the world's two major document format standards. The structure and main components of word processing documents in OOXML and ODF are analysed and compared in detail. The results form the basis for document interoperability between OOXML and ODF. Two conclusions follow: 1) most components of a word processing document in one format have a logical counterpart in the other; 2) some components have no counterpart, and some correspondences between OOXML and ODF are very complicated. The latter point is one of the most important obstacles to interoperability between OOXML and ODF.
Citations: 5
Join Optimization in the MapReduce Environment for Column-wise Data Store
Pub Date : 2010-11-01 DOI: 10.1109/SKG.2010.18
Minqi Zhou, Rong Zhang, Dadan Zeng, Weining Qian, Aoying Zhou
Chain join processing, which combines records from two or more tables sequentially, has been well studied in centralized databases. However, it has seldom been discussed in the cloud computing era and remains an urgent problem, especially where structured (or relational) data are stored column-wise (by attribute) in distributed file systems (e.g., Google File System) over hundreds or even thousands of commodity PCs. In this paper, we propose a novel method for chain join processing, which is one of the common primitives in the cloud era for analysing column-wise stored data. By effectively selecting the dedicated records (tuples) for the chain join based on information exploited within the bipartite join graph, the communication cost of record transmission can be reduced dramatically. A bushy tree structure is deployed to regulate the chain join sequence, which further reduces the number of intermediate results generated and transmitted and exploits higher parallelism in join processing, resulting in more efficient join processing. Our extensive performance study confirms the effectiveness and efficiency of our methods.
Citations: 14
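The abstract above credits a bushy tree with higher parallelism in chain joins. The sketch below illustrates only that general property, not the paper's method: it builds a left-deep and a bushy plan over the same chain of tables and compares their depths, where depth approximates the number of sequential join rounds. The function names and the table names `R1` to `R8` are hypothetical, and plan selection based on the bipartite join graph is not modelled.

```python
def left_deep_plan(tables):
    """Join tables one at a time: ((R1 join R2) join R3) join ..."""
    plan = tables[0]
    for table in tables[1:]:
        plan = (plan, table)
    return plan


def bushy_plan(tables):
    """Join adjacent pairs level by level, so joins on one level are independent."""
    level = list(tables)
    while len(level) > 1:
        next_level = [(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:  # an odd table out is carried up to the next level
            next_level.append(level[-1])
        level = next_level
    return level[0]


def depth(plan):
    """Number of join levels on the longest path from the root to a base table."""
    if isinstance(plan, str):
        return 0
    return 1 + max(depth(plan[0]), depth(plan[1]))


chain = [f"R{i}" for i in range(1, 9)]
left_deep_depth = depth(left_deep_plan(chain))
bushy_depth = depth(bushy_plan(chain))
```

Pairing only adjacent tables keeps every join between neighbours in the chain, so no cross products are introduced. Joins on the same level of the bushy plan do not depend on each other and could run in the same round.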
Mapping Relational Databases into Ontologies through a Graph-based Formal Model
Pub Date : 2010-11-01 DOI: 10.1109/SKG.2010.33
Shihan Yang, Jinzhao Wu
One of the key issues for Semantic Web applications is the lack of semantic data (ontologies). Although the vast majority of data are stored in popular relational databases, they are still not easily available to many next-generation Web applications. Therefore, one of the core challenges of the Semantic Web is whether these applications can automatically retrieve semantic information from existing relational databases. This paper proposes an intermediate graph-based formal model language, W-graph, as a bridge between relational databases and ontologies; it abstracts semantic information from relational database instances semi-automatically and then generates an OWL ontology automatically. This method not only maps relational database schemata to ontologies but also populates the ontologies with data stored in the databases. Moreover, a proof that the mapping preserves semantics is provided, and a case study and an implemented prototype tool are reported.
Citations: 8
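As a rough illustration of what mapping a relational schema to an OWL ontology can look like, the sketch below emits Turtle text using a common direct-mapping convention: a table becomes a class, a plain column a datatype property, and a foreign key an object property. This is not the W-graph model from the paper, which the abstract does not specify. The function `schema_to_turtle`, the input dictionary layout, and the `http://example.org/onto#` namespace are assumptions made for the example.

```python
def schema_to_turtle(tables, base="http://example.org/onto#"):
    """Render a simple schema description as OWL statements in Turtle syntax.

    `tables` maps a table name to {"columns": [...], "foreign_keys": {column: target_table}}.
    """
    lines = [
        f"@prefix : <{base}> .",
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        "",
    ]
    for table, spec in tables.items():
        foreign_keys = spec.get("foreign_keys", {})
        lines.append(f":{table} a owl:Class .")
        for column in spec.get("columns", []):
            if column in foreign_keys:
                continue  # foreign-key columns are emitted as object properties below
            lines.append(
                f":{table}_{column} a owl:DatatypeProperty ; rdfs:domain :{table} ."
            )
        for column, target in foreign_keys.items():
            lines.append(
                f":{table}_{column} a owl:ObjectProperty ; "
                f"rdfs:domain :{table} ; rdfs:range :{target} ."
            )
    return "\n".join(lines)


example_schema = {
    "Department": {"columns": ["name"]},
    "Employee": {
        "columns": ["name", "dept_id"],
        "foreign_keys": {"dept_id": "Department"},
    },
}
turtle_text = schema_to_turtle(example_schema)
```

This covers only the schema side. Populating the ontology with individuals from table rows, which the abstract also mentions, would be a separate step.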
A Multimedia Information Retrieval Algorithm in P2P Networks Based on the Classification of Peers
Pub Date : 2010-11-01 DOI: 10.1109/SKG.2010.54
G. Wu, Zhipeng Jiang, Suixiang Gao, Wenguo Yang
Multimedia Information Retrieval (MIR) in P2P networks has been widely studied. In this paper, we propose a new comprehensive similarity function to calculate the similarity of peers in P2P networks so as to classify these peers. We also apply relevance feedback during retrieval in order to improve its speed and accuracy. In simulation, we compare our algorithm with the traditional method on a test set of thousands of files of four types (text, image, video, and audio). The results show that our algorithm performs better in both speed and accuracy.
Citations: 0
Searching for Historical Events on a Large-Scale Web Archive
Pub Date : 2010-11-01 DOI: 10.1109/SKG.2010.37
Lian'en Huang, Wu Lin, Xiaoming Li
Finding knowledge on the Web has long been a hot research issue. Today the Web has become a popular medium for publishing news and opinion articles, which are important carriers of human knowledge, especially social knowledge. Developing techniques for automatically collecting and analysing these articles on a large scale is thus desirable. In this paper we propose techniques for searching for events on the Web, and our techniques have been tested on a large-scale web archive. Given an event, or a news topic that many people care about, the purpose of this paper is to find nearly all news stories related to it. First, a novel domain-independent approach to extracting news stories from web pages is proposed, which is based on anchor text and is applicable to most websites. Experiments show that our approach performs well and is better than another approach we have found. Second, a domain-based method of representing events is proposed in which hundreds of keywords are used to represent an event and compose the query expression. This retrieval situation differs from that of most search engines in that the number of keywords is large. We then propose several retrieval algorithms based on BM25 for the method. Evaluation shows that these algorithms perform better than unmodified BM25 in our situation, and the best one is chosen as the algorithm of our system. Finally, an experimental system has been built on a collection of 2 billion web pages and its running performance is reported, which shows the effectiveness of our approaches.
Citations: 0
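For reference, the unmodified BM25 baseline that the abstract compares against can be sketched as follows. This is the standard Okapi BM25 scoring function with a commonly used non-negative IDF variant. It is not one of the paper's modified algorithms, which the abstract does not describe. The parameter defaults `k1=1.2` and `b=0.75`, the function name `bm25_scores`, and the toy documents are assumptions for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against the tokenized `query`."""
    n_docs = len(docs)
    avg_len = sum(len(doc) for doc in docs) / n_docs
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        term_freq = Counter(doc)
        score = 0.0
        for term in query:
            if term not in term_freq:
                continue
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            denom = term_freq[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * term_freq[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


toy_docs = [
    "earthquake relief effort continues".split(),
    "stock markets rally".split(),
    "earthquake aftershock reported near coast".split(),
]
toy_scores = bm25_scores(["earthquake", "relief"], toy_docs)
```

With hundreds of query keywords per event, as in the abstract, the inner loop simply sums more term contributions. How the paper reweights or modifies those contributions is not stated there.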
Cyber Physical Society
Pub Date : 2010-11-01 DOI: 10.1109/SKG.2010.7
H. Zhuge
Natural physical space provides the material basis for the birth and evolution of human beings and civilization. The progress of human society has created the cyber space. With the rapid development of information technology, the cyber space is connecting physical space, social space and mental space to form a new world: the Cyber Physical Society. The way to explore the Cyber Physical Society differs from the way to explore natural physical space and society. This paper describes the ideal of the Cyber Physical Society and presents its distinguishing characteristics and scientific issues. Research on the Cyber Physical Society could lead to a revolution in society, science and technology.
Citations: 19