
Proceedings of the 2015 IEEE 9th International Conference on Semantic Computing (IEEE ICSC 2015): latest papers

Concurrent goal-oriented co-clustering generation in social networks
Fengjiao Wang, Guan Wang, Shuyang Lin, Philip S. Yu
In recent years, social networks have attracted considerable attention from research communities in data mining, social science, mobile computing, and related fields, because users generate different types of information through different actions, and this information offers an opportunity to better understand people's social lives. Co-clustering is an important technique for detecting patterns and phenomena involving two types of closely related objects. For example, in a location-based social network, places can be clustered by location or by category, and users can be clustered by location or by interests. There are therefore usually latent goals behind a co-clustering application. Traditional co-clustering methods, however, are not specifically designed to handle multiple goals, so they cannot guarantee that objects satisfying a given goal are placed in the same cluster. Yet in many cases clusters of objects meeting the same goal are required; for example, a user may want to search for places within one category but in different locations. In this paper, we propose a goal-oriented co-clustering model that generates co-clusterings for different goals simultaneously. With this method, we obtain co-clusterings containing objects with the desired aspects of information from the original data source. Seed feature sets are pre-selected to represent the goals of the co-clusterings. By generating expanded feature sets from the seed feature sets, the proposed model concurrently co-clusters objects and assigns the remaining features to different feature clusters.
DOI: https://doi.org/10.1109/ICOSC.2015.7050833
Published: 2015-02-01
Citations: 2
Workload pattern analysis
M. Vora
In a service-oriented world, performance plays a vital role in the success of any IT system. For an application running in a production environment, whenever the workload or workload pattern changes, the utilization of major server resources such as CPUs, disks, memory, and network also changes. In this paper, we extend our methodology to estimate server resource utilization for any given workload pattern by extracting the optimal information from historical production logs (application logs and resource utilization or system monitoring logs) using a specifically designed genetic algorithm. Across all experimental validations, the average absolute error in estimating server resource utilization was less than 15%. Unlike traditional approaches to estimating overall resource utilization, the method presented here requires neither estimating service demands for each individual application function nor benchmarking individual business transactions. The only necessary inputs to the model are the application logs containing throughput information (for example, an access log for a web application) and the system monitoring logs containing aggregate resource utilization information.
DOI: https://doi.org/10.1109/ICOSC.2015.7050793
Published: 2015-02-01
Citations: 1
Link scientific publications using linked data
Qingliang Miao, Yao Meng, Lu Fang, Fumihito Nishino, N. Igata
Scientific publication management services are changing drastically. On the one hand, researchers demand intelligent search services to discover scientific publications. On the other hand, publishers need to incorporate semantic information to better organize their digital assets and make publications more discoverable. For this purpose, we investigate how to manage scientific publications using Linked Data and introduce FELinker, an entity linking component that links scientific publications with DBpedia. In particular, this paper introduces the advantages of linking scientific publications with Linked Data, discusses the major challenges, and outlines the proposed method. Experiments show that the proposed method achieves promising performance in linking scientific publications.
DOI: https://doi.org/10.1109/ICOSC.2015.7050818
Published: 2015-02-01
Citations: 4
Ontology-based economic models for bioenergy and biofuel projects
Krishna Sapkota, Pathmeswaran Raju, William J. Byrne, C. Chapman
Bioenergy is renewable energy generated from biomass, while biofuel is a hydrocarbon fuel produced from biomass. Recently, bioenergy and biofuel projects have been encouraged and supported by many governments and organizations in various ways, such as providing incentives, technical support, information, and decision support tools. An economic model is one such decision support tool; it helps to estimate the costs and earnings involved in a project. It is constructed from various elements such as concepts, relations, logic, constants, and equations. In current economic models, all of these elements are hard-coded in program code, which makes the models less reusable and extensible. To address this issue, we present an ontology-based economic model in this paper. In particular, we leverage Semantic Web technologies to represent knowledge about bioenergy and biofuel economics and to infer the equations and other values required for economic calculations. A case study was carried out in two of the INTERREG projects, with promising results.
DOI: https://doi.org/10.1109/ICOSC.2015.7050839
Published: 2015-02-01
Citations: 4
A mobile social networking service for urban community disaster response
Yuan-Chih Yu
When disaster strikes an urban community, residents may suffer threats to life, environmental impact, and economic loss. Meanwhile, cellular and Internet services are prone to failure because the disaster may damage the network infrastructure. Because communication services are so important for disaster response, we propose an emergency social networking solution, called ECSN, to overcome such crises. ECSN is a community-based emergency social networking service suited to disaster response tasks. To provide the ECSN service, we construct a Disaster Response Portal designed specifically for disaster management. From a software architecture perspective, it has mobile client agents and server-side services working together to realize the concept of “Community Social Networking”. Through the Disaster Response Portal, local disaster rescue and response can be integrated with nationwide disaster management. Disaster response can also be planned and managed easily at community scope, which benefits other emergency measures taken in disaster prevention, mitigation, preparedness, and recovery. Simulation experiments show that the system works well in the problem domain. Most importantly, the complete solution creates a new practicable model for the mobility of urban disaster response.
DOI: https://doi.org/10.1109/ICOSC.2015.7050860
Published: 2015-02-01
Citations: 5
A word prediction methodology for automatic sentence completion
Carmelo Spiccia, A. Augello, G. Pilato, G. Vassallo
Word prediction generally relies on n-gram occurrence statistics, which may have huge data storage requirements and do not take into account the general meaning of the text. We propose an alternative methodology, based on Latent Semantic Analysis, to address these issues. An asymmetric Word-Word frequency matrix is employed to achieve higher scalability with large training datasets than the classic Word-Document approach. We propose a function for scoring candidate terms for the missing word in a sentence. We show how this function approximates the probability of occurrence of a given candidate word. Experimental results show that the proposed approach outperforms non-neural-network language models.
DOI: https://doi.org/10.1109/ICOSC.2015.7050813
Published: 2015-02-01
Citations: 12
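
As a rough illustration of the idea in this abstract, the sketch below builds an asymmetric word-word frequency matrix from a toy corpus and scores candidates for a missing word by summing raw co-occurrence counts with the surrounding context. It is only a simplified stand-in: the Latent Semantic Analysis step (dimensionality reduction of the matrix) and the paper's actual scoring function are not reproduced, and the names `build_cooccurrence`, `score_candidates`, and the `window` parameter are assumptions made for this example.

```python
from collections import defaultdict


def build_cooccurrence(sentences, window=2):
    """Asymmetric counts: counts[left][right] is how often `right`
    appears within `window` tokens AFTER `left` (order matters)."""
    counts = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, left in enumerate(tokens):
            for right in tokens[i + 1 : i + 1 + window]:
                counts[left][right] += 1
    return counts


def score_candidates(counts, left_context, right_context, candidates):
    """Hypothetical scoring: sum of raw counts linking the candidate to the
    words before it and after it. No LSA projection is applied here."""
    scores = {}
    for cand in candidates:
        score = sum(counts[w][cand] for w in left_context)
        score += sum(counts[cand][w] for w in right_context)
        scores[cand] = score
    return scores


# Toy corpus, invented for illustration only.
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the mouse",
]
counts = build_cooccurrence(corpus)

# Fill the gap in "the cat ___ milk".
scores = score_candidates(counts, ["the", "cat"], ["milk"], ["drinks", "chases"])
best = max(scores, key=scores.get)
print(best, scores)
```

Keeping the matrix asymmetric means "cat ... drinks" and "drinks ... cat" are counted separately, so word order around the gap can influence the score.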
Google based hybrid approach for discovering services
Shailja Sharma, J. Lather, M. Dave
Current description standards for Web services, such as WSDL and UDDI, have the significant drawback of being restricted to the syntactic aspects of a service. A service provider registers a service in a universal repository (i.e., UDDI) so that service consumers can search thousands of registered services and discover the one that meets their functional requirements. Matching the user request against all services in a particular category of the repository is a cumbersome task. Semantic approaches are required to further assist the user in discovering relevant services. In this paper, we propose a semantic approach that produces a ranked list of services based on a web-based relatedness score and helps users select potentially relevant and semantically similar services within a category. The proposed approach has been implemented on 80 OWL-S services, and the results show that it produces a ranked list of services that eases the selection process for the user.
DOI: https://doi.org/10.1109/ICOSC.2015.7050859
Published: 2015-02-01
Citations: 5
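
The abstract does not say which web-based relatedness score is used. One well-known measure of this kind is the Normalized Google Distance, which compares search hit counts for two terms alone and together. The sketch below assumes that measure purely for illustration; the function names, the hit counts, the service names, and the index size `n` are all hypothetical, and no real search API is called.

```python
import math


def normalized_google_distance(fx, fy, fxy, n):
    """NGD from hit counts: fx and fy for each term alone, fxy for both
    together, n for the (assumed) total number of indexed pages.
    Smaller values mean the terms are more closely related."""
    if fx <= 0 or fy <= 0 or fxy <= 0:
        return float("inf")  # terms never seen (together): treat as unrelated
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(log_fx, log_fy) - log_fxy) / (math.log(n) - min(log_fx, log_fy))


def rank_services(query_hits, services, n):
    """services: list of (name, hits_for_service_term, hits_for_both).
    Returns service names ordered from most to least related to the query."""
    scored = [
        (normalized_google_distance(query_hits, hits, both, n), name)
        for name, hits, both in services
    ]
    return [name for _, name in sorted(scored)]


# Invented hit counts, for illustration only.
N_PAGES = 10**6
services = [
    ("WeatherService", 900, 10),
    ("BookPriceService", 800, 400),
]
ranking = rank_services(1000, services, N_PAGES)
print(ranking)
```

A distance of zero means the two terms always occur together; ranking by ascending distance puts the services whose descriptions co-occur most with the query at the top.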
Semantic data mining: A survey of ontology-based approaches
D. Dou, Hao Wang, Haishan Liu
Semantic Data Mining refers to data mining tasks that systematically incorporate domain knowledge, especially formal semantics, into the process. In the past, many research efforts have attested to the benefits of incorporating domain knowledge in data mining. At the same time, the proliferation of knowledge engineering has enriched the family of domain knowledge, especially formal semantics and Semantic Web ontologies. An ontology is an explicit specification of a conceptualization and a formal way to define the semantics of knowledge and data. The formal structure of an ontology makes it a natural way to encode domain knowledge for use in data mining. In this survey paper, we introduce general concepts of semantic data mining. We investigate why ontologies have the potential to help semantic data mining and how formal semantics in ontologies can be incorporated into the data mining process. We provide detailed discussions of the advances and state of the art of ontology-based approaches and an introduction to approaches based on other forms of knowledge representation.
DOI: https://doi.org/10.1109/ICOSC.2015.7050814
Published: 2015-02-01
Citations: 156
Word Sense Disambiguation using WSD specific WordNet of polysemy words
Udaya Raj Dhungana, S. Shakya, K. Baral, Bharat Sharma
This paper presents a new model of WordNet that is used to disambiguate the correct sense of a polysemous word based on clue words. The words related to each sense of a polysemous word, as well as to a single-sense word, are referred to as clue words. The conventional WordNet organises nouns, verbs, adjectives, and adverbs into sets of synonyms called synsets, each expressing a different concept. In contrast to the structure of WordNet, we developed a new model of WordNet that organizes the different senses of polysemous words, as well as single-sense words, based on the clue words. These clue words, for each sense of a polysemous word as well as for single-sense words, are used to disambiguate the correct meaning of the polysemous word in a given context using knowledge-based Word Sense Disambiguation (WSD) algorithms. A clue word can be a noun, verb, adjective, or adverb.
DOI: https://doi.org/10.1109/ICOSC.2015.7050794
Published: 2014-08-31
Citations: 27
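
The clue-word idea can be pictured with a simple overlap count, in the spirit of knowledge-based (Lesk-style) disambiguation: pick the sense whose clue words share the most tokens with the context. This is a minimal sketch under that assumption, not the paper's algorithm; the `CLUE_WORDS` table, the sense labels, and the `disambiguate` function are invented for the example.

```python
# Hypothetical clue-word table: each sense of a word maps to its clue words.
CLUE_WORDS = {
    "bank": {
        "financial institution": {"money", "loan", "deposit", "account"},
        "river side": {"river", "water", "shore", "fishing"},
    }
}


def disambiguate(word, context, clue_words):
    """Return the sense of `word` whose clue words overlap most with the
    whitespace-separated tokens of `context`. Ties keep the first sense."""
    context_tokens = set(context.lower().split())
    best_sense, best_overlap = None, -1
    for sense, clues in clue_words[word].items():
        overlap = len(context_tokens & clues)
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


sense_a = disambiguate("bank", "He sat on the bank of the river fishing", CLUE_WORDS)
sense_b = disambiguate("bank", "She opened an account at the bank to deposit money", CLUE_WORDS)
print(sense_a)
print(sense_b)
```

Because clue words may be nouns, verbs, adjectives, or adverbs, the table is keyed by sense rather than by part of speech.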