A Domain-Independent Hybrid Approach for Automatic Taxonomy Induction

Bushra Zafar, Usman Qamar, Ayesha Imran
Published in: 2016 17th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)
Publication date: 2016-12-01
DOI: 10.1109/PDCAT.2016.085 (https://doi.org/10.1109/PDCAT.2016.085)
Citations: 1

Abstract

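Semantic taxonomies are a flexible way to organize, navigate, and retrieve information effectively, and many Natural Language Processing and Artificial Intelligence tasks rely heavily on them. This paper presents a taxonomy induction system that integrates two modules: word embedding and string inclusion. We implement a simple, semi-supervised, domain-independent system based on the Taxonomy Extraction Evaluation (TExEval2) task of SemEval 2016. The task is divided into two steps: first, identifying hyponym-hypernym relations, and then constructing a taxonomy from a domain-specific list of terms. The system is trained on a large general corpus. It learns vectors for phrases and uses word vectors together with phrases such as "known as" to generate candidate hypernyms and construct the taxonomy. Taxonomies are induced for three domains: environment, food, and science. The constructed taxonomies are evaluated against gold-standard taxonomies. The proposed system achieved significant results for hyponym-hypernym identification and taxonomy induction.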
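The abstract names a string inclusion module but does not describe its rules. The following is a minimal sketch of one common reading of the idea, not the authors' implementation: a multi-word term is treated as a hyponym of another term in the list when that term is a proper word-level suffix of it (for example, "organic chemistry" under "chemistry"). The function names, the suffix heuristic, and the sample terms are all assumptions made for illustration; the word-embedding module is not sketched here.

```python
from collections import defaultdict


def string_inclusion_pairs(terms):
    """Return (hyponym, hypernym) pairs found by a suffix heuristic.

    Assumption (not from the paper): a term's hypernym is the longest
    proper word-level suffix of the term that also appears in the list.
    """
    vocab = {t.lower().strip() for t in terms}
    pairs = set()
    for term in vocab:
        words = term.split()
        # Try the longest proper suffix first so the closest parent wins.
        for start in range(1, len(words)):
            candidate = " ".join(words[start:])
            if candidate in vocab:
                pairs.add((term, candidate))
                break
    return pairs


def build_taxonomy(pairs):
    """Group hyponyms under their hypernym: {hypernym: sorted hyponyms}."""
    children = defaultdict(set)
    for hyponym, hypernym in pairs:
        children[hypernym].add(hyponym)
    return {parent: sorted(kids) for parent, kids in children.items()}


# Hypothetical domain term list, for illustration only.
terms = [
    "science",
    "chemistry",
    "organic chemistry",
    "physical organic chemistry",
    "food science",
]
pairs = string_inclusion_pairs(terms)
taxonomy = build_taxonomy(pairs)
```

Choosing the longest matching suffix attaches each term to its most specific parent in the list, so "physical organic chemistry" is placed under "organic chemistry" rather than directly under "chemistry". Terms with no lexical overlap would need another signal, which is the role the abstract assigns to the word-embedding module.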