Auto-POS templates and mixed metrics for recognizing terms in scientific literature

2010 Third International Symposium on Knowledge Acquisition and Modeling. Pub Date: 2010-11-29. DOI: 10.1109/KAM.2010.5646319

Hongliang You, Wei Zhang, Junyi Shen, Yang Yu, Ting Liu


Citations: 0

Abstract




Automatic Term Recognition (ATR) is an important task in knowledge acquisition. It aims to acquire formal terms that have not yet been recorded in the glossary. In recent years, several statistical methods have proved effective, and emerging methods such as C-value, NC-value, and TermExtractor have shown great advantages on this task. However, little work has been done on metric mixing algorithms that combine those metrics as a whole. In this paper, we first collect part-of-speech (POS) templates automatically from already-known terms, which we call Auto-POS templates, instead of writing regular expressions by hand, and then match them against POS strings to acquire candidate terms. Finally, we rank those candidates with a metric mixing algorithm. Experimental results on IEEE 2006-2007 metadata show that the metric mixing algorithm performs better than any single metric alone.
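The three steps in the abstract (collect POS templates from known terms, match them against tagged text, rank candidates by a mixed metric) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not specify the mixing rule, so the weighted sum of min-max normalized scores used here is an assumption, and the function names, the toy tagged data, and the metric names `metric_a` and `metric_b` are all hypothetical. The input is assumed to be already POS-tagged with Penn Treebank style tags.

```python
from collections import Counter

# Hypothetical known terms, already POS-tagged (word, tag).
KNOWN_TERMS = [
    [("neural", "JJ"), ("network", "NN")],
    [("support", "NN"), ("vector", "NN"), ("machine", "NN")],
    [("knowledge", "NN"), ("acquisition", "NN")],
]


def collect_templates(tagged_terms, min_count=1):
    """Collect the POS-tag sequences seen in known terms at least min_count times."""
    counts = Counter(tuple(tag for _, tag in term) for term in tagged_terms)
    return {template for template, n in counts.items() if n >= min_count}


def match_candidates(tagged_sentence, templates):
    """Return the sorted, de-duplicated word spans whose tag sequence equals a template."""
    words = [word for word, _ in tagged_sentence]
    tags = [tag for _, tag in tagged_sentence]
    found = set()
    for template in templates:
        n = len(template)
        for i in range(len(tags) - n + 1):
            if tuple(tags[i:i + n]) == template:
                found.add(" ".join(words[i:i + n]))
    return sorted(found)


def min_max(scores):
    """Rescale one metric's scores to [0, 1]; a constant metric maps to 0."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {key: 0.0 for key in scores}
    return {key: (value - lo) / (hi - lo) for key, value in scores.items()}


def mix_metrics(metric_scores, weights):
    """Rank candidates by a weighted sum of normalized metric scores.

    ASSUMPTION: this mixing rule is illustrative only; the abstract does not
    say how the individual metrics are combined.
    """
    normalized = {name: min_max(s) for name, s in metric_scores.items()}
    candidates = set().union(*(s.keys() for s in metric_scores.values()))
    mixed = {
        c: sum(weights[name] * normalized[name].get(c, 0.0) for name in metric_scores)
        for c in candidates
    }
    # Highest mixed score first; ties broken alphabetically.
    return sorted(mixed.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    templates = collect_templates(KNOWN_TERMS)
    sentence = [("we", "PRP"), ("train", "VBP"), ("a", "DT"),
                ("deep", "JJ"), ("belief", "NN"), ("network", "NN")]
    print(match_candidates(sentence, templates))
    # Hypothetical per-metric scores for two candidates.
    scores = {"metric_a": {"x": 1.0, "y": 3.0}, "metric_b": {"x": 10.0, "y": 0.0}}
    print(mix_metrics(scores, {"metric_a": 0.7, "metric_b": 0.3}))
```

In a real pipeline the per-metric scores would come from measures such as C-value or NC-value computed over the corpus. Normalizing before mixing keeps a metric with a large numeric range from dominating the ranking.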
