Domain-Specific New Words Detection in Chinese

Joint Conference on Lexical and Computational Semantics Pub Date : 1900-01-01 DOI:10.18653/v1/S17-1005

Ao Chen, Maosong Sun


Citations: 4

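Abstract

With the explosive growth of the Internet, more and more domain-specific environments have appeared, such as forums, blogs, and MOOCs. Domain-specific words arise in these environments and play a critical role in domain-specific NLP tasks. This paper aims to extract Chinese domain-specific new words automatically. The extraction covers two kinds of words: words that are new in the domain and words that are especially important in it. We propose a joint statistical model that performs both tasks simultaneously. Unlike traditional new-word detection models, our model does not need handcrafted features, which are labor-intensive to build. Experimental results demonstrate that our joint model outperforms state-of-the-art methods.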



