Domain-Specific New Words Detection in Chinese

Ao Chen, Maosong Sun
{"title":"汉语特定领域新词检测","authors":"Ao Chen, Maosong Sun","doi":"10.18653/v1/S17-1005","DOIUrl":null,"url":null,"abstract":"With the explosive growth of Internet, more and more domain-specific environments appear, such as forums, blogs, MOOCs and etc. Domain-specific words appear in these areas and always play a critical role in the domain-specific NLP tasks. This paper aims at extracting Chinese domain-specific new words automatically. The extraction of domain-specific new words has two parts including both new words in this domain and the especially important words. In this work, we propose a joint statistical model to perform these two works simultaneously. Compared to traditional new words detection models, our model doesn't need handcraft features which are labor intensive. Experimental results demonstrate that our joint model achieves a better performance compared with the state-of-the-art methods.","PeriodicalId":344435,"journal":{"name":"Joint Conference on Lexical and Computational Semantics","volume":"18 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":"{\"title\":\"Domain-Specific New Words Detection in Chinese\",\"authors\":\"Ao Chen, Maosong Sun\",\"doi\":\"10.18653/v1/S17-1005\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"With the explosive growth of Internet, more and more domain-specific environments appear, such as forums, blogs, MOOCs and etc. Domain-specific words appear in these areas and always play a critical role in the domain-specific NLP tasks. This paper aims at extracting Chinese domain-specific new words automatically. The extraction of domain-specific new words has two parts including both new words in this domain and the especially important words. In this work, we propose a joint statistical model to perform these two works simultaneously. Compared to traditional new words detection models, our model doesn't need handcraft features which are labor intensive. Experimental results demonstrate that our joint model achieves a better performance compared with the state-of-the-art methods.\",\"PeriodicalId\":344435,\"journal\":{\"name\":\"Joint Conference on Lexical and Computational Semantics\",\"volume\":\"18 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1900-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"4\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Joint Conference on Lexical and Computational Semantics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.18653/v1/S17-1005\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Joint Conference on Lexical and Computational Semantics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.18653/v1/S17-1005","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 4

Abstract

With the explosive growth of the Internet, more and more domain-specific environments have appeared, such as forums, blogs, and MOOCs. Domain-specific words arise in these environments and play a critical role in domain-specific NLP tasks. This paper aims to extract Chinese domain-specific new words automatically. The extraction task has two parts: finding words that are new to the domain, and identifying the words that are especially important within it. In this work, we propose a joint statistical model that performs both tasks simultaneously. Compared with traditional new-word detection models, our model needs no labor-intensive handcrafted features. Experimental results demonstrate that our joint model outperforms state-of-the-art methods.
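
The abstract does not spell out the model itself, so the following is only a minimal illustrative sketch, not the authors' method. It combines two unsupervised statistics that Chinese new-word detectors commonly build on, mirroring the two subtasks above: a cohesion score (PMI over internal splits of a character string) for detecting candidate words, and a domain-versus-background frequency ratio for gauging domain importance. All function names, thresholds, and the additive combination of the two scores are assumptions for illustration.

```python
# Hypothetical sketch only: NOT the paper's joint model. It scores character
# n-grams in a domain corpus by (1) cohesion, i.e. how word-like the string
# is, and (2) specificity, i.e. how much more frequent it is in the domain
# corpus than in a general background corpus.
import math
from collections import Counter

def substring_counts(text, max_len):
    """Count every character n-gram of length 1..max_len in the text."""
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts

def score_candidates(domain_text, background_text, max_len=4, min_count=5):
    dom = substring_counts(domain_text, max_len)
    bg = substring_counts(background_text, max_len)
    n_dom = max(len(domain_text), 1)
    n_bg = max(len(background_text), 1)

    scored = []
    for w, c in dom.items():
        if len(w) < 2 or c < min_count:
            continue
        p_w = c / n_dom
        # Cohesion: PMI of the weakest internal split. A real word should be
        # far more frequent than chance under every way of cutting it in two.
        cohesion = min(
            math.log(p_w / ((dom[w[:k]] / n_dom) * (dom[w[k:]] / n_dom)))
            for k in range(1, len(w))
        )
        # Specificity: log ratio of domain frequency to (add-one smoothed)
        # background frequency; high values mark domain-specific terms.
        specificity = math.log((c / n_dom) / ((bg[w] + 1) / n_bg))
        scored.append((w, cohesion + specificity))

    # Adding the two signals is one naive way to "join" the subtasks; the
    # paper's joint model presumably couples them in a more principled way.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored

# Example usage (corpora are placeholders):
# ranked = score_candidates(domain_corpus, general_corpus)
# print(ranked[:20])  # top candidate domain-specific new words
```

Note that a score like this needs no handcrafted features or labeled data, which is the property the abstract highlights; only the raw domain and background corpora are required.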