Optimal distance metric function with trigram features for case based word sense disambiguation using artificial neural network

P. Tamilselvi, S. Srivatsa
DOI: 10.1109/ICOAC.2011.6165190 (https://doi.org/10.1109/ICOAC.2011.6165190)
Published in: 2011 Third International Conference on Advanced Computing
Publication date: 2011-12-01
Citations: 7

Abstract

Word sense disambiguation generally draws on several levels of knowledge. This paper uses only three knowledge features, or sources, arranged as a trigram. Disambiguation follows a case-based approach: cases are refined forms of words collected from Semcor and are used to derive the sense of an ambiguous input word. All Part of Speech (PoS) tags listed in the Brown Corpus are collected and grouped into seventeen groups, and each group is assigned a constant value. The trigram features of the input (the ambiguous word) and of each case are represented as a 1×3 vector, whose values for the ambiguous word and its two neighboring words are taken from the weights assigned to their PoS groups. Ten distance metric functions are analyzed empirically with the aim of improving disambiguation accuracy using minimal knowledge sources. A neural network extracts the correct sense of the ambiguous word from the selected minimum-distance cases. A single long sentence is used to illustrate the performance of the disambiguation process. In this test, the post-trigramed Hamming function (F9) achieved a disambiguation accuracy of 78.57% (eleven of fourteen ambiguous words recognized).
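The case-matching step described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the PoS group names and weights in `POS_WEIGHT` are hypothetical (the abstract does not list the seventeen groups or their constants), the vector layout (previous word, ambiguous word, next word) is assumed, and the neural network that picks the final sense from the minimum-distance cases is left out.

```python
from typing import Dict, List, Tuple

# Hypothetical PoS-group weights. The paper assigns one constant to each of
# seventeen groups; the actual groups and values are not given in the abstract.
POS_WEIGHT: Dict[str, float] = {
    "NOUN": 1.0, "VERB": 2.0, "ADJ": 3.0,
    "ADV": 4.0, "DET": 5.0, "PREP": 6.0,
}

Vector = Tuple[float, float, float]


def trigram_vector(prev_pos: str, target_pos: str, next_pos: str) -> Vector:
    """Build the 1x3 feature vector from the PoS weights of three words.

    The (previous, target, next) ordering is an assumption for illustration.
    """
    return (POS_WEIGHT[prev_pos], POS_WEIGHT[target_pos], POS_WEIGHT[next_pos])


def hamming(a: Vector, b: Vector) -> int:
    """Hamming distance: number of positions where the two vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def nearest_cases(query: Vector, cases: List[Tuple[Vector, str]]) -> List[str]:
    """Return the senses of all stored cases at minimal Hamming distance."""
    scored = [(hamming(query, vec), sense) for vec, sense in cases]
    best = min(dist for dist, _ in scored)
    return [sense for dist, sense in scored if dist == best]


def accuracy_percent(correct: int, total: int) -> float:
    """Accuracy as a percentage, rounded to two decimals."""
    return round(100.0 * correct / total, 2)
```

With these helpers, `accuracy_percent(11, 14)` reproduces the 78.57% figure reported for eleven of fourteen ambiguous words. Returning every tied minimum-distance case, rather than a single one, mirrors the abstract's description of a later stage that selects the sense from those candidates.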