
Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering (NLPKE-2010): Latest Publications

The present conditions, problems and future direction of the server-controlled clinical pathway system development in psychiatric hospitals
Mai Date, T. Tanioka, Yuko Yasuhara, Kazuyuki Matsumoto, Yukie Iwasa, Chiemi Kawanishi, Eri Hirai, Fuji Ren
Clinical pathways used today are paper-based, although many different kinds of clinical pathways are used in clinical practice. However, in the development of software for clinical pathways, it is difficult to achieve cooperation between medical experts, who are not used to expressing their ideas and work in words, and system developers, whose medical knowledge is limited. As a consequence, the current situation is that medical practitioners write their own software and use it in their practice. While this may work, it is less than ideal, because the refinements that engineering expertise could bring to make the software most effective are not applied. Thus, in our research team, nurse researchers and engineering researchers cooperated to develop a clinical pathway system. In this paper, the present conditions, problems, and future direction of server-controlled CP system development in psychiatric hospitals are discussed from the viewpoint of nursing as a user.
DOI: 10.1109/NLPKE.2010.5587812 | Published: 2010-09-30
Citations: 0
Obtaining Chinese semantic knowledge from online encyclopedia
Liu Yang, Tingting He, Xinhui Tu, Jinguang Chen
This paper proposes a method to obtain semantic knowledge from an online encyclopedia, Hudong encyclopedia 2 (hudong baike). We obtain concepts and their semantically related concepts, and compute semantic relatedness by utilizing inner hyperlinks and the open category information in Hudong encyclopedia. By comparing our results with human judgments, we show that our relatedness computation method is quite effective.
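The abstract does not state the relatedness formula. One common measure in this family, used here purely as an illustrative assumption (in the style of the Wikipedia Link-based Measure of Milne and Witten), scores two concepts by the overlap of the articles that link to them:

```python
import math

def link_relatedness(links_a, links_b, total_articles):
    """Normalized link-based relatedness between two concepts, given the
    sets of articles linking to each (a Milne-Witten-style sketch, not
    the paper's exact formula). Returns a value in [0, 1]; higher means
    more related."""
    a, b = set(links_a), set(links_b)
    common = a & b
    if not common:
        return 0.0
    # Normalized link distance: rarer shared links count for more.
    num = math.log(max(len(a), len(b))) - math.log(len(common))
    den = math.log(total_articles) - math.log(min(len(a), len(b)))
    return max(0.0, 1.0 - num / den)
```

Identical link sets score 1.0 and disjoint sets score 0.0, matching the intuition that concepts referenced by the same encyclopedia articles are semantically close.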
DOI: 10.1109/NLPKE.2010.5587787 | Published: 2010-09-30
Citations: 1
Wisdom media “CAIWA Channel” based on natural language interface agent
Takuo Henmi, Shengyang Huang, F. Ren
Based on breakthroughs in the natural language interface agent “CAIWA,” the web content distribution platform “CAIWA Channel” has been built for both PC and smartphone clients. The platform is regarded as wisdom media, since it seeks to provide precisely the information users demand, in addition to removing user-interface bottlenecks. Beyond recommending information to the user based on past behavior, interests and profile, CAIWA Channel has a built-in evolution mechanism that proactively collects information about the user in a natural manner via conversation. It can accumulate knowledge to better meet the user's personal requirements, while web contents are organized according to the user's interests. CAIWA Channel is not a mere information search system but a knowledge query system and a human-touch system incorporating emotional reactions.
DOI: 10.1109/NLPKE.2010.5587862 | Published: 2010-09-30
Citations: 1
Semantic role labeling for Bengali using 5Ws
Amitava Das, Aniruddha Ghosh, Sivaji Bandyopadhyay
In this paper we present different methodologies to extract semantic role labels of Bengali nouns using 5W distilling. The 5W task seeks to extract the semantic information of nouns in a natural language sentence by distilling it into the answers to the 5W questions: Who, What, When, Where and Why. As Bengali is a resource-constrained language, the building of an annotated gold-standard corpus and the acquisition of linguistic tools for feature extraction are described in this paper. The reported per-label precision values of the present system are: 79.56% (Who), 65.45% (What), 73.35% (When), 77.66% (Where) and 63.50% (Why).
DOI: 10.1109/NLPKE.2010.5587772 | Published: 2010-09-30
Citations: 7
Use relative weight to improve the kNN for unbalanced text category
Xiaodong Liu, F. Ren, Caixia Yuan
Text categorization technology is widely used in natural language processing. As one of the best text categorization algorithms, kNN is popular in many applications. Traditional kNN assumes that the distribution of training data is even; however, this is not the case in many situations. When we used kNN in our Topic Detection and Tracking (TDT) system, it did not perform well due to the bias of the training data set. To overcome the obstacle caused by data bias, this paper proposes an approach that uses a relative weight to adjust the weights in kNN (RWKNN). When evaluated on data from the TDT2 and TDT3 Chinese corpora, RWKNN proves to be robust on unbalanced data and yields better performance than traditional kNN.
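The abstract does not give RWKNN's exact weighting scheme; a plausible sketch (an assumption, not the paper's formula) scales each neighbor's vote inversely to its class size, so that documents from large classes do not drown out small classes on unbalanced data:

```python
from collections import Counter

def rwknn_predict(neighbors, class_counts):
    """Classify from the k nearest neighbours, scaling each neighbour's
    similarity vote by a relative weight inversely proportional to its
    class size (one plausible reading of RWKNN, not the paper's exact
    scheme). `neighbors` is a list of (label, similarity) pairs;
    `class_counts` maps label -> number of training documents."""
    total = sum(class_counts.values())
    scores = Counter()
    for label, sim in neighbors:
        scores[label] += sim * (total / class_counts[label])
    return scores.most_common(1)[0][0]
```

With a 900-document "big" class and a 100-document "small" class, two "big" neighbors (similarities 0.9 and 0.8) lose to one "small" neighbor (similarity 0.7), whereas an unweighted majority vote would pick "big".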
DOI: 10.1109/NLPKE.2010.5587799 | Published: 2010-09-30
Citations: 4
A hybrid-strategy method combining semantic analysis with rule-based MT for patent machine translation
Yaohong Jin
This paper presents a hybrid method combining semantic analysis with rule-based MT for patent machine translation. Based on the theory of Hierarchical Network of Concepts, the semantic analysis uses the lv principle to deal with the ambiguity of multiple verbs and the boundary of long NPs. Determining the main verb helps to select the right syntax tree, and boundary detection for long NPs helps to schedule the syntactic process. The experimental results show that this hybrid-strategy method can effectively improve the performance of Chinese-English patent machine translation.
DOI: 10.1109/NLPKE.2010.5587763 | Published: 2010-09-30
Citations: 22
Extraction of purpose data using surface text patterns
P. K. Mayee, R. Sangal, Soma Paul
This paper presents the concept of surface text patterns for extracting purpose data from the web. In order to obtain an optimal set of patterns, we have developed a method for learning purpose patterns automatically. A corpus was downloaded from the Internet using bootstrapping by providing a few hand-crafted examples of each purpose pattern to a generic search engine. This corpus was then tagged and patterns were extracted from the returned documents by automated means and standardized. The precision of each pattern and the average precision for each group were computed. The extracted patterns were then used to extract purpose data. The results for extraction from the web have been reported.
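The paper learns its patterns automatically; as a hand-crafted illustration of what a surface text pattern for purpose data looks like (the patterns below are assumptions for demonstration, not the learned set), matching can be done with simple regular expressions:

```python
import re

# Illustrative hand-crafted surface patterns for purpose clauses --
# the paper bootstraps and standardizes its patterns automatically.
PURPOSE_PATTERNS = [
    re.compile(r"\bin order to ([^,.]+)", re.IGNORECASE),
    re.compile(r"\bfor the purpose of ([^,.]+)", re.IGNORECASE),
    re.compile(r"\bso as to ([^,.]+)", re.IGNORECASE),
]

def extract_purposes(sentence):
    """Return every purpose phrase matched by the surface patterns."""
    hits = []
    for pat in PURPOSE_PATTERNS:
        hits.extend(m.group(1).strip() for m in pat.finditer(sentence))
    return hits
```

Per-pattern precision can then be estimated, as in the paper, by checking each pattern's matches against human judgments on a tagged corpus.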
DOI: 10.1109/NLPKE.2010.5587860 | Published: 2010-09-30
Citations: 0
Detection and correction of real-word spelling errors in Persian language
Heshaam Faili
Several statistical methods have already been proposed to detect and correct real-word errors in context. However, to the best of our knowledge, none of them has been applied to the Persian language yet. In this paper, a statistical method based on the mutual information of Persian words is presented to deal with context-sensitive spelling errors. Experiments on test data containing exactly one real-word error per sentence show that the correction method achieves about 80.5% precision and 87% recall.
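The abstract does not spell out the scoring; one standard way to use mutual information for real-word correction (a sketch under that assumption, with toy counts) is to pick, from a confusion set, the candidate with the highest average pointwise mutual information against the context words:

```python
import math

def pmi(w1, w2, pair_counts, word_counts, total):
    """Pointwise mutual information of a word pair from corpus counts."""
    joint = pair_counts.get((w1, w2), 0) + pair_counts.get((w2, w1), 0)
    if joint == 0:
        return float("-inf")
    return math.log((joint * total) / (word_counts[w1] * word_counts[w2]))

def best_candidate(context, confusion_set, pair_counts, word_counts, total):
    """Choose the confusion-set member that fits the context best, by
    average PMI with the context words (a sketch of MI-based real-word
    correction; the paper's exact scoring is not given)."""
    def score(cand):
        vals = [pmi(cand, c, pair_counts, word_counts, total) for c in context]
        return sum(vals) / len(vals)
    return max(confusion_set, key=score)
```

With co-occurrence counts favoring "piece"/"cake" and "peace"/"treaty", the function corrects "peace of cake" to "piece" and "piece treaty" to "peace".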
DOI: 10.1109/NLPKE.2010.5587806 | Published: 2010-09-30
Citations: 14
An error driven method to improve rules for the recognition of Chinese modality “LE”
Yihui Zhou, Hongying Zan, Lingling Mu, Yingcheng Yuan
We use a “Trinity” approach for the recognition of the Chinese modality “LE”, in which a dictionary, a usage rule base and usage corpora are combined as the knowledge base. Handcrafted rules can hardly cover all usages in real texts, so this paper proposes an error-driven method for automatic rule improvement. Experimental results show that, after the automatic rule improvement, the recognition precision for the modality “LE” improves by over 1.85%.
DOI: 10.1109/NLPKE.2010.5587825 | Published: 2010-09-30
Citations: 1
An unsupervised approach to preposition error correction
Aminul Islam, D. Inkpen
In this work, an unsupervised statistical method for automatic correction of preposition errors using the Google n-gram data set is presented and compared to the state-of-the-art. We use the Google n-gram data set in a back-off fashion that increases the performance of the method. The method works automatically, does not require any human-annotated knowledge resources (e.g., ontologies) and can be applied to English language texts, including non-native (L2) ones in which preposition errors are known to be numerous. The method can be applied to other languages for which Google n-grams are available.
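The back-off idea described above can be sketched as follows: try the longest n-gram context around each candidate preposition first, and fall back to shorter ones when the longer pattern is unseen. The counts dictionary below is a toy stand-in for lookups into the Google n-gram data set, and the specific back-off order is an assumption:

```python
def choose_preposition(left, right, candidates, ngram_counts):
    """Pick the candidate preposition whose surrounding n-gram is most
    frequent, backing off from a 5-gram (two context words each side)
    to a trigram (one each side) when the longer pattern is unseen.
    `ngram_counts` maps token tuples to frequencies -- a stand-in for
    the Google n-gram data set."""
    for span in (2, 1):  # back off: 5-gram first, then trigram
        scores = {}
        for prep in candidates:
            gram = tuple(left[-span:] + [prep] + right[:span])
            if gram in ngram_counts:
                scores[prep] = ngram_counts[gram]
        if scores:
            return max(scores, key=scores.get)
    return None  # no evidence at any back-off level
```

Because the method only needs raw n-gram counts, it requires no annotated data, which is what makes it unsupervised and portable to any language with Google n-grams.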
DOI: 10.1109/NLPKE.2010.5587782 | Published: 2010-09-30
Citations: 8