
Proceedings of the 19th International Conference on Computational Linguistics: Latest Publications

An Agent-based Approach to Chinese Named Entity Recognition
Pub Date : 2002-08-24 DOI: 10.3115/1072228.1072308
Shiren Ye, Tat-Seng Chua, Jimin Liu
Chinese NE (Named Entity) recognition is a difficult problem because of the uncertainty in word segmentation and flexibility in language structure. This paper proposes the use of a rationality model in a multi-agent framework to tackle this problem. We employ a greedy strategy and use the NE rationality model to evaluate and detect all possible NEs in the text. We then treat the process of selecting the best possible NEs as a multi-agent negotiation problem. The resulting system is robust and is able to handle different types of NE effectively. Our test on the MET-2 test corpus indicates that our system is able to achieve high F1 values of above 92% on all NE types.
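As a rough illustration of the greedy candidate-selection step the abstract describes (not the authors' actual multi-agent negotiation algorithm), one could pick non-overlapping NE candidates in order of descending rationality score; the spans, labels, and scores below are invented:

```python
# Hypothetical sketch: greedily select non-overlapping NE candidates by
# descending "rationality" score. All spans/labels/scores are invented.

def select_entities(candidates):
    """candidates: list of (start, end, label, score) spans over a text."""
    chosen = []
    # Highest-scoring candidate first; skip any span that overlaps a kept one.
    for start, end, label, score in sorted(candidates, key=lambda c: -c[3]):
        if all(end <= s or start >= e for s, e, _, _ in chosen):
            chosen.append((start, end, label, score))
    return sorted(chosen)

cands = [
    (0, 3, "PER", 0.9),
    (2, 5, "LOC", 0.7),   # overlaps the PER span, lower score -> dropped
    (6, 9, "ORG", 0.8),
]
print(select_entities(cands))  # [(0, 3, 'PER', 0.9), (6, 9, 'ORG', 0.8)]
```

The paper's negotiation framework would refine such greedy choices; this sketch only shows the candidate-evaluation-then-selection shape of the problem.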
Citations: 24
Morphological Analysis of the Spontaneous Speech Corpus
Pub Date : 2002-08-24 DOI: 10.3115/1071884.1071903
Kiyotaka Uchimoto, Chikashi Nobata, Atsushi Yamada, S. Sekine, H. Isahara
This paper describes a project tagging a spontaneous speech corpus with morphological information such as word segmentation and parts-of-speech. We use a morphological analysis system based on a maximum entropy model, which is independent of the domain of corpora. In this paper we show the tagging accuracy achieved by using the model and discuss problems in tagging the spontaneous speech corpus. We also show that a dictionary developed for a corpus on a certain domain is helpful for improving accuracy in analyzing a corpus on another domain.
Citations: 15
A Robust Cross-Style Bilingual Sentences Alignment Model
Pub Date : 2002-08-24 DOI: 10.3115/1072228.1072237
T. Kueng, Keh-Yih Su
Most current sentence alignment approaches adopt sentence length and cognates as alignment features, and they are mostly trained and tested on documents of the same style. Since the length distribution, alignment-type distribution (used by length-based approaches) and cognate frequency vary significantly across texts with different styles, length-based approaches fail to achieve similar performance when tested on corpora of different styles. The experiments show that F-measure performance can drop from 98.2% to 85.6% when a length-based approach is trained on a technical manual and then tested on a general magazine. Since a large percentage of content words in the source text are translated into corresponding translation duals that preserve the meaning in the target text, transfer lexicons are usually regarded as more reliable cues for aligning sentences when the alignment task is performed by humans. To enhance robustness, a statistical model based on both transfer lexicons and sentence lengths is proposed in this paper. After integrating the transfer lexicons into the model, a 60% F-measure error reduction (from 14.4% to 5.8%) is observed.
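The length-based component such models build on can be sketched as a Gale & Church (1993)-style cost: a target segment is plausible if its length is close to what the source segment's length predicts. This is an illustrative stand-in, not the paper's model (which also integrates transfer lexicons); the mean and variance parameters here are invented placeholders:

```python
import math

# Illustrative length-based alignment cost in the Gale & Church spirit.
# c: expected target/source length ratio; s2: per-character variance.
# Both parameters are placeholders, not values from the paper.

def length_cost(src_len, tgt_len, c=1.0, s2=6.8):
    """Squared standardized deviation of tgt_len from the length the
    source segment predicts; smaller cost -> more plausible 1-1 pairing."""
    mean = src_len * c
    delta = (tgt_len - mean) / math.sqrt(max(src_len, 1) * s2)
    return delta * delta

# A well-matched pair costs less than a badly mismatched one.
print(length_cost(100, 102) < length_cost(100, 40))  # True
```

A dynamic-programming aligner would minimize the sum of such costs over candidate alignment beads; adding lexical evidence, as the paper does, changes the cost of each bead rather than the search itself.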
Citations: 6
Natural Language and Inference in a Computer Game
Pub Date : 2002-08-24 DOI: 10.3115/1072228.1072341
Malte Gabsdil, Alexander Koller, Kristina Striegnitz
We present an engine for text adventures - computer games with which the player interacts using natural language. The system employs current methods from computational linguistics and an efficient inference system for description logic to make the interaction more natural. The inference system is especially useful in the linguistic modules dealing with reference resolution and generation and we show how we use it to rank different readings in the case of referential and syntactic ambiguities. It turns out that the player's utterances are naturally restricted in the game scenario, which simplifies the language processing task.
Citations: 12
Structure Alignment Using Bilingual Chunking
Pub Date : 2002-08-24 DOI: 10.3115/1072228.1072238
Wei Wang, M. Zhou, Jin-Xia Huang, C. Huang
A new statistical method called "bilingual chunking" for structure alignment is proposed. Unlike existing approaches that align hierarchical structures such as sub-trees, our method conducts alignment on chunks. The alignment is performed through a simultaneous bilingual chunking algorithm. Using the constraints of chunk correspondence between the source language (SL) and target language (TL), our algorithm can dramatically reduce the search space, supports a time-synchronous DP algorithm, and leads to highly consistent chunking. Furthermore, by unifying POS tagging and chunking in the search process, our algorithm effectively alleviates the influence of POS tagging deficiencies on the chunking result. The experimental results on English-Chinese structure alignment show that our model can achieve 90% precision for chunking and 87% precision for chunk alignment.
Citations: 22
Meta-evaluation of Summaries in a Cross-lingual Environment using Content-based Metrics
Pub Date : 2002-08-24 DOI: 10.3115/1072228.1072301
Horacio Saggion, Dragomir R. Radev, Simone Teufel, Wai Lam
We describe a framework for the evaluation of summaries in English and Chinese using similarity measures. The framework can be used to evaluate extractive, non-extractive, single and multi-document summarization. We focus on the resources developed that are made available for the research community.
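A minimal example of the kind of content-based similarity measure such a framework relies on is cosine similarity over bag-of-words vectors; the tokenizer and sample texts below are illustrative, and a real evaluation would use the framework's own measures over English and Chinese summaries:

```python
import math
from collections import Counter

# Illustrative content-based metric: cosine similarity over
# bag-of-words vectors. Texts and tokenization are invented examples.

def cosine(a, b):
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    na = math.sqrt(sum(v * v for v in va.values()))
    nb = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

print(round(cosine("the cat sat", "the cat stood"), 3))  # 0.667
```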
Citations: 64
Looking for Candidate Translational Equivalents in Specialized, Comparable Corpora
Pub Date : 2002-08-24 DOI: 10.3115/1071884.1071904
Yun-Chuang Chiao, Pierre Zweigenbaum
Previous attempts at identifying translational equivalents in comparable corpora have dealt with very large 'general language' corpora and words. We address this task in a specialized domain, medicine, starting from smaller non-parallel, comparable corpora and an initial bilingual medical lexicon. We compare the distributional contexts of source and target words, testing several weighting factors and similarity measures. On a test set of frequently occurring words, for the best combination (the Jaccard similarity measure with or without tf.idf weighting), the correct translation is ranked first for 20% of our test words, and is found in the top 10 candidates for 50% of them. An additional reverse-translation filtering step improves the precision of the top candidate translation up to 74%, with a 33% recall.
Citations: 146
Learning Verb Argument Structure from Minimally Annotated Corpora
Pub Date : 2002-08-24 DOI: 10.3115/1072228.1072268
Anoop Sarkar, Woottiporn Tripasai
In this paper we investigate the task of automatically identifying the correct argument structure for a set of verbs. The argument structure of a verb allows us to predict the relationship between the syntactic arguments of a verb and their role in the underlying lexical semantics of the verb. Following the method described in (Merlo and Stevenson, 2001), we exploit the distributions of some selected features from the local context of a verb. These features were extracted from a 23M-word WSJ corpus based on part-of-speech tags and phrasal chunks alone. We constructed several decision tree classifiers trained on this data. The best performing classifier achieved an error rate of 33.4%. This work also shows that a subcategorization frame (SF) learning algorithm previously applied to Czech (Sarkar and Zeman, 2000) can be used to extract SFs in English. The extracted SFs are evaluated by classifying verbs into verb alternation classes.
Citations: 18
Syntactic Features for High Precision Word Sense Disambiguation
Pub Date : 2002-08-24 DOI: 10.3115/1072228.1072340
David Martínez, Eneko Agirre, Lluís Màrquez i Villodre
This paper explores the contribution of a broad range of syntactic features to WSD: grammatical relations coded as the presence of adjuncts/arguments in isolation or as subcategorization frames, and instantiated grammatical relations between words. We have tested the performance of syntactic features using two different ML algorithms (Decision Lists and AdaBoost) on the Senseval-2 data. Adding syntactic features to a basic set of traditional features improves performance, especially for AdaBoost. In addition, several methods to build arbitrarily high accuracy WSD systems are also tried, showing that syntactic features allow for a precision of 86% and a coverage of 26% or 95% precision and 8% coverage.
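One of the two learners the abstract mentions, the decision list, has a particularly simple shape: rules are ordered by reliability and the first matching feature decides the sense. A toy sketch with invented features and senses (not the paper's actual feature set):

```python
# Minimal decision-list classifier for WSD. Rules are tried in order of
# decreasing reliability (e.g. log-likelihood); first match wins.
# Features, senses, and the rule ordering below are invented examples.

def decision_list(features, rules, default):
    """features: set of active feature strings for one token in context.
    rules: list of (feature, sense) pairs sorted by descending reliability."""
    for feature, sense in rules:
        if feature in features:
            return sense
    return default

rules = [
    ("subject=river", "bank/GEO"),     # most reliable rule first
    ("collocate=money", "bank/FIN"),
]
print(decision_list({"collocate=money", "pos=NN"}, rules, "bank/FIN"))
# -> bank/FIN
```

The syntactic features the paper studies (adjuncts/arguments, subcategorization frames, instantiated grammatical relations) would simply enter the system as additional feature strings of this kind.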
Citations: 38