Latest papers: Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - NAACL '03

Weakly Supervised Natural Language Learning Without Redundant Views
Vincent Ng, Claire Cardie
We investigate single-view algorithms as an alternative to multi-view algorithms for weakly supervised learning for natural language processing tasks without a natural feature split. In particular, we apply co-training, self-training, and EM to one such task and find that both self-training and FS-EM, a new variation of EM that incorporates feature selection, outperform co-training and are comparatively less sensitive to parameter changes.
DOI: https://doi.org/10.3115/1073445.1073468
Published: 2003-05-27
Citations: 140
A Weighted Finite State Transducer Implementation of the Alignment Template Model for Statistical Machine Translation
Shankar Kumar, W. Byrne
We present a derivation of the alignment template model for statistical machine translation and an implementation of the model using weighted finite state transducers. The approach we describe allows us to implement each constituent distribution of the model as a weighted finite state transducer or acceptor. We show that bitext word alignment and translation under the model can be performed with standard FSM operations involving these transducers. One of the benefits of using this framework is that it obviates the need to develop specialized search procedures, even for the generation of lattices or N-Best lists of bitext word alignments and translation hypotheses. We evaluate the implementation of the model on the French-to-English Hansards task and report alignment and translation performance.
DOI: https://doi.org/10.3115/1073445.1073464
Published: 2003-05-27
Citations: 104
Statistical Sentence Condensation using Ambiguity Packing and Stochastic Disambiguation Methods for Lexical-Functional Grammar
S. Riezler, Tracy Holloway King, Dick Crouch, A. Zaenen
We present an application of ambiguity packing and stochastic disambiguation techniques for Lexical-Functional Grammars (LFG) to the domain of sentence condensation. Our system incorporates a linguistic parser/generator for LFG, a transfer component for parse reduction operating on packed parse forests, and a maximum-entropy model for stochastic output selection. Furthermore, we propose the use of standard parser evaluation methods for automatically evaluating the summarization quality of sentence condensation systems. An experimental evaluation of summarization quality shows a close correlation between the automatic parse-based evaluation and a manual evaluation of generated strings. Overall summarization quality of the proposed system is state-of-the-art, with guaranteed grammaticality of the system output due to the use of a constraint-based parser/generator.
DOI: https://doi.org/10.3115/1073445.1073471
Published: 2003-05-27
Citations: 100
Inducing History Representations for Broad Coverage Statistical Parsing
James Henderson
We present a neural network method for inducing representations of parse histories and using these history representations to estimate the probabilities needed by a statistical left-corner parser. The resulting statistical parser achieves performance (89.1% F-measure) on the Penn Treebank which is only 0.6% below the best current parser for this task, despite using a smaller vocabulary size and less prior linguistic knowledge. Crucial to this success is the use of structurally determined soft biases in inducing the representation of the parse history, with no use of hard independence assumptions.
DOI: https://doi.org/10.3115/1073445.1073459
Published: 2003-05-27
Citations: 109
Minimally Supervised Induction of Grammatical Gender
Silviu Cucerzan, David Yarowsky
This paper investigates the problem of determining grammatical gender for the nouns of a language starting with minimal resources: a very small list of seed nouns for which gender is known or via translingual projection of natural gender. We show that through a bootstrapping process that uses contextual clues from an unannotated corpus and morphological clues modeled with suffix tries, accurate gender predictions can be induced for five diverse test languages.
DOI: https://doi.org/10.3115/1073445.1073451
Published: 2003-05-27
Citations: 38