2011 International Conference on Asian Language Processing: Latest Papers

An Orientation Model for Hierarchical Phrase-Based Translation
Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.43
Xinyan Xiao, Jinsong Su, Yang Liu, Qun Liu, Shouxun Lin
Hierarchical phrase-based (HPB) translation exploits the power of grammar to perform long-distance reordering, but it neither specifies the orientation of nonterminals against adjacent blocks nor considers the lexical information covered by nonterminals. In this paper, we borrow the idea of an orientation model from phrase-based systems to enhance the reordering ability of HPB translation. We distinguish three orientations of a nonterminal (monotone, swap, discontinuous) based on the alignment of the grammar, and select the appropriate orientation of a nonterminal using the lexical information it covers. By incorporating the orientation model, our approach significantly outperforms a standard HPB system, by up to 1.02 BLEU on a large-scale NIST Chinese-English translation task and by 0.51 BLEU on a WMT German-English translation task.
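The three orientations named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact definition, so the code below assumes the standard phrase-based convention: the current block is compared with the block that precedes it on the target side, using inclusive source-side word spans. The function name `orientation` and the span representation are hypothetical.

```python
def orientation(prev_src_span, cur_src_span):
    """Classify the current block relative to the previous target-side block.

    Spans are inclusive (start, end) source-word indices. This follows the
    usual phrase-based definition, which is an assumption here and not
    necessarily the paper's formulation for nonterminals.
    """
    prev_start, prev_end = prev_src_span
    cur_start, cur_end = cur_src_span
    if prev_end + 1 == cur_start:
        # The current block directly follows the previous one on the source side.
        return "monotone"
    if cur_end + 1 == prev_start:
        # The current block directly precedes the previous one on the source side.
        return "swap"
    # The two blocks are not adjacent on the source side.
    return "discontinuous"


print(orientation((0, 1), (2, 3)))
print(orientation((2, 3), (0, 1)))
print(orientation((0, 1), (3, 4)))
```

In an orientation model of this kind, the label would then be predicted from lexical features of the words covered by the block; that scoring step is not sketched here.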
Citations: 6
Discourse Structures of English Exposition
Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.51
Donghong Liu, Meizhen Liao
Van Kuppevelt's approach to discourse structure holds that topicality is the general organizing principle. His discourse structure, consisting of MAIN STRUCTURE and SIDE STRUCTURE, has been applied to conversations and discourse segmentation rather than to expository essays. In this paper, two discourse structures for expository essays are proposed within Van Kuppevelt's framework. The proposed structures can be used to test the soundness of the ways in which expository essays are developed.
Citations: 0
Development of Acoustic Space in 3 to 5 Years Old Hindi Speaking Children
Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.41
Vaishna Narang, Garima Dalal, Deepshikha Misra
Many studies have described the acoustics of speech with a focus on adults, but only a few have analyzed children's speech. Studies of language development in children often include their articulatory speech patterns, but the process of speech development is only partially understood. This research focuses on the development of the acoustic (vowel) space in Hindi-speaking children. The study assumes that the acoustic space is continuously redefined and modified in order to achieve and maintain a certain perceptual contrast, and it explores how the acoustic space develops in children from three to five years of age. This acoustic study of seven peripheral vowels of Hindi uses the first two formants of the vowels to arrive at a graphic representation of the acoustic space for peripheral vowels as articulated by the subjects under study. The area of the vowel space is then calculated using the Irregular Polygon Area Calculator. The development of the acoustic space in six Hindi-speaking children from 3 to 5 years of age shows interesting results, which are presented in this paper.
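The area of an irregular polygon, such as a vowel space plotted from first and second formant values, can be computed with the shoelace formula. The sketch below is only an illustration of that calculation; it is not the tool named in the abstract, and the function name `polygon_area` is hypothetical. It assumes the vertices are listed in order around the perimeter and that the polygon does not cross itself.

```python
def polygon_area(points):
    """Return the area of a simple polygon via the shoelace formula.

    points: list of (x, y) vertices in perimeter order, for example
    (F2, F1) pairs for each peripheral vowel.
    """
    n = len(points)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x1 * y2 - x2 * y1
    # The sign depends on the vertex direction, so take the absolute value.
    return abs(twice_area) / 2.0


# A 4-by-3 right triangle has area 6.
print(polygon_area([(0, 0), (4, 0), (0, 3)]))  # 6.0
```

With formant data, the result is in squared frequency units (Hz² if both formants are given in Hz).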
Citations: 2
Context-Based Persian Multi-document Summarization (Global View)
Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.53
Asef Poormasoomi, M. Kahani, Saeed Varasteh Yazdi, Hossein Kamyar
Multi-document summarization is the automatic extraction of information from multiple documents on the same topic. This paper proposes a new method for Persian that uses LSA to extract the global context of a topic and removes sentence redundancy using SRL and WordNet semantic similarity. Previous approaches focused on sentence features (the local view), treating the sentence as the main and basic unit of text. In this paper, sentences are selected based on the main context hidden in all the documents of a topic. The experimental results show that our proposed method outperforms other Persian multi-document summarization systems.
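The idea of scoring sentences against a global context can be sketched with a much-reduced form of LSA. The code below builds a term-by-sentence count matrix and uses power iteration to approximate only its dominant right singular vector, then treats each sentence's component as its score. This is an assumption-laden simplification: the abstract does not describe the paper's actual LSA configuration, the redundancy removal with SRL and WordNet is not shown, and whitespace tokenization stands in for a real Persian tokenizer. The name `sentence_scores` is hypothetical.

```python
import math


def sentence_scores(sentences, iterations=100):
    """Score sentences by their weight in the dominant latent topic."""
    vocab = sorted({w for s in sentences for w in s.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    # A[i][j] = count of term i in sentence j.
    A = [[0.0] * len(sentences) for _ in vocab]
    for j, s in enumerate(sentences):
        for w in s.lower().split():
            A[index[w]][j] += 1.0
    # Power iteration on A^T A converges to the top right singular vector.
    v = [1.0] * len(sentences)
    for _ in range(iterations):
        u = [sum(row[j] * v[j] for j in range(len(v))) for row in A]  # u = A v
        v = [sum(A[i][j] * u[i] for i in range(len(A)))
             for j in range(len(v))]  # v = A^T u
        norm = math.sqrt(sum(x * x for x in v))
        if norm == 0.0:
            return [0.0] * len(sentences)
        v = [x / norm for x in v]
    return v


scores = sentence_scores(["cat sat mat", "cat sat hat", "dog"])
# The two sentences that share vocabulary dominate the main topic.
print(scores)
```

A fuller treatment would keep several singular vectors and pool sentences from all documents of the topic into one matrix.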
Citations: 8
Using HTML Tags to Improve Parallel Resources Extraction
Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.23
Yanhui Feng, Yu Hong, Wei Tang, Jianmin Yao, Qiaoming Zhu
This paper proposes a new approach to extracting parallel resources (including bilingual sentences and bilingual terms) from bilingual web pages, which have a primary language and a secondary language (the secondary-language text is often a translation of the primary-language text). Our method is composed of four tasks: 1) parsing the web page into a DOM tree and segmenting the inner text of each node into a series of monolingual snippets; 2) selecting adjacent snippet pairs that are in different languages and have higher translation scores as seeds for the next task; 3) constructing comprehensive wrappers from the selected seeds, which save both HTML and surface formatting styles; 4) mining candidate instances and selecting good instances by their similarity to the seeds. In this paper, we first propose to segment text by HTML tags, and we select potential parallel resources by ranking all extracted candidates. According to the experimental results, our method can also be applied to bilingual pages written in any other pair of languages. The experimental results further show that our approach is effective in improving parallel resource extraction.
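The first of the four tasks above, segmenting text by HTML tags into monolingual snippets, can be sketched with Python's standard `html.parser` module. This is a rough illustration under stated assumptions: it uses an event-based parser instead of a full DOM tree, it separates languages only by script (CJK ideographs versus everything else), and the names `TextCollector` and `monolingual_snippets` are hypothetical. Seed selection, wrapper construction, and candidate ranking are not shown.

```python
import re
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect the non-empty text chunks that appear between tags."""

    def __init__(self):
        super().__init__()
        self.texts = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.texts.append(text)


# A run of CJK ideographs, or a run of anything else.
SCRIPT_RUN = re.compile(r"[\u4e00-\u9fff]+|[^\u4e00-\u9fff]+")


def monolingual_snippets(html):
    """Return (script_label, snippet) pairs in page order."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    snippets = []
    for text in parser.texts:
        for run in SCRIPT_RUN.findall(text):
            run = run.strip()
            if run:
                label = "zh" if "\u4e00" <= run[0] <= "\u9fff" else "other"
                snippets.append((label, run))
    return snippets


print(monolingual_snippets("<p>Hello world 你好世界</p><b>Test</b>"))
```

Adjacent snippets with different labels, such as `Hello world` and `你好世界` here, are the kind of candidate pairs that the second task would score as potential translations.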
Citations: 5
Applying Grapheme, Word, and Syllable Information for Language Identification in Code Switching Sentences
Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.34
Y. Yeong, T. Tan
In this paper, we propose an automatic language identification approach for code-switching sentences that uses the morphological structure and sequence of syllables. The approach was tested on Malay-English code-switching sentences. The proposed language identification approach achieves an accuracy of 90.75% on the vocabulary. Our approach was further improved by combining knowledge from other levels of the sentence: word and alphabet. This additional information raises the accuracy of our language identification method to 96.36%.
Citations: 10
Linear Regression for Prosody Prediction via Convex Optimization
Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.75
Ling Cen, M. Dong, P. Chan
In this paper, an L1-regularized linear regression method is proposed to model the relationship between linguistic features and prosodic parameters in Text-to-Speech (TTS) synthesis. By formulating prosodic prediction as a convex problem, it can be solved using very efficient numerical methods. The performance can be similar to that of the Classification and Regression Tree (CART), a widely used approach for prosodic prediction. However, the computational load can be as low as 76% of that required by CART.
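L1-regularized linear regression minimizes $\frac{1}{2n}\lVert y - Xw \rVert_2^2 + \lambda \lVert w \rVert_1$, which is convex in $w$. The abstract does not name the solver used in the paper, so the sketch below substitutes cyclic coordinate descent with soft thresholding, one common choice for this objective. The function names, the toy data, and the value of `lam` are all hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold, clipping at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, iterations=200):
    """Minimize (1/2n)*||y - Xw||^2 + lam*||w||_1 by coordinate descent.

    X is a list of feature rows, y a list of targets, lam the L1 weight.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(iterations):
        for j in range(d):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # mean squared value of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(d) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            w[j] = soft_threshold(rho, lam) / z if z > 0 else 0.0
    return w


# Toy data: the second feature carries almost no signal.
X = [[1, 0], [0, 1], [1, 0], [0, 1]]
y = [2, 0.05, 2, 0.05]
print(lasso_coordinate_descent(X, y, lam=0.1))
```

On this toy input the weak second feature is driven exactly to zero, which illustrates the sparsity that the L1 penalty encourages; a sparse weight vector also keeps prediction cheap, since only the nonzero features need to be evaluated.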
Citations: 0
WordNet Editor to Refine Indonesian Language Lexical Database
Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.59
Gunawan, J. Wijoyo, I. K. E. Purnama, M. Hariadi
This paper describes an approach to editing the Indonesian Language Lexical Database, especially the noun category and its relations. The purpose of the editor is to refine the Indonesian Lexical Database that was developed in our previous research. The editor's visualization uses a graph library with some modifications and additions. Furthermore, the editor will be web-based so that everyone can participate in improving the Indonesian Language Lexical Database. An administrator role accepts or rejects the changes suggested by members. We believe that this editing approach can also be used to improve WordNets developed for other languages.
Citations: 2
The Context Imperative Sentences of Modern Chinese
Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.48
Hao Zhao, Kaihong Yang
Current research on the imperative-expression category in modern Chinese mainly focuses on the form of mood imperative sentences in the general sense. Besides mood imperative sentences, imperative expression also includes sentences that have imperative functions in context. These sentences can be divided into omissive-imperative sentences and zero-imperative sentences according to how explicit the imperative command is. This paper mainly examines the types and the information-transmitting characteristics of these two kinds of imperative sentences.
Citations: 0
Co-reference Resolution in Vietnamese Documents Based on Support Vector Machines
Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.63
Duc-Trong Le, Mai-Vu Tran, Tri-Thanh Nguyen, Quang-Thuy Ha
The co-reference resolution task still poses many challenges due to the complexity of the Vietnamese language and the lack of standard Vietnamese linguistic resources. Based on the mention-pair model of Rahman and Ng (2009) and the characteristics of Vietnamese, this paper proposes a model that uses support vector machines (SVM) to resolve co-reference in Vietnamese documents. The corpus used in the experiments to evaluate the proposed model was constructed from 200 articles in the cultural and social categories of the vnexpress.net newspaper website. In initial experiments, the proposed model achieved 76.51% accuracy, compared with 73.79% for a baseline model with similar features.
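In a mention-pair model, each training instance is a pair of mentions with a label saying whether they refer to the same entity, and a binary classifier such as an SVM is trained on those pairs. The sketch below shows only the instance-generation step, under simplifying assumptions: every mention is paired with every preceding mention, and the two features (`exact_match`, `distance`) are placeholders, not the feature set used in the paper. The function name and the example mentions are hypothetical.

```python
def mention_pairs(mentions):
    """Build labeled (features, label) pairs from annotated mentions.

    mentions: list of (text, entity_id) tuples in document order.
    The label is 1 if the two mentions share an entity id, otherwise -1.
    """
    pairs = []
    for j in range(1, len(mentions)):
        for i in range(j):
            antecedent, anaphor = mentions[i], mentions[j]
            features = {
                # Placeholder features chosen for illustration only.
                "exact_match": antecedent[0].lower() == anaphor[0].lower(),
                "distance": j - i,
            }
            label = 1 if antecedent[1] == anaphor[1] else -1
            pairs.append((features, label))
    return pairs


# Hypothetical annotated mentions: the first two refer to the same entity.
example = [("Hà Nội", "e1"), ("thủ đô", "e1"), ("Huế", "e2")]
for features, label in mention_pairs(example):
    print(features, label)
```

The resulting feature dictionaries would be converted to numeric vectors before being passed to an SVM implementation; at test time, the pairwise decisions are then merged into co-reference chains.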
Citations: 2