2011 International Conference on Asian Language Processing: Latest Papers

Acoustic Space in Motor Disorders of Speech: Two Case Studies
Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.25
Vaishna Narang, Deepshikha Misra, Garima Dalal
Studies on acoustic space have strengthened the view that vowels are acoustically and perceptually defined in terms of their relative positioning in vowel space. Every speaker identifies an optimal vowel space within which perceptual and phonological contrast is maintained. This is an interdisciplinary study involving speech pathology, the physics of speech, and the neurology of speech. The paper presents two case studies of dysarthria, one of Parkinson's disease (PD) and one of acute ischemic stroke, with age-, gender-, and language-matched controls. A detailed acoustic analysis shows that acoustic space is considerably reduced in both PD and stroke, and that it is modified very differently in these two very different kinds of dysarthria. The study also examines the third formant to show that the higher formants are consistently lowered in both PD and stroke. Hypokinetic speech production in these cases is reflected in lower intensity. The results have significant applications in clinical acoustics and in the theoretical fields of neurology of speech, linguistics, and phonology.
Citations: 1
Issues with the Unergative/Unaccusative Classification of the Intransitive Verbs
Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.54
Nitesh Surtani, Khushboo Jha, Soma Paul
The paper abandons a strict two-way sub-classification of intransitive verbs into unaccusative and unergative for Hindi and proposes plotting their distribution in a diffusion chart. The diagnostic tests that Bhatt (2003) applied to Hindi data are ranked by how efficiently they attribute the correct sub-class to verbs. The diffusion chart shows that a tripartite classification handles the classification of intransitive verbs better than the classical binary approach. The tripartite classification is as follows: (1) verbs that take an animate subject and are compatible with adverbs of volitionality; (2) verbs that take an animate subject but are not compatible with adverbs of volitionality; and (3) verbs that take an inanimate subject. The classification is of immense advantage for various NLP tasks such as machine translation and natural language generation.
Citations: 3
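The three-way split listed in the abstract above can be read as a small decision rule over two features of a verb. The sketch below is only an illustration of that reading: the feature names, the enum, and the function are hypothetical, and the abstract does not say how the paper itself derives these features from the diagnostic tests.

```python
from enum import Enum


class IntransitiveClass(Enum):
    # Class numbers follow the (1)-(3) ordering used in the abstract.
    ANIMATE_VOLITIONAL = 1      # animate subject, compatible with volitionality adverbs
    ANIMATE_NON_VOLITIONAL = 2  # animate subject, not compatible with such adverbs
    INANIMATE = 3               # inanimate subject


def classify_intransitive(animate_subject: bool,
                          allows_volitional_adverb: bool) -> IntransitiveClass:
    """Map two (assumed, pre-computed) verb features to one of the three classes."""
    if not animate_subject:
        # Adverb compatibility is not consulted for inanimate subjects.
        return IntransitiveClass.INANIMATE
    if allows_volitional_adverb:
        return IntransitiveClass.ANIMATE_VOLITIONAL
    return IntransitiveClass.ANIMATE_NON_VOLITIONAL
```

The function takes feature values rather than verbs because the mapping from a Hindi verb to these features is exactly what the paper's ranked diagnostics are meant to supply.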
Joint Decoding for Chinese Word Segmentation and POS Tagging Using Character-Based and Word-Based Discriminative Models
Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.24
Xinxin Li, Xuan Wang, Lin Yao
For the problem of Chinese word segmentation and POS tagging, both character-based and word-based discriminative approaches can be used. Experiments show that these two approaches make different errors and can complement each other. In this paper, we propose a joint decoding model that combines character-based and word-based models using a multi-beam search algorithm. Experimental results show that the joint decoding model outperforms the character-based and word-based baseline models.
Citations: 3
The Phoneme-Level Articulator Dynamics for Pronunciation Animation
Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.13
Sheng Li, Lan Wang, En Qi
Speech visualization can be extended to the task of pronunciation animation for language learners. In this paper, a three-dimensional English articulation database is recorded using a Carstens Electro-Magnetic Articulograph (EMA AG500). An HMM-based visual synthesis method for continuous speech is implemented to recover 3D articulatory information. The synthesized articulations are then compared to the EMA recordings for objective evaluation. Using a data-driven 3D talking head, the distinctions between confusable phonemes can be depicted through both external and internal articulatory movements. The experiments demonstrate that HMM-based synthesis with limited training data can achieve a minimum RMS error of less than 2 mm. The synthesized articulatory movements can be used for computer-assisted pronunciation training.
Citations: 5
Non-native Accent Pronunciation Modeling in Automatic Speech Recognition
Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.65
Basem H. A. Ahmed, T. Tan
In this paper, we propose an approach to modeling the pronunciation of non-native accented speech for automatic speech recognition systems. The proposed method consists of two phases: phone adaptation and pronunciation generalization. In phone adaptation, we identify the phones used by non-native speakers, compare them with the standard phones, and then remove the mismatch that results from mother-tongue influence. In pronunciation generalization, we predict how non-native speakers pronounce words. The results show that the proposed approach reduces the WER from 44.8% to 41.9%.
Citations: 6
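To put the reported WER figures in context, the sketch below uses the standard definition of word error rate (word-level edit distance divided by reference length) and works out the absolute and relative reduction implied by 44.8% and 41.9%. The function names are illustrative only; the abstract does not describe the paper's scoring setup.

```python
from typing import Sequence


def word_error_rate(reference: Sequence[str], hypothesis: Sequence[str]) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    if not reference:
        raise ValueError("reference must contain at least one word")
    prev = list(range(len(hypothesis) + 1))  # distances against an empty reference
    for i, ref_word in enumerate(reference, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cur.append(min(
                prev[j] + 1,                           # deletion
                cur[j - 1] + 1,                        # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = cur
    return prev[-1] / len(reference)


def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction of the baseline error removed by the improved system."""
    return (baseline - improved) / baseline


# Figures taken from the abstract (in percent).
absolute_drop = 44.8 - 41.9                     # about 2.9 percentage points
relative_drop = relative_reduction(44.8, 41.9)  # about 0.065, i.e. roughly 6.5% relative
```

Reporting both numbers matters because a 2.9-point absolute drop from a high baseline corresponds to a fairly modest relative gain.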
Error-Driven Adaptive Language Modeling for Chinese Pinyin-to-Character Conversion
Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.46
J. Huang, D. Powers
The performance of Chinese Pinyin-to-Character conversion is severely affected when the characteristics of the training and conversion data differ. As natural language is highly variable and uncertain, it is impossible to build a complete and general language model that suits all tasks. Traditional adaptive MAP models mix task-independent data with task-dependent data using a mixture coefficient, but we can never predict what style of language users have or what new domains will appear. This paper presents a statistical error-driven adaptive language modeling approach for Chinese Pinyin input systems. The model can be adapted incrementally whenever an error occurs during Pinyin-to-Character conversion. It significantly improves the Pinyin-to-Character conversion rate.
Citations: 0
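The traditional mixing that the abstract contrasts itself with can be pictured as weighted count merging. The sketch below is a minimal unigram illustration of that baseline idea only, not the paper's error-driven method; the function name, the weight parameter, and the toy counts are all assumptions made for illustration.

```python
from collections import Counter


def mixed_unigram_prob(word: str,
                       general_counts: Counter,
                       task_counts: Counter,
                       weight: float) -> float:
    """Unigram probability after adding task counts scaled by a mixture weight.

    weight = 0 ignores the task-dependent data; larger values trust it more.
    """
    numerator = general_counts[word] + weight * task_counts[word]
    denominator = sum(general_counts.values()) + weight * sum(task_counts.values())
    if denominator == 0:
        raise ValueError("no counts available")
    return numerator / denominator


# Toy, made-up counts for two characters that share the pinyin "shi".
general = Counter({"是": 3, "事": 1})
task = Counter({"事": 2})

p_general_only = mixed_unigram_prob("事", general, task, weight=0.0)  # 1/4
p_adapted = mixed_unigram_prob("事", general, task, weight=2.0)       # (1+4)/(4+4) = 5/8
```

The weight has to be fixed in advance, which is the limitation the abstract points to: the right value depends on a user style or domain that cannot be predicted.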
Extracting Pseudo-Labeled Samples for Sentiment Classification Using Emotion Keywords
Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.61
Sophia Yat-Mei Lee, Daming Dai, Shoushan Li, K. Ahrens
Sentiment and emotion analysis have been traditionally established as independent research topics in NLP. Although they are two important aspects of subjective information and are closely related, there have been few attempts to combine the two analyses. As a preliminary attempt, we integrate emotion information into sentiment analysis by employing emotion keywords to help automatically extract pseudo-labeled samples. The extracted pseudo-labeled samples are then used as the initial training data to perform semi-supervised learning for sentiment classification. Experimental results across four domains show that our approach using emotion keywords is capable of extracting pseudo-labeled samples with high precision (about 90%). Moreover, the pseudo-labeled samples along with the semi-supervised learning approach further improve the classification performance.
Citations: 1
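The keyword-based extraction step described in the abstract above might look roughly like the following. This is a sketch under stated assumptions: the two keyword sets, the tokenizer, and the rule of skipping texts that match both polarities are hypothetical choices, since the abstract does not give the paper's lexicon or matching rules.

```python
import re
from typing import Iterable, List, Optional, Tuple

# Hypothetical emotion keyword lists; the paper's actual lexicon is not given here.
POSITIVE_EMOTION = {"happy", "delighted", "joy"}
NEGATIVE_EMOTION = {"angry", "sad", "disappointed"}


def pseudo_label(text: str) -> Optional[str]:
    """Return a sentiment pseudo-label, or None when the keywords give no clear signal."""
    tokens = set(re.findall(r"[a-z']+", text.lower()))
    has_positive = bool(tokens & POSITIVE_EMOTION)
    has_negative = bool(tokens & NEGATIVE_EMOTION)
    if has_positive and not has_negative:
        return "positive"
    if has_negative and not has_positive:
        return "negative"
    return None  # no keyword, or conflicting keywords: leave unlabeled


def extract_pseudo_labeled(texts: Iterable[str]) -> List[Tuple[str, str]]:
    """Keep only the texts that received a pseudo-label."""
    labeled = []
    for text in texts:
        label = pseudo_label(text)
        if label is not None:
            labeled.append((text, label))
    return labeled
```

The labeled pairs would then seed a semi-supervised learner, with the unlabeled remainder used in later iterations.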
A Query Reformulation Model Using Markov Graphic Method
Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.62
Jiali Zuo, Mingwen Wang
Information retrieval models still cannot achieve satisfactory performance after decades of development. One reason is that queries cannot express the information need precisely. Research has shown that query reformulation can improve the performance of retrieval models. In this paper, we propose a query reformulation model that uses a Markov network to represent term relationships and obtains useful information from the corpus to reformulate the query. Experimental results show that our model can avoid topic drift and thereby improve retrieval performance.
Citations: 2
Research on Multi-document Summarization Model Based on Dynamic Manifold-Ranking
Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.55
Meiling Liu, Honge Ren, Dequan Zheng, T. Zhao
This paper introduces a model that describes the dynamic evolution of network information by identifying and analyzing document collections on the same topic at different stages. To characterize the dynamic relationship between evolving content differences, the paper presents a dynamic multi-document summarization model called the Dynamic Manifold-Ranking Model (DMRM). Experiments were conducted on the Update Task test data from TAC2008, and the results of the new model were compared with the results of the TAC2008 evaluation. This comparison demonstrated the effectiveness of the model.
Citations: 0
An Integrated Approach Using Conditional Random Fields for Named Entity Recognition and Person Property Extraction in Vietnamese Text
Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.37
Hoang-Quynh Le, Mai-Vu Tran, Nhat-Nam Bui, N. Phan, Quang-Thuy Ha
Personal names are among the most frequently searched items in web search engines, and a person entity is always associated with numerous properties. In this paper, we propose an integrated model for Vietnamese that recognizes person entities and simultaneously extracts the values of a pre-defined set of properties related to each person. We also design a rich feature set using various kinds of knowledge resources and apply the well-known machine learning method of conditional random fields (CRFs) to improve the results. The results show that our method is suitable for Vietnamese, with average scores of 84% precision, 82.56% recall, and 83.39% F-measure. Moreover, the running time is good, and the results also show the effectiveness of our feature set.
Citations: 4
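As a quick check on how the three reported scores relate, the sketch below applies the standard F-measure formula to the precision and recall from the abstract. The function name is illustrative. The harmonic mean of the two averages comes out near 83.27, slightly below the reported 83.39; one possible reading, which is an assumption and not something the abstract states, is that the F-measure was averaged over individual properties or runs, not computed from the averaged precision and recall.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Standard F-beta score; beta = 1 gives the harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0
    beta_sq = beta * beta
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


# Precision and recall as reported in the abstract (in percent).
f1_from_averages = f_measure(84.0, 82.56)  # about 83.27
gap_to_reported = 83.39 - f1_from_averages  # about 0.12 points
```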