2011 International Conference on Asian Language Processing最新文献

英文中文

Exploring Both Flat and Structured Features for Number Type Identification of Chinese Personal Noun Phrases 汉语人称名词短语数型识别的平面化与结构化特征探讨

2011 International Conference on Asian Language Processing

Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.69

Jun Lang

Different from English, Chinese does not explicitly show grammatical number information by inflection. The Number information in a Chinese sentence is implied by the noun phrase itself and its surrounding context. In this paper, we explore diverse features, including both flat and structured, for number identification of Chinese personal noun phrase. The flat features explore the knowledge within the noun phrase while the structured features capture the surrounding context information of the noun phrase in the parse tree of the given sentence. These two kinds of features together with kernel-based SVM are utilized in this study. Evaluation on the ACE 2005 corpus shows that our method achieves 89.23% in accuracy, which significantly advances the state-of-the-art.

与英语不同的是，汉语并不通过屈折变化来明确表达语法数字信息。汉语句子中的数字信息是由名词短语本身及其周围的语境所暗示的。本文探讨了汉语人称名词短语数字识别的平面化和结构化特征。扁平特征探索名词短语中的知识，而结构化特征在给定句子的解析树中捕获名词短语的周围上下文信息。本文将这两种特征与基于核的支持向量机结合使用。对ACE 2005语料库的评估表明，该方法的准确率达到89.23%，大大提高了目前的研究水平。

引用次数: 1

Mining Parallel Data from Comparable Corpora via Triangulation 利用三角剖分法从可比语料库中挖掘并行数据

2011 International Conference on Asian Language Processing

Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.57

T. Do, E. Castelli, L. Besacier

This paper improves an unsupervised method for extracting parallel sentence pairs from a comparable corpus by using the triangulation through a third language. Before, an unsupervised method for extracting parallel sentence pairs from a comparable corpus has been proposed. This method is based on technique of cross-language information retrieval with iterative process and requires no more additional parallel data. The method has been validated on the Vietnamese-French and Vietnamese-English bilingual data. In this paper, we address the problem of using triangulation through a third language to improve the parallel data mining processes: English is used in the Vietnamese-French parallel data mining process, and French is used in the Vietnamese-English parallel data mining process. The experiments conducted show that using triangulation can improve the quality of the extracted data and the quality of the translation system as well.

本文利用第三语言的三角剖分，改进了一种从可比语料库中提取平行句对的无监督方法。在此之前，已经提出了一种从可比语料库中提取平行句对的无监督方法。该方法基于迭代过程的跨语言信息检索技术，不需要额外的并行数据。该方法在越法和越英双语数据上进行了验证。在本文中，我们解决了通过第三语言使用三角测量来改进并行数据挖掘过程的问题:在越南语-法语并行数据挖掘过程中使用英语，在越南语-英语并行数据挖掘过程中使用法语。实验结果表明，使用三角剖分可以提高提取数据的质量和翻译系统的质量。

引用次数: 5

Improving Chinese Dependency Parsing with Self-Disambiguating Patterns 用自消歧模式改进汉语依存句法分析

2011 International Conference on Asian Language Processing

Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.36

Likun Qiu, Lei Wu, Kai Zhao, Changjian Hu, Lingpeng Kong

To solve the data sparseness problem in dependency parsing, most previous studies used features constructed from large-scale auto-parsed data. Unlike previous work, we propose a new approach to improve dependency parsing with context-free dependency triples (CDT) extracted by using self-disambiguating patterns (SDP). The use of SDP makes it possible to avoid the dependency on a baseline parser and explore the influence of different types of substructures one by one. Additionally, taking the available CDTs as seeds, a label propagation process is used to tag a large number of unlabeled word pairs as CDTs. Experiments show that, when CDT features are integrated into a maximum spanning tree (MST) dependency parser, the new parser improves significantly over the baseline MST parser. Comparative results also show that CDTs with dependency relation labels perform much better than CDT without dependency relation label.

为了解决依赖解析中的数据稀疏性问题，以往的研究大多采用大规模自动解析数据构建特征。与以前的工作不同，我们提出了一种新的方法，通过使用自消歧模式(SDP)提取上下文无关的依赖三元组(CDT)来改进依赖解析。使用SDP可以避免对基线解析器的依赖，并逐个探索不同类型子结构的影响。另外，以可用的cdt为种子，通过标签传播过程将大量未标记的词对标记为cdt。实验表明，当CDT特征集成到最大生成树(MST)依赖解析器中时，新的解析器比基线MST解析器有了显著的改进。对比结果还表明，带依赖关系标签的CDT比不带依赖关系标签的CDT性能要好得多。

引用次数: 0

An Experimental Study on Vietnamese Speech Synthesis 越南语语音合成的实验研究

2011 International Conference on Asian Language Processing

Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.40

Liping Kui, Jian Yang, Bin He, Enxing Hu

The modern Vietnamese is a monosyllabic tone language. Each syllable can be marked with initial, final and tone. In this paper, Vietnamese speech synthesis system is realized by using a trainable HMM-based speech synthesis method. The basic synthesis units of this system are initials and finals. According to the characteristics of Vietnamese, we have conducted such works as collecting corpus, recording, labeling, determining the phonemes list, and designing context attributes and question set. Then Vietnamese speech synthesis system is constructed by using the STRAIGHT synthesizer under the HTS platform. At last, we conduct a subjective test to synthetic speech signals. The results of preliminary evaluation show that the intelligibility of the utterances is approximately 100%, and the quality of synthesis speech is from fair to good.

现代越南语是一种单音节语调语言。每个音节都可以标上声母、韵母和声调。本文采用一种可训练的基于hmm的语音合成方法实现了越南语语音合成系统。这个系统的基本合成单位是声母和韵母。根据越南语的特点，我们进行了语料库收集、记录、标注、确定音素表、设计语境属性和问题集等工作。然后利用HTS平台下的STRAIGHT合成器构建越南语语音合成系统。最后，对合成的语音信号进行主观测试。初步评价结果表明，语音的可理解度约为100%，合成语音的质量从一般到良好。

引用次数: 1

How Vietnamese Attitudes can be Recognized and Confused: Cross-Cultural Perception and Speech Prosody Analysis 越南语的态度如何被识别和混淆:跨文化感知和语音韵律分析

2011 International Conference on Asian Language Processing

Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.39

Dang-Khoa Mac, E. Castelli, V. Aubergé, A. Rilliard

Prosodic attitudes, or social affects, are main part of face-to-face interaction and linked to the language through the culture. This paper presents a study on prosodic attitudes in Vietnamese, a tonal language. Perception experiments on 16 Vietnamese attitudes were carried out with Vietnamese and French participants. The results revealed perception differences between native and non-native listeners. As attitudinal expression are partially carried through speech prosody, an analysis was also carried out, in order to have a better understanding of why these attitudes are recognized or confused, and to bring out some prosodic characteristics of Vietnamese social affects.

韵律态度或社会影响是面对面交流的重要组成部分，并通过文化与语言联系在一起。本文对声调语言越南语的韵律态度进行了研究。对越南和法国的参与者进行了16种越南态度的感知实验。结果揭示了母语和非母语听众之间的感知差异。由于态度的表达部分是通过言语韵律来进行的，为了更好地理解这些态度被认可或混淆的原因，并揭示越南社会情感的一些韵律特征，我们也进行了分析。

引用次数: 5

2011 International Conference on Asian Language Processing

Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.44

Diem Thi Hoang Le, AiTi Aw

Transliteration is the transformation of word in original language to another language based on its pronunciation. Back transliteration is the transformation of already transliterated word in another language back to its original form. This backward process is in nature more challenging than the forward direction because of more information lost. In many cases, the back transliteration can return almost exact result, which has a minor difference in spelling compared with the original word form. We propose in this work a lexical word similarity for dictionary matching in order to re-rank the candidates and enhance the performance of a grapheme-based location name back transliteration. This method is experimented on Vietnamese-English language pair and showed improvement.

音译是将原语中的单词根据其发音转换成另一种语言的过程。回音译是指将另一种语言中已经音译的单词转换成原形式。这种向后的过程在本质上比向前的过程更具挑战性，因为会丢失更多的信息。在许多情况下，反向音译可以返回几乎准确的结果，与原始单词形式相比，拼写略有不同。在这项工作中，我们提出了词典匹配的词汇词相似度，以便重新排序候选词并提高基于字素的位置名称反音译的性能。该方法在越英语对上进行了实验，取得了较好的效果。

引用次数: 1

Extracting Pseudo-Labeled Samples for Sentiment Classification Using Emotion Keywords 基于情感关键词的伪标记样本情感分类

2011 International Conference on Asian Language Processing

Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.61

Sophia Yat-Mei Lee, Daming Dai, Shoushan Li, K. Ahrens

Sentiment and emotion analysis have been traditionally established as independent research topics in NLP. Although they are two important aspects of subjective information and are closely related, there have been few attempts to combine the two analyses. As a preliminary attempt, we integrate emotion information into sentiment analysis by employing emotion keywords to help automatically extract pseudo-labeled samples. The extracted pseudo-labeled samples are then used as the initial training data to perform semi-supervised learning for sentiment classification. Experimental results across four domains show that our approach using emotion keywords is capable of extracting pseudo-labeled samples with high precision (about 90%). Moreover, the pseudo-labeled samples along with the semi-supervised learning approach further improve the classification performance.

情感和情绪分析历来是自然语言处理中一个独立的研究课题。虽然它们是主观信息的两个重要方面，并且密切相关，但很少有人尝试将这两种分析结合起来。作为初步尝试，我们将情感信息整合到情感分析中，利用情感关键词帮助自动提取伪标签样本。然后将提取的伪标记样本用作初始训练数据，进行半监督学习以进行情感分类。跨四个领域的实验结果表明，我们使用情感关键词的方法能够以较高的精度(约90%)提取伪标记样本。此外，伪标记样本和半监督学习方法进一步提高了分类性能。

引用次数: 1

Joint Decoding for Chinese Word Segmentation and POS Tagging Using Character-Based and Word-Based Discriminative Models 基于字符和词判别模型的汉语分词和词性标注联合译码

2011 International Conference on Asian Language Processing

Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.24

Xinxin Li, Xuan Wang, Lin Yao

For Chinese word segmentation and POS tagging problem, both character-based and word-based discriminative approaches can be used. Experiments show that these two approaches bring different errors and can complement each other. In this paper, we propose a joint decoding model based on both character-based and word-based models using multi-beam search algorithm. Experimental results show that the joint decoding model outperforms character-based and word-based baseline models.

对于汉语分词和词性标注问题，可以采用基于字符的判别方法和基于词的判别方法。实验表明，这两种方法误差不同，可以互补。本文提出了一种基于多波束搜索算法的基于字符和词的联合解码模型。实验结果表明，联合解码模型优于基于字符和基于单词的基线模型。

引用次数: 3

Issues with the Unergative/Unaccusative Classification of the Intransitive Verbs 不及物动词的非否定/非宾格分类问题

2011 International Conference on Asian Language Processing

Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.54

Nitesh Surtani, Khushboo Jha, Soma Paul

The paper abandons a strict two-way sub-classification of intransitive verbs into unaccuasative and unergative for Hindi and proposes a distribution plotting of the same in a diffusion chart. The diagnostics tests that Bhatt (2003) applied on Hindi data are ranked for their efficiency of attributing correct sub-class to verbs. The diffusion chart shows that a tripartite classification handles the issue of classification of intransitive verbs in a better manner than the classical binary approach. The tripartite classification is as follows: (1) Verbs that take animate subject and are compatible with adverb of volitionality; (2) Verbs that take animate subject but are not compatible with adverb of volitionality; and (3) Verbs that take inanimate subject. The classification is of immense advantage for various NLP tasks such as machine translation, natural language generation.

本文放弃了印度语中不及物动词严格的双向子分类，即非准确动词和非否定动词，并在扩散图中提出了它们的分布图。Bhatt(2003)在印地语数据上应用的诊断测试因其为动词赋予正确子类的效率而排名。扩散图表明，与经典的二元分类方法相比，三元分类方法更好地处理了不及物动词的分类问题。这三方面的分类是:(1)带有动性主语并与意志性副词相容的动词;(2)带有动性主语但与意志性副词不相容的动词;(3)主语为无生命主语的动词。该分类对于机器翻译、自然语言生成等各种NLP任务具有巨大的优势。

引用次数: 3

Acoustic Space in Motor Disorders of Speech: Two Case Studies 言语运动障碍的声空间:两个个案研究

2011 International Conference on Asian Language Processing

Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.25

Vaishna Narang, Deepshikha Misra, Garima Dalal

Studies on acoustic space have strengthened the view that vowels are acoustically and perceptually defined in terms of their relative positioning in vowel space. Every speaker identifies an optimal vowel space within which perceptual, phonological contrast is maintained. This is an interdisciplinary study involving speech pathology, physics of speech and neurology of speech. Two case studies of dysarthria presented in this paper are -- one Parkinson's disease and one case of acute ischemic stroke with age-gender-language matched controls. A detailed acoustic analysis shows how acoustic space gets considerably reduced, in both PD and stroke, and in these two very different kinds of dysarthrias the acoustic space is also modified very differently. The study also examines the third formant to show that the higher formants are consistently lowered in both PD and stroke. Hypokinetic speech production in these cases is reflected in lower intensity. The results have significant applications in clinical acoustics and in the theoretical fields of neurology of speech, linguistics and phonology.

声学空间的研究强化了元音在声学和感知上的定义，即元音在元音空间中的相对位置。每个说话者都确定一个最佳的元音空间，在这个空间内保持感知和语音的对比。这是一项涉及语言病理学、语言物理学和语言神经学的跨学科研究。本文介绍了两个构音障碍的病例研究，一个是帕金森病，一个是急性缺血性中风，对照组为年龄、性别、语言匹配。详细的声学分析表明，在帕金森病和中风中，声学空间是如何大大减少的，在这两种截然不同的构音障碍中，声学空间的变化也非常不同。该研究还检查了第三峰，表明在帕金森病和中风中，较高的峰持续降低。在这些情况下，低动力的言语产生反映在较低的强度上。研究结果在临床声学以及语音神经学、语言学和音系学等理论领域具有重要的应用价值。

引用次数: 1

首页上一页

下一页尾页

类型

全部化学•材料生命科学医学物理工程技术环境•农林材料科学地球科学法学管理学化学环境科学与生态学计算机科学教育学经济学农林科学人文科学生物学数学物理与天体物理心理学综合性期刊其他工业工程理学历史学农学文学信息工程

数据库

全部 ACS Publications Elsevier ieeexplore Springer The Royal Society of Chemistry Wiley

期刊

2011 International Conference on Asian Language Processing

全部 Acc. Chem. Res. ACS Applied Bio Materials ACS Appl. Electron. Mater. ACS Appl. Energy Mater. ACS Appl. Mater. Interfaces ACS Appl. Nano Mater. ACS Appl. Polym. Mater. ACS BIOMATER-SCI ENG ACS Catal. ACS Cent. Sci. ACS Chem. Biol. ACS Chemical Health & Safety ACS Chem. Neurosci. ACS Comb. Sci. ACS Earth Space Chem. ACS Energy Lett. ACS Infect. Dis. ACS Macro Lett. ACS Mater. Lett. ACS Med. Chem. Lett. ACS Nano ACS Omega ACS Photonics ACS Sens. ACS Sustainable Chem. Eng. ACS Synth. Biol. Anal. Chem. BIOCHEMISTRY-US Bioconjugate Chem. BIOMACROMOLECULES Chem. Res. Toxicol. Chem. Rev. Chem. Mater. CRYST GROWTH DES ENERG FUEL Environ. Sci. Technol. Environ. Sci. Technol. Lett. Eur. J. Inorg. Chem. IND ENG CHEM RES Inorg. Chem. J. Agric. Food. Chem. J. Chem. Eng. Data J. Chem. Educ. J. Chem. Inf. Model. J. Chem. Theory Comput. J. Med. Chem. J. Nat. Prod. J PROTEOME RES J. Am. Chem. Soc. LANGMUIR MACROMOLECULES Mol. Pharmaceutics Nano Lett. Org. Lett. ORG PROCESS RES DEV ORGANOMETALLICS J. Org. Chem. J. Phys. Chem. J. Phys. Chem. A J. Phys. Chem. B J. Phys. Chem. C J. Phys. Chem. Lett. Analyst Anal. Methods Biomater. Sci. Catal. Sci. Technol. Chem. Commun. Chem. Soc. Rev. CHEM EDUC RES PRACT CRYSTENGCOMM Dalton Trans. Energy Environ. Sci. ENVIRON SCI-NANO ENVIRON SCI-PROC IMP ENVIRON SCI-WAT RES Faraday Discuss. Food Funct. Green Chem. Inorg. Chem. Front. Integr. Biol. J. Anal. At. Spectrom. J. Mater. Chem. A J. Mater. Chem. B J. Mater. Chem. C Lab Chip Mater. Chem. Front. Mater. Horiz. MEDCHEMCOMM Metallomics Mol. Biosyst. Mol. Syst. Des. Eng. Nanoscale Nanoscale Horiz. Nat. Prod. Rep. New J. Chem. Org. Biomol. Chem. Org. Chem. Front. PHOTOCH PHOTOBIO SCI PCCP Polym. Chem.

﹀