Polarity detection of Turkish comments on technology companies
Gözde Gül Şahin, Harun Resit Zafer, E. Adali
Pub Date: 2014-12-04 | DOI: 10.1109/IALP.2014.6973514
In this study, comments about technology brands are collected from a popular Turkish website, eksisözlük, and classified as positive or negative. The Turkish text is preprocessed with different kinds of filters and then modeled with 1-gram, 2-gram, and 3-gram language models. Naive Bayes (NB), Support Vector Machine (SVM), and K-nearest-neighbor (KNN) classifiers are applied to different configurations of preprocessing techniques, language models, and linguistic attributes for comparison. The best F-measure on our test dataset was 0.696.
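The pipeline described, n-gram features fed to a Naive Bayes classifier, can be sketched in plain Python. This is a minimal sketch only: the paper's preprocessing filters, the SVM/KNN variants, and the actual eksisözlük data are omitted, and the toy Turkish tokens below are purely illustrative.

```python
from collections import Counter
import math

def ngrams(tokens, n_max=2):
    """Extract 1-gram and 2-gram features from a token list."""
    feats = list(tokens)
    for n in range(2, n_max + 1):
        feats += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return feats

def train_nb(docs):
    """docs: list of (tokens, label). Returns per-class feature counts,
    per-class totals, class priors, and the feature vocabulary."""
    counts, totals, priors, vocab = {}, Counter(), Counter(), set()
    for tokens, label in docs:
        priors[label] += 1
        c = counts.setdefault(label, Counter())
        for f in ngrams(tokens):
            c[f] += 1
            totals[label] += 1
            vocab.add(f)
    return counts, totals, priors, vocab

def classify(tokens, counts, totals, priors, vocab):
    """Pick the class with the highest log-posterior under add-one smoothing."""
    n_docs = sum(priors.values())
    best, best_lp = None, float("-inf")
    for label in priors:
        lp = math.log(priors[label] / n_docs)
        for f in ngrams(tokens):
            lp += math.log((counts[label][f] + 1) / (totals[label] + len(vocab)))
        if lp > best_lp:
            best, best_lp = label, lp
    return best
```

In a real run the tokens would come out of the paper's filter chain rather than being hand-supplied.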
A comparative study of nominal predicate sentences (NPS) and SHI (be) sentences
Li-jun Zhang
Pub Date: 2014-12-04 | DOI: 10.1109/IALP.2014.6973481
Nominal predicate sentences (NPS) in Chinese can be divided into two categories: NPS1 and NPS2. NPS1 sentences, which can be transformed into SHI (be) sentences, are mainly assertive and static, whereas NPS2 sentences, which cannot, are mainly descriptive, declarative, and dynamic. There are two main differences between NPS and SHI sentences: style and the degree of prominence of the information focus.
Improving malt dependency parser using a simple grammar-driven unlexicalised dependency parser
Anil Krishna Eragani, V. Kuchibhotla
Pub Date: 2014-12-04 | DOI: 10.1109/IALP.2014.6973482
In this paper, we present an approach to integrating unlexicalised grammatical features into the Malt dependency parser. Malt is a lexicalised parser and, like every lexicalised parser, is prone to data sparseness. We address this problem by providing features from an unlexicalised parser, since, in contrast to lexicalised parsers, unlexicalised parsers are known for their robustness. We build a simple unlexicalised grammatical parser with POS tag sequences as grammar rules and use its output as additional features for Malt. We achieved improvements of about 0.17-0.30% (UAS) over state-of-the-art Malt results on both English and Hindi.
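The abstract does not spell out the exact form of its POS-sequence grammar rules. As one plausible illustration only, the rules could be read as (head POS, dependent POS, direction) triples harvested from a treebank and exposed to Malt as binary arc features:

```python
def extract_rules(treebank):
    """treebank: list of sentences; each sentence is a list of
    (pos, head_index) tuples with 1-indexed heads, 0 = root.
    A 'rule' here is a (head_pos, dep_pos, direction) triple."""
    rules = set()
    for sent in treebank:
        for i, (pos, head) in enumerate(sent, start=1):
            if head == 0:
                continue  # skip the root attachment
            head_pos = sent[head - 1][0]
            direction = "L" if i < head else "R"  # dependent left/right of head
            rules.add((head_pos, pos, direction))
    return rules

def arc_feature(head_pos, dep_pos, direction, rules):
    """Binary feature: is this candidate arc licensed by the POS grammar?"""
    return 1 if (head_pos, dep_pos, direction) in rules else 0
```

Such binary features would be appended to Malt's existing lexicalised feature model; the triple format is an assumption, not the paper's specification.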
Building an Indonesian rule-based part-of-speech tagger
Rashel Fam, A. Luthfi, A. Dinakaramani, R. Manurung
Pub Date: 2014-12-04 | DOI: 10.1109/IALP.2014.6973521
This paper describes work on a rule-based part-of-speech tagger for the Indonesian language. The system tokenizes documents, taking multi-word expressions into account, and recognizes named entities. It then assigns tags to every token, starting with closed-class words and moving on to open-class words, and disambiguates the tags with a set of manually defined rules. The system currently obtains an accuracy of 79% on a manually tagged corpus of roughly 250,000 tokens.
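The tag-then-disambiguate flow can be sketched as follows. The Indonesian lexicon entries and the single disambiguation rule below are invented placeholders, not the paper's actual lexicon or rule set:

```python
CLOSED_CLASS = {"di": "PREP", "yang": "REL", "dan": "CONJ"}  # hypothetical entries
LEXICON = {"makan": {"VB", "NN"}, "buku": {"NN"}}            # open-class candidates

def tag(tokens):
    """Tag closed-class words first, then open-class words,
    then resolve remaining ambiguity with hand-written rules."""
    tags = []
    for tok in tokens:
        if tok in CLOSED_CLASS:
            tags.append(CLOSED_CLASS[tok])          # closed class: unambiguous
        else:
            tags.append(LEXICON.get(tok, {"NN"}))   # unknown words default to NN
    for i, t in enumerate(tags):
        if isinstance(t, set):
            # illustrative rule: a VB/NN-ambiguous word after a noun reads as a verb
            if t == {"VB", "NN"} and i > 0 and tags[i - 1] == "NN":
                tags[i] = "VB"
            else:
                tags[i] = sorted(t)[0]              # fall back to a fixed choice
    return tags
```

The real system additionally handles multi-word expressions and named entities before this step.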
A bottom-up method for analyzing the domain of a sentence group
Xiangfeng Wei, Quan Zhang, Yi Yuan, Zhejie Chi
Pub Date: 2014-12-04 | DOI: 10.1109/IALP.2014.6973473
A sentence group is a linguistic unit between a sentence and an article. It is mapped onto the contextual element, one of the four layers of the linguistic conceptual space in HNC (Hierarchical Network of Concepts) theory. The contextual element is composed of three components: domain, situation, and background, with domain at the head. To extract the domain and situation of a sentence group, this paper proposes a bottom-up method: it extracts domain-related conceptual symbols from words, then obtains the domain of a sentence from the frequencies of the domain-related words and their semantic roles in the sentence, and finally obtains the domain of a sentence group, along with its boundary, by merging sentences with the same domain. Experiments show that this method handles some types of sentence groups well in a real corpus. However, many details remain to be studied in extracting the domain, determining the boundary of a sentence group, and extracting the framework of a sentence group.
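The bottom-up procedure, scoring domains from domain-related words and then merging adjacent same-domain sentences, might look roughly like this. The lexicon and the uniform role weights are assumptions, and HNC conceptual symbols are reduced to plain strings:

```python
from collections import Counter

DOMAIN_LEXICON = {  # hypothetical word -> domain-related concept symbol
    "goalkeeper": "sports", "match": "sports",
    "stock": "finance", "dividend": "finance",
}

def sentence_domain(words, role_weight=None):
    """Score candidate domains by frequency of domain-related words;
    semantic-role weights (if supplied) boost words in prominent roles."""
    role_weight = role_weight or {}
    scores = Counter()
    for w in words:
        if w in DOMAIN_LEXICON:
            scores[DOMAIN_LEXICON[w]] += role_weight.get(w, 1)
    return scores.most_common(1)[0][0] if scores else None

def group_sentences(sentences):
    """Merge adjacent sentences sharing a domain into sentence groups;
    the merge boundaries are the sentence-group boundaries."""
    groups = []
    for sent in sentences:
        d = sentence_domain(sent)
        if groups and groups[-1][0] == d:
            groups[-1][1].append(sent)
        else:
            groups.append((d, [sent]))
    return groups
```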
A novel query expansion method for military news retrieval service
Liang-Chu Chen, Wen-Tsan Chao, Chia-Jung Hsieh
Pub Date: 2014-12-04 | DOI: 10.1109/IALP.2014.6973491
Since most search engines retrieve documents strictly by keywords, they cannot find content that is similar in idea but different in wording. Semantic query expansion is therefore very important, and ontology is a critical foundation for supporting it. Ontologies have been used in information retrieval, data categorization, library science, and medical science; however, their use is rare in the military domain. This research has two purposes. The first is to use a “Military Dictionary” database as a foundation and combine it with formal concept analysis to automatically construct the relationships between a military ontology and vocabulary concepts. The second is to use military news from the “Defense Technology Military Database” as training data, to design a novel query expansion method, the Keyword to Formal Concept Query Expansion (K2FCQE) algorithm, and then to verify the query mode. The results verify that K2FCQE is more efficient than other query expansion methods.
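The K2FCQE algorithm itself is not detailed in the abstract, but the formal-concept-analysis core it builds on, closing a keyword set under the derivation operators, can be sketched with a toy incidence relation. The documents and terms below are invented:

```python
# Toy incidence relation: document -> set of index terms (hypothetical data)
INCIDENCE = {
    "doc1": {"missile", "radar", "defense"},
    "doc2": {"missile", "radar"},
    "doc3": {"radar", "satellite"},
}

def extent(terms):
    """Derivation operator A': documents containing every term in the set."""
    return {d for d, t in INCIDENCE.items() if terms <= t}

def intent(docs):
    """Derivation operator O': terms shared by every document in the set."""
    if not docs:
        return set()
    return set.intersection(*(INCIDENCE[d] for d in docs))

def expand_query(keywords):
    """Close the keyword set under the two operators: the expanded
    query is the intent of the extent of the keywords."""
    return intent(extent(set(keywords)))
```

The pair (extent, intent) of the closed set is a formal concept; the extra terms it brings in are exactly those that co-occur with the keyword in every matching document.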
A maximum entropy based reordering model for Mongolian-Chinese SMT with morphological information
Zhenxin Yang, Miao Li, Zede Zhu, Lei Chen, Linyu Wei, Shaoqi Wang
Pub Date: 2014-12-04 | DOI: 10.1109/IALP.2014.6973484
The different word order of Mongolian and Chinese and the scarcity of parallel corpora are the main problems in Mongolian-Chinese statistical machine translation (SMT). We propose a method that adopts morphological information as features of a maximum entropy based phrase reordering model for Mongolian-Chinese SMT. Taking advantage of Mongolian morphology, we add the Mongolian stem and affix as phrase boundary information and use a maximum entropy model to predict the reordering of neighboring blocks. To some extent, this alleviates the reordering problems caused by data sparseness. In addition, we add part-of-speech (POS) tags as further features in the reordering model. Experiments show that the approach outperforms a maximum entropy model using only boundary-word information and provides a maximum improvement of 0.8 BLEU points over the baseline.
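A sketch of the feature extraction the abstract describes: boundary words decomposed into stem and affix, plus POS tags, feeding a maximum entropy classifier that is not shown here. The '+' morpheme-boundary convention and the Mongolian examples are assumptions:

```python
def split_morph(word):
    """Toy stem/affix split; a real system would use a Mongolian
    morphological analyser (this '+' convention is an assumption)."""
    stem, _, affix = word.partition("+")
    return stem, affix

def reorder_features(left_block, right_block, pos_left, pos_right):
    """Features for the reordering model: boundary words of the two
    neighboring blocks, their stems and affixes, and their POS tags."""
    feats = []
    for name, block, pos in (("L", left_block, pos_left),
                             ("R", right_block, pos_right)):
        for side, w in (("first", block[0]), ("last", block[-1])):
            stem, affix = split_morph(w)
            feats.append(f"{name}.{side}.stem={stem}")
            if affix:
                feats.append(f"{name}.{side}.affix={affix}")
        feats.append(f"{name}.first.pos={pos[0]}")
        feats.append(f"{name}.last.pos={pos[-1]}")
    return feats
```

The classifier would map these features to a monotone/swap decision for the block pair.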
A computer-aided Chinese pronunciation training program for English-speaking learners
Yi Qin, Guonian Wang
Pub Date: 2014-12-04 | DOI: 10.1109/IALP.2014.6973499
This paper introduces a pilot study on incorporating effective teaching methods into computer-aided pronunciation training (CAPT) programs to help English-speaking learners acquire Mandarin lexical tones using speech analysis software. The study shows that CAPT programs help learners identify relevant acoustic cues and discern the four tones of Chinese, which they find hard to differentiate and imitate. Acoustic analyses comparing pre- and post-training productions, in the form of visual displays of speakers' pitch curves, reveal the nature of the learners' improvement. The acoustic images also indicate that post-training tone curves approximate native norms more closely than pre-training ones. The methodology developed here may provide a platform for more efficient Chinese learning.
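The pre/post comparison reduces to measuring how far a learner's pitch contour sits from a native reference. A minimal sketch, assuming contours have already been extracted by the speech analysis software and sampled at matching time points:

```python
import math

def rmse(contour_a, contour_b):
    """Root-mean-square distance between two equal-length pitch tracks (Hz)."""
    assert len(contour_a) == len(contour_b)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(contour_a, contour_b))
                     / len(contour_a))

def improvement(native, pre, post):
    """Positive when the post-training contour is closer to the native norm."""
    return rmse(native, pre) - rmse(native, post)
```

A fuller treatment would normalize for speaker pitch range (e.g., work in semitones) before comparing; that step is omitted here.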
Imperative sentences with assertive mood
Pu Li, Hao Zhao
Pub Date: 2014-12-04 | DOI: 10.1109/IALP.2014.6973478
Imperative sentences with assertive mood (ISAM), positioned between typical declarative sentences and typical imperative sentences, have the form of declarative sentences but perform the function of imperative sentences. They are characterized by verbs indicating action classification, called “performative verbs”. The paper first explains, on the basis of their formation, why an imperative sentence with assertive mood can perform an imperative function, and shows how such a sentence transforms from a declarative into an imperative sentence. Second, performative verbs are categorized and their distances from the imperative function are presented. Finally, based on the elements attached to the performative verbs, ISAM are classified into four categories.
Acoustic model merging using acoustic models from multilingual speakers for automatic speech recognition
T. Tan, L. Besacier, B. Lecouteux
Pub Date: 2014-12-04 | DOI: 10.1109/IALP.2014.6973492
Many studies have explored the use of existing multilingual speech corpora to build an acoustic model for a target language. These works on multilingual acoustic modeling often use multilingual acoustic models to create an initial model, which is often suboptimal for decoding speech of the target language; some speech of the target language is then used to adapt and improve it. In this paper, however, we investigate multilingual acoustic modeling for enhancing an acoustic model of the target language in an automatic speech recognition system. The proposed approach employs context-dependent acoustic model merging from a source language to adapt the acoustic model of a target language, where the source and target language speech are spoken by speakers from the same country. Our experiments on Malay and English automatic speech recognition show relative improvements in WER from 2% to about 10% when the multilingual acoustic model is employed.
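The reported 2-10% relative WER improvements rest on the standard word error rate computation, which can be sketched as:

```python
def wer(ref, hyp):
    """Word error rate: Levenshtein distance over word lists,
    normalized by the reference length."""
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # deletions
    for j in range(len(hyp) + 1):
        d[0][j] = j  # insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / len(ref)

def relative_improvement(wer_baseline, wer_new):
    """Relative WER reduction, the metric behind the paper's 2-10% range."""
    return (wer_baseline - wer_new) / wer_baseline
```

For example, a baseline WER of 25% dropping to 22.5% is a 10% relative improvement.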