
2013 International Conference on Asian Language Processing: Latest Publications

An Improved MMSE-LSA Speech Enhancement Algorithm Based on Human Auditory Masking Property
Pub Date : 2013-08-17 DOI: 10.1109/IALP.2013.49
Yong Zhang, Yi Y. Liu
An improved speech enhancement algorithm based on the minimum mean square error log-spectral amplitude (MMSE-LSA) estimator and the masking property of the human auditory system is proposed in this paper. The short-time spectral amplitude is estimated with the MMSE-LSA estimator, residual musical noise is masked by exploiting the masking properties of the human auditory system, and a psychoacoustically motivated weighting filter is designed. The performance of the proposed algorithm is tested under various types of noise and at different SNR levels; the results show that it outperforms the conventional algorithm and that the residual musical noise is effectively masked.
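The improved filter builds on the classical MMSE-LSA gain of Ephraim and Malah; the masking-based weighting itself is not spelled out in the abstract. As a point of reference only, here is a minimal sketch of the standard MMSE-LSA gain applied to one noisy spectrum (the SNR estimates and array sizes are placeholders, not values from the paper).

```python
import numpy as np
from scipy.special import exp1  # exponential integral E1

def mmse_lsa_gain(xi, gamma):
    """Classical MMSE-LSA spectral gain (Ephraim-Malah):
    G = xi/(1+xi) * exp(0.5 * E1(v)), with v = xi*gamma/(1+xi)."""
    v = (xi / (1.0 + xi)) * gamma
    return (xi / (1.0 + xi)) * np.exp(0.5 * exp1(v))

# Placeholder spectra: |Y(k)| for one frame and a noise PSD estimate.
noisy_mag = np.abs(np.random.randn(257)) + 1.0
noise_psd = np.ones(257)
gamma = noisy_mag ** 2 / noise_psd            # a posteriori SNR
xi = np.maximum(gamma - 1.0, 1e-3)            # crude a priori SNR estimate
enhanced_mag = mmse_lsa_gain(xi, gamma) * noisy_mag
```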
Citations: 2
Implementation of Chinese-Uyghur Bilateral EBMT System
Pub Date : 2013-08-17 DOI: 10.1109/IALP.2013.26
Kahaerjiang Abiderexiti, Tianfang Yao, Tuergen Yibulayin, Aishan Wumaier, Yasen Yiming
This work makes a first attempt at Chinese-Uyghur bilateral machine translation (MT) using an example-based approach. We have developed a Chinese-Uyghur bilateral EBMT system by constructing a bilingual corpus, implementing its storage format, developing a similarity computation algorithm, and recombining similar sentences. Finally, we evaluate the system with human judges; the experimental results show that there is still room to improve the translation quality of our system.
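The abstract does not describe the similarity computation algorithm itself. A common choice in EBMT retrieval is an edit-distance-based similarity over token sequences; the sketch below illustrates that generic approach (the function names and retrieval helper are hypothetical, not taken from the paper).

```python
def sentence_similarity(src, example):
    """Edit-distance-based similarity between two token sequences
    (1.0 = identical, 0.0 = completely different)."""
    m, n = len(src), len(example)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if src[i - 1] == example[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return 1.0 - d[m][n] / max(m, n, 1)

def retrieve_best_example(src, example_pairs):
    """Return the (source, target) pair whose source side is most similar."""
    return max(example_pairs, key=lambda pair: sentence_similarity(src, pair[0]))
```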
Citations: 2
Judgment, Extraction and Selective Restriction of Chinese Eventive Verb
Pub Date : 2013-08-17 DOI: 10.1109/IALP.2013.9
Mengxiang Wang, Houfeng Wang, Longkai Zhang
Eventive verbs are a small class of Chinese verbs. Compared with other general verbs, they have different syntactic and semantic features. This article defines as eventive verbs those verbs that can follow a VP and must describe a complete event with the help of other verbs or an implicit verb. On this basis, it gives formal decision criteria for eventive verbs and provides a quantitative and qualitative analysis of their internal connections and selectional restrictions, which is useful for setting rules for verb collocation.
Citations: 1
Feature Abstraction for Lightweight and Accurate Chinese Word Segmentation
Pub Date : 2013-08-17 DOI: 10.1109/IALP.2013.65
Le Tian, Xipeng Qiu, Xuanjing Huang
Chinese word segmentation (CWS) is an important and necessary step in analyzing Chinese texts. State-of-the-art CWS systems are mostly based on sequence labeling algorithms and use discriminative models with millions of overlapping binary features. However, little work has been done on porting these systems to devices with limited computing capacity and memory. In this paper, we focus on two challenges in Chinese word segmentation: (1) low accuracy on out-of-vocabulary words and (2) a huge feature space. To address these two problems, we propose a method to abstract the original input at both the character and feature levels: we group "similar" features to generate a more abstract representation. Experimental results show that feature abstraction can greatly reduce the feature space while achieving comparable performance.
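The grouping scheme for "similar" features is not specified in the abstract. As one illustration of character-level abstraction, the sketch below collapses digits, Latin letters, and punctuation into coarse classes before generating standard CWS n-gram features; the class names and templates are hypothetical and only indicate the general idea.

```python
import unicodedata

def char_class(ch):
    """Map a character to a coarse abstract class (hypothetical scheme):
    digits, Latin letters, and punctuation collapse to single symbols,
    other characters (e.g. hanzi) are kept as-is."""
    if ch.isdigit():
        return "<NUM>"
    if ch.isascii() and ch.isalpha():
        return "<LAT>"
    if unicodedata.category(ch).startswith("P"):
        return "<PUN>"
    return ch

def abstracted_features(sentence, i):
    """Standard unigram/bigram CWS templates around position i,
    built over abstracted characters instead of raw ones."""
    chars = ["<B>"] + [char_class(c) for c in sentence] + ["<E>"]
    j = i + 1  # offset for the padding symbol
    return [
        f"U-1={chars[j-1]}", f"U0={chars[j]}", f"U+1={chars[j+1]}",
        f"B-1={chars[j-1]}{chars[j]}", f"B0={chars[j]}{chars[j+1]}",
    ]
```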
Citations: 0
Uyghur Stem-Suffix Segmentation and POS-Tagging Based on Functional Suffixes
Pub Date : 2013-08-17 DOI: 10.1109/IALP.2013.72
Haibo Wang, Yiqing Zu, Lang Wu, Litip Tohti
As an agglutinative language, Uyghur has rich functional suffixes. This study uses functional suffixes (including suffix strings) to define a POS inventory for stem-suffix segmentation and word POS tagging, based on Prof. Litip Tohti's studies on Uyghur generative syntax. To test the feasibility of the functional suffix definition system, we conducted stem-suffix segmentation on a large text corpus (270,000 words, without repetition), which yielded about 14,100 of the most frequently used suffix strings. This limited number of suffix strings can reflect the huge number of Uyghur syntactic components. Following Prof. Litip Tohti's study, we tag each suffix string as one unit and tag the words in the corpus based on the suffix instead of the stem. This is a new attempt, and we hope it will improve the performance of Uyghur language processing.
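The segmentation procedure itself is not given in the abstract. One simple way to use a suffix-string inventory is greedy longest-suffix matching, sketched below with a toy romanized inventory (the helper name and example suffixes are illustrative, not the paper's 14,100-string inventory).

```python
def split_stem_suffix(word, suffix_inventory):
    """Greedy longest-match split of a word into (stem, suffix_string).
    Returns (word, "") when no known suffix string matches."""
    for cut in range(1, len(word)):           # leave at least one char for the stem
        candidate = word[cut:]                # longest candidates come first
        if candidate in suffix_inventory:
            return word[:cut], candidate
    return word, ""

# Toy romanized inventory, purely illustrative.
inventory = {"lar", "larni", "ni", "da", "din"}
print(split_stem_suffix("kitablarni", inventory))   # -> ('kitab', 'larni')
```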
Citations: 0
Application of Tucker Decomposition in Speech Signal Feature Extraction
Pub Date : 2013-08-17 DOI: 10.1109/IALP.2013.50
Lidong Yang, Jing Wang, Xiang Xie, Jingming Kuang
Speech signal feature extraction is an important part of a speech recognition system. We present Tucker decomposition for extracting speech features. First, the preprocessed speech signal is decomposed via a three-level wavelet transform, and information at different scales is obtained. Next, conventional feature parameters are extracted at the different scales, and a 3-order speech tensor (frames, scales, feature parameters) is created. Then, the tensor is decomposed by Tucker decomposition, and projection matrices for the different modes are obtained. Third, the matrix product between the speech tensor and the projection matrix of each mode is performed, and the mapped results are matricized. Finally, a feature system in a high-order space is built; in other words, speech feature matrices are obtained. The feature system can fully express speech signal features, and these matrices can be used for model training and speech recognition. Numerical experiments support the advantage of Tucker decomposition over conventional methods for speech signal feature extraction; furthermore, it is robust to noisy speech.
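The exact decomposition pipeline is not reproducible from the abstract alone. A standard way to obtain a Tucker model is the truncated higher-order SVD (HOSVD), sketched below for a hypothetical frames x scales x features tensor; the ranks and dimensions are placeholders.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: move `mode` to the front and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def hosvd(tensor, ranks):
    """Truncated HOSVD, one standard way to compute a Tucker model.
    Returns the core tensor and one projection (factor) matrix per mode."""
    factors = []
    for mode, r in enumerate(ranks):
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :r])                 # projection matrix for this mode
    core = tensor
    for mode, u in enumerate(factors):           # mode-n product: core x_n U^T
        core = np.moveaxis(np.tensordot(u.T, core, axes=(1, mode)), 0, mode)
    return core, factors

# Hypothetical 3-order speech tensor: frames x wavelet scales x feature dims.
X = np.random.randn(200, 3, 13)
core, factors = hosvd(X, ranks=(30, 3, 8))
feature_matrix = unfold(core, 0)                 # matricized core as compact features
```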
Citations: 2
The Frame Design of Mongolian Noun Semantic Network
Pub Date : 2013-08-17 DOI: 10.1109/IALP.2013.73
Hasi, Enbo Tang
A lexical semantic network is one of the important ways to express lexical semantics and plays an important role in semantic analysis, semantic disambiguation, semantic reasoning, semantic summarization, and so on. With the deepening of semantic processing, research on Mongolian information processing urgently needs the support of a lexical semantic network. Drawing on the construction theory of the Chinese and English WordNets, this research proposes a frame design method for a Mongolian noun semantic network that is compatible with WordNet, and basically realizes the representation of Mongolian synonym sets and the construction of semantic relations between these sets. This article mainly introduces the construction method for the semantic relations in the Mongolian noun semantic network.
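The frame design itself is only outlined in the abstract. A minimal WordNet-style data structure for synonym sets with typed semantic relations might look like the sketch below; the class, identifiers, and toy lemmas are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Synset:
    """A WordNet-style synonym set carrying typed semantic relations."""
    synset_id: str
    lemmas: list                                   # surface word forms in the set
    gloss: str = ""
    relations: dict = field(default_factory=dict)  # relation name -> [Synset, ...]

    def add_relation(self, name, target):
        self.relations.setdefault(name, []).append(target)

# Toy example with romanized Mongolian lemmas (illustrative only).
animal = Synset("n00001", ["amitan"], "living creature")
horse = Synset("n00002", ["mori"], "large hoofed animal")
horse.add_relation("hypernym", animal)
animal.add_relation("hyponym", horse)
```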
Citations: 3
Information Retrieval Model Combining Sentence Level Retrieval
Pub Date : 2013-08-17 DOI: 10.1109/IALP.2013.76
Jiali Zuo, Mingwen Wang, Jianyi Wan, Wenbing Luo
To get better performance, some researchers have proposed related work that exploits the position and proximity information of query terms in the language model. However, these models need a large quantity of training data, and their computational complexity is comparatively high. This paper presents an information retrieval model that incorporates sentence-level retrieval, using the sentence as the unit for computing its degree of relevance to the query. Experimental results show that our model achieves better performance than the baseline models.
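The abstract does not give the combination formula. One common way to combine document-level and sentence-level evidence is linear interpolation with the best-matching sentence, sketched below with a placeholder term-overlap scoring function (the weight and names are illustrative assumptions, not the paper's model).

```python
def combined_score(query_terms, doc_terms, sentences, lam=0.6):
    """Interpolate a document-level score with the best sentence-level score.
    `overlap_score` is only a placeholder for a real retrieval model."""
    def overlap_score(unit_terms):
        return sum(unit_terms.count(t) for t in query_terms) / (len(unit_terms) + 1.0)

    doc_part = overlap_score(doc_terms)
    sent_part = max((overlap_score(s) for s in sentences), default=0.0)
    return lam * doc_part + (1.0 - lam) * sent_part

# Usage: documents and sentences are token lists.
doc = "a retrieval model that scores whole sentences".split()
sents = [s.split() for s in ("a retrieval model", "that scores whole sentences")]
print(combined_score(["retrieval", "model"], doc, sents))
```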
Citations: 2
Combination of ROSVM and LR for Spam Filter
Pub Date : 2013-08-17 DOI: 10.1109/IALP.2013.60
Yadong Wang, Haoliang Qi, Hong Deng, Yong Han
Spam filtering benefits from two state-of-the-art discriminative models: Logistic Regression (LR) and the Relaxed Online Support Vector Machine (ROSVM). It is natural that the two models reach their optimal performance after different numbers of training examples. We present a combination model that integrates LR and ROSVM into a unified one, dividing the training process into two phases. In the first phase, LR is used as the filtering model to train and predict, while ROSVM learns from the correct labels at the same time. In the second phase, ROSVM is used as the filtering model, starting from a switching point found in experiments. Experimental results on the public data sets (TREC06-c, TREC06-p, TREC07-p) show that the combination of the ROSVM and LR spam filters gives better performance than either the LR filter or the ROSVM filter under immediate feedback.
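ROSVM has no off-the-shelf implementation in common libraries, so the sketch below only approximates the two-phase scheme with scikit-learn's online SGD classifiers: logistic loss stands in for LR, hinge loss for the SVM side, and the switching point is a hypothetical value rather than the one found in the paper's experiments.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

# Online logistic regression stands in for LR; an online hinge-loss SGD
# classifier stands in for ROSVM (which scikit-learn does not provide).
lr = SGDClassifier(loss="log_loss")    # use loss="log" on older scikit-learn
svm = SGDClassifier(loss="hinge")
SWITCH_POINT = 5000                    # hypothetical phase boundary
CLASSES = np.array([0, 1])             # ham / spam

def filter_stream(messages, labels, vectorize):
    """Two-phase online filtering with immediate feedback.
    `vectorize` is assumed to map a message to a 1-D feature array."""
    predictions = []
    for i, (msg, y) in enumerate(zip(messages, labels)):
        x = vectorize(msg).reshape(1, -1)
        model = lr if i < SWITCH_POINT else svm
        if i > 0:                      # both models are fitted after step 0
            predictions.append(int(model.predict(x)[0]))
        lr.partial_fit(x, [y], classes=CLASSES)    # both learn from the true label
        svm.partial_fit(x, [y], classes=CLASSES)
    return predictions
```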
Citations: 0
A Study on Japanese Students (Intermediate Level) Reading Chinese Texts With or Without Marks for Word Boundaries
Pub Date : 2013-08-17 DOI: 10.1109/IALP.2013.38
Yu Peng
By comparing eye-movement data from Japanese students reading Chinese texts with and without marks for word boundaries, the researcher found that: (1) to Japanese students, Chinese words have more psychological reality than Chinese characters; and (2) the reading efficiency of Japanese students can be greatly improved by inserting word boundary marks. It is therefore suggested that, when publishing Chinese textbooks intended for Japanese students, current typesetting practice be adjusted to insert word boundary marks into the text.
Citations: 0