
2011 International Conference on Asian Language Processing: Latest Publications

Linguistic Competency Model for Intentional Agent
Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.71
Sivakumar Ramakrishnan, V. Mohanan
The Linguistic Competency Model (LCM) is a new and extensive discourse-analysis framework for an Intentional Intelligent Agent. The model explicitly explains the role of linguistic competency in discourse and incorporates intention as an embodied entity in the cognitive process of discourse. It treats hermeneutical process grammar as the instrument of linguistic-competency skill in the social interactive structure of discourse. The LCM establishes a discourse structure that can semantically interpret and hermeneutically analyze the psychological temporal semantic-pragmatic linkages and movement of discourse. The embedded structure of physical spatio-temporal contiguity of verbal or event linkages can also be systematically associated with psychological temporal semantic-pragmatic movement in the LCM. The model therefore offers new insight into the composition and characteristics of hermeneutic discourse, especially into defining the linguistic competency within it.
Citations: 1
Search Results Clustering Based on a Linear Weighting Method of Similarity
Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.72
Dequan Zheng, Haibo Liu, T. Zhao
Clustering search results helps users find what they need in massive amounts of information, but traditional text clustering has been shown to perform poorly on this task. The Lingo algorithm, which adopts LSI for clustering, first generates candidate labels, then assigns documents, and finally forms the clusters. Building on Lingo, this paper presents a linearly weighted improvement of Single-Pass clustering that integrates HowNet semantic similarity with cosine similarity, fuses and rediscovers clusters, and extracts the cluster labels. Experiments show that the method achieves good clustering results in terms of purity and F-measure.
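The linear weighting described in the abstract can be sketched as follows; the weight `alpha` and the stand-in `sem_sim` callable are illustrative assumptions (the paper uses a HowNet-based semantic similarity, which is not reproduced here):

```python
import math

def cosine_sim(a, b):
    """Cosine similarity between two term-frequency dicts."""
    shared = set(a) & set(b)
    num = sum(a[t] * b[t] for t in shared)
    den = math.sqrt(sum(v * v for v in a.values())) * \
          math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def combined_sim(a, b, sem_sim, alpha=0.6):
    """Linear weighting: alpha * semantic score + (1 - alpha) * cosine.
    sem_sim is a placeholder for the HowNet-based score."""
    return alpha * sem_sim(a, b) + (1 - alpha) * cosine_sim(a, b)
```

For example, `combined_sim(a, b, lambda x, y: 1.0, alpha=0.5)` blends a perfect semantic score with the surface cosine score of the two vectors.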
Citations: 1
Two Ontological Approaches to Building an Intergrated Semantic Network for Yami ka-Verbs
Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.26
Meng-Chien Yang, Si-Wei Huang, D. V. Rau
This paper describes a proposed ontological language-processing system for integrating two semantic sets for a group of important verbs with the prefix ka- in Yami, an Austronesian language of Taiwan. The two semantic sets represent two different classification approaches: one follows the concepts and rules of WordNet, and the other uses the metaphors of Yami indigenous knowledge. The ontologies are used for classification and semantic integration, and the results of the implementation are used to build the Yami lexical database. This paper illustrates how the methodology and framework used in classifying Yami can be applied to Austronesian language processing.
Citations: 0
Research of Noun Phrase Coreference Resolution
Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.32
Junwei Gao, Fang Kong, Peifeng Li, Qiaoming Zhu
Coreference resolution is an important subtask in natural language processing systems: determining whether two expressions in natural language refer to the same real-world entity. Machine learning approaches to this problem have been reasonably successful, operating primarily by recasting the problem as a classification task. A great deal of research has been done on this task for English, using approaches ranging from linguistics-based to machine-learning-based. In Chinese, however, much less work has been done. The lack of public resources is a major obstacle for Chinese NLP research, and some features are harder to obtain than their English counterparts. In this paper, we present a noun phrase coreference system that follows the work of Soon et al. (2001), and we explore the impact of various features on its performance. Experiments on the Chinese portion of OntoNotes 3.0 show that the platform achieves good performance.
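The mention-pair recasting the abstract refers to can be sketched as follows; the three features and the stand-in decision rule are invented for illustration (Soon et al. (2001) train a decision tree over a much richer feature set):

```python
def pair_features(antecedent, anaphor):
    """Turn an (antecedent, anaphor) mention pair into a feature dict."""
    return {
        "string_match": antecedent["text"].lower() == anaphor["text"].lower(),
        "sent_distance": anaphor["sent"] - antecedent["sent"],
        "both_proper": antecedent["proper"] and anaphor["proper"],
    }

def coreferent(feats):
    """Stand-in decision rule in place of a trained classifier."""
    return feats["string_match"] or \
        (feats["both_proper"] and feats["sent_distance"] <= 1)

m1 = {"text": "Obama", "sent": 0, "proper": True}
m2 = {"text": "obama", "sent": 2, "proper": True}
decision = coreferent(pair_features(m1, m2))  # True via string match
```

A real system generates such pairs for every anaphor and its candidate antecedents, then links each anaphor to its closest positively classified antecedent.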
Citations: 1
Automatic Construction of Chinese-Mongolian Parallel Corpora from the Web Based on the New Heuristic Information
Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.17
Zede Zhu, Miao Li, Lei Chen, Shouguo Zheng
Parallel corpora are important resources in the data-driven natural language processing domain. Owing to issues of scale, comprehensiveness, and timeliness, the existing Chinese-Mongolian parallel corpora are significantly limited in practical use. Reviewing the traditional heuristic information used to identify parallel web pages for major languages, this paper focuses on exploring new heuristic information to improve the identification of Chinese-Mongolian parallel pages. Based on these heuristics, a support vector machine classifies web pages as parallel or non-parallel. Experiments achieve a precision of 95% and a recall of 88%. This paper presents preliminary research on automatically constructing parallel corpora for minority languages from the web.
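The SVM classification step can be sketched with a tiny stand-in: a Pegasos-style sub-gradient trainer for a linear SVM over invented heuristic features (the three feature columns, length ratio, URL similarity, and anchor overlap, and all values below are illustrative assumptions, not the paper's actual heuristics):

```python
def train_linear_svm(X, y, lam=0.01, epochs=500):
    """Pegasos-style sub-gradient training; y must contain +1/-1 labels."""
    w, b, t = [0.0] * len(X[0]), 0.0, 0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            t += 1
            eta = 1.0 / (lam * t)
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            w = [(1.0 - eta * lam) * wj for wj in w]  # regularization shrink
            if margin < 1:  # hinge-loss sub-gradient step
                w = [wj + eta * yi * xj for wj, xj in zip(w, xi)]
                b += eta * yi
    return w, b

def predict(w, b, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Fabricated page-pair features: [length ratio, URL similarity, anchor overlap]
X = [[0.95, 0.90, 0.85], [0.90, 0.92, 0.80], [0.98, 0.88, 0.90],   # parallel
     [0.10, 0.05, 0.15], [0.05, 0.10, 0.00], [0.20, 0.15, 0.10]]   # non-parallel
y = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(X, y)
```

A production system would use an off-the-shelf SVM implementation with a kernel rather than this hand-rolled linear trainer; the sketch only shows the parallel/non-parallel decision framing.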
Citations: 2
Research on Element Sub-sentence in Chinese-English Patent Machine Translation
Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.29
Zhiying Liu, Yaohong Jin, Yu-huan Chi
This paper presents an approach to translating element sub-sentences, which are widespread in Chinese patent documents. An element sub-sentence is a kind of language chunk within a sentence in which one part is the headword and the others are attributives, or a modifier-head phrase of VP+NP or NP+VP structure. In HNC theory, element sub-sentences fall into three types: EK (predicate), GBK1 (subject), and GBK2 (object) sub-sentences. We give a method for detecting sub-sentence boundaries, analyze the structural and semantic characteristics of each type, and derive Chinese-English translation rules for each type. Using these processing strategies, most Chinese-English translation problems involving element sub-sentences can be solved well in an online patent MT system at SIPO.
Citations: 4
Graph-Based Language Model of Long-Distance Dependency
Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.49
Faguo Zhou, Xingang Yu
In natural language processing and related fields, classic text representation methods seldom consider the role of word order and long-distance dependency in the semantic representation of texts. This paper discusses the current situation and problems of statistical language models, especially the head-driven statistical language model and Head-driven Phrase Structure Grammar (HPSG). It then briefly introduces methods for developing and realizing long-distance dependency language models. Finally, a graph-based long-distance dependency language model is proposed.
Citations: 1
Research on Cross-Document Coreference of Chinese Person Name
Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.30
Ji Ni, Fang Kong, Peifeng Li, Qiaoming Zhu
In reality, different persons often share the same name. Cross-document person coreference resolution requires that all and only the textual mentions of a Person entity be individuated across a collection of text documents. In this paper, we implement a cross-document coreference resolution system for Chinese person names. First, a name identification module recognizes all person names in the texts; then the documents sharing a name are preliminarily classified by rules; finally, similarities within each classification are computed with a vector space model (VSM), and, combined with the prior similarities, the system obtains the final classification. Tested on 30 common Chinese names from the corpus provided by CLP, the system achieves an average F-measure of 85.9%.
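The VSM comparison step can be sketched as follows; the toy tokenised documents and the 0.2 threshold are illustrative assumptions, not the paper's settings:

```python
import math
from collections import Counter

def cosine(u, v):
    """Cosine similarity between two Counter term-frequency vectors."""
    shared = set(u) & set(v)
    num = sum(u[t] * v[t] for t in shared)
    den = math.sqrt(sum(c * c for c in u.values())) * \
          math.sqrt(sum(c * c for c in v.values()))
    return num / den if den else 0.0

# Three toy documents that all mention the same name string.
docs = [["physicist", "quantum", "paper"],
        ["quantum", "physicist", "award"],
        ["football", "striker", "goal"]]
vecs = [Counter(d) for d in docs]

# Group documents whose context similarity crosses the threshold.
THRESHOLD = 0.2
same_person = cosine(vecs[0], vecs[1]) > THRESHOLD   # likely same referent
diff_person = cosine(vecs[0], vecs[2]) <= THRESHOLD  # likely different referent
```

The first two documents share context terms and would be grouped as one referent; the third would form its own group.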
Citations: 1
An Automatic Linguistics Approach for Persian Document Summarization
Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.52
Hossein Kamyar, M. Kahani, Mohsen Kamyar, Asef Poormasoomi
In this paper we propose a novel technique for summarizing a text based on the linguistic properties of its elements and the semantic chains among them. Most summarization approaches rely mainly on statistical properties of text elements such as term frequency. Here we use centering theory, which helps us recognize semantic chains in a text, to propose a new automatic single-document summarization approach. To process a text with centering theory and extract a coherent summary, a processing pipeline must be constructed, consisting of components such as coreference resolution, semantic role labeling, and part-of-speech (POS) tagging.
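The pipeline architecture can be sketched as a chain of stages, each enriching a shared document object; the stage names follow the abstract, while their bodies are placeholders (a real component would produce actual annotations):

```python
def pos_tagging(doc):
    doc["pos"] = []    # placeholder: part-of-speech tags per token
    return doc

def coreference_resolution(doc):
    doc["coref"] = []  # placeholder: mention-to-entity links
    return doc

def semantic_role_labeling(doc):
    doc["srl"] = []    # placeholder: predicate-argument structures
    return doc

def run_pipeline(text, stages):
    """Thread a document dict through each stage in order."""
    doc = {"text": text}
    for stage in stages:
        doc = stage(doc)
    return doc

doc = run_pipeline("sample text",
                   [pos_tagging, coreference_resolution, semantic_role_labeling])
```

Ordering matters in such a pipeline: POS tags typically feed the later stages, so tagging runs first here.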
Citations: 2
Research on the Uyghur Information Database for Information Processing
Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.79
Yusup Ebeydulla, Hesenjan Abliz, Azragul Yusup
Although "grammatical rule + dictionary" is the traditional pattern for natural language processing, it can be hard to explain the combination of words in language this way. If all word combinations are entered into a database, the grammar and the information system are simplified. Based on a review of the "little grammar in the big word storehouse" approach, this paper discusses the necessity, methods, and principles of establishing a phrase information database for the Uyghur language.
Citations: 3