Pub Date: 2010-09-30 | DOI: 10.1109/NLPKE.2010.5587846
Haijun Zhang, Heyan Huang, Chao-Yong Zhu, Shumin Shi
This paper proposes a pragmatic model for repeat-based Chinese New Word Extraction (NWE). It contains two innovations. The first is a formal description of the NWE process, which provides theoretical guidance on feature selection; on this basis, the Conditional Random Fields (CRF) model is selected as the statistical framework that realizes the formal description. The second is an improved algorithm for left (right) entropy that raises the efficiency of NWE: compared with the baseline algorithm, the improved algorithm speeds up the entropy computation remarkably. Overall, experiments show that the proposed model is effective, with an F-score of 49.72% in the open test and 69.83% in word extraction, an evident improvement over previous similar work.
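The abstract does not spell out the improved entropy algorithm; as a reference point, a minimal baseline sketch of right (branching) entropy, the statistic the paper speeds up, might look like the following (left entropy is symmetric, over preceding characters):

```python
from collections import Counter
from math import log2

def right_entropy(corpus: str, candidate: str) -> float:
    """Entropy of the character distribution immediately to the right of
    `candidate`. High entropy suggests the candidate's right edge is a word
    boundary, a cue used in repeat-based new word extraction."""
    successors = Counter()
    start = corpus.find(candidate)
    while start != -1:
        nxt = start + len(candidate)
        if nxt < len(corpus):
            successors[corpus[nxt]] += 1
        start = corpus.find(candidate, start + 1)
    total = sum(successors.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * log2(c / total) for c in successors.values())

# Toy corpus: "自然语言" is followed by 处 twice and 学 once.
text = "自然语言处理很有趣，自然语言处理很难，自然语言学。"
print(round(right_entropy(text, "自然语言"), 3))  # → 0.918
```

The baseline above rescans the corpus for every candidate; the paper's contribution is an algorithm that avoids this repeated work.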
Title: "A pragmatic model for new Chinese word extraction"
Published in: Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering (NLPKE-2010)
Pub Date: 2010-09-30 | DOI: 10.1109/NLPKE.2010.5587864
N. Bourbakis
The efficient processing, association, and understanding of multimedia-based events or multi-modal information is an important research field with a great variety of applications, such as knowledge discovery, document understanding, and human-computer interaction. A good approach to this issue is the development of a common platform that converts different modalities (such as images and text) into the same medium and associates them for efficient processing and understanding. This talk presents the development of a methodology capable of automatically converting images into natural language (NL) text sentences, using image processing and analysis methods and attributed graphs for object recognition and image understanding; the graph representations are then converted into NL text sentences. Moreover, it presents a methodology for transforming NL sentences into graph representations and then into Stochastic Petri Net (SPN) descriptions, in order to offer a common model for representing multimodal information and, at the same time, a way of associating "activities or changes" across image frames for event representation and interpretation. The SPN graph model is selected for its capability to efficiently represent structural and functional knowledge where other models cannot. Simple illustrative examples are provided as proof of concept.
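The graph-to-text step can be illustrated with a toy verbalizer. The triple format and the single sentence template below are illustrative assumptions, far simpler than the attributed-graph machinery the talk describes:

```python
def graph_to_sentences(edges):
    """Verbalize an attributed object graph, given as (subject, relation,
    object) triples, into simple declarative NL sentences."""
    return [f"The {s} {rel} the {o}." for s, rel, o in edges]

# A two-edge scene graph for a single image frame.
print(graph_to_sentences([("ball", "is above", "table"),
                          ("cup", "is next to", "ball")]))
```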
Title: "Image understanding for converting images into natural language text sentences"
Published in: Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering (NLPKE-2010)
Pub Date: 2010-09-30 | DOI: 10.1109/NLPKE.2010.5587785
Ling Xia, F. Ren
This paper proposes a query expansion method for a cooking question answering system based on pragmatic analysis. In our approach, the results of question analysis are used: the original queries are generated from the question subject, and the query terms are then expanded based on pragmatic function. When the expanded queries are submitted to the Google search engine to retrieve related passages, we obtain an overall improvement of 36.2% in mean average precision.
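A sketch of the expansion step, assuming a hand-built pragmatic-function lexicon; the lexicon contents and term names here are illustrative, since the paper's actual expansion resources are not described in the abstract:

```python
# Hypothetical lexicon mapping a query term to terms serving the same
# pragmatic function in cooking questions.
PRAGMATIC_LEXICON = {
    "make": ["cook", "prepare", "recipe"],
    "dumplings": ["jiaozi"],
}

def expand_query(subject_terms):
    """Expand the original query terms (drawn from the question subject)
    with their pragmatic-function alternatives."""
    expanded = list(subject_terms)
    for term in subject_terms:
        expanded.extend(PRAGMATIC_LEXICON.get(term, []))
    return expanded

print(expand_query(["make", "dumplings"]))
# → ['make', 'dumplings', 'cook', 'prepare', 'recipe', 'jiaozi']
```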
Title: "Pragmatic analysis based query expansion for Chinese cuisine QA service system"
Published in: Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering (NLPKE-2010)
Pub Date: 2010-09-30 | DOI: 10.1109/NLPKE.2010.5587801
Hui-Ngo Goh, Ching Kiu
Ontology construction often requires a domain-specific corpus for conceptualizing the domain knowledge; specifically, an ontology is an association of terms, relations between terms, and related instances. Identifying a list of significant terms is a vital task in constructing a practical ontology. In this paper, we present a context-based term identification and extraction methodology for ontology construction from text documents. The methodology uses a taxonomy and Wikipedia to support automatic term identification and extraction from structured documents, under the assumption that candidate terms for a topic are often associated with its topic-specific keywords. A hierarchical relationship of super-topics and sub-topics is defined by the taxonomy, while Wikipedia provides context and background knowledge for the topics defined in the taxonomy to guide term identification and extraction. The experimental results show that the context-based term identification and extraction methodology is viable for defining topic concepts and their sub-concepts when constructing an ontology, and that it remains viable in a small-corpus / small-text-size environment.
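The core assumption, that candidate terms co-occur with topic-specific keywords, can be sketched as a simple overlap score. The keyword list below is a toy stand-in for text drawn from a topic's Wikipedia article; the scoring function is an assumption, not the paper's exact measure:

```python
def context_score(candidate_context, topic_keywords):
    """Fraction of a topic's context keywords that appear among the words
    surrounding a candidate term; higher overlap suggests the candidate
    belongs under that topic in the taxonomy."""
    context = {w.lower() for w in candidate_context}
    keywords = {w.lower() for w in topic_keywords}
    return len(context & keywords) / max(len(keywords), 1)

# Toy keywords a "machine learning" topic page might yield.
ml_keywords = ["model", "training", "data", "algorithm"]
print(context_score(["the", "model", "is", "trained", "on", "data"],
                    ml_keywords))  # → 0.5
```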
Title: "Context-based term identification and extraction for ontology construction"
Published in: Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering (NLPKE-2010)
Pub Date: 2010-09-30 | DOI: 10.1109/NLPKE.2010.5587804
Keliang Zhang, Qinlong Fei
Ontology-based knowledge bases play an increasingly important role in improving the precision and recall of retrieval systems. Based on Distributed Learning theory, a novel approach to the co-construction of an ontology-based knowledge base is explored. Using a platform set up for the co-construction and sharing of domain-specific knowledge through the Web, we constructed an ontology-based knowledge base for the airborne radar field. This study is expected to contribute to an effective improvement of the precision and recall of information retrieval in the airborne radar field. Hopefully, the mode we designed and adopted for the co-construction and sharing of a domain-specific knowledge base can inform other similar studies.
Title: "Co-construction of ontology-based knowledge base through the Web: Theory and practice"
Published in: Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering (NLPKE-2010)
Pub Date: 2010-09-30 | DOI: 10.1109/NLPKE.2010.5587848
Yuan Kuang, Yanquan Zhou, Huacan He
This paper addresses another aspect of sentiment analysis: identifying the opinion_holder in opinionated sentences. To extract the opinion_holder, we first explore a Conditional Random Field (CRF) based on six features (contextual features, opinionated_trigger words, POS tags, named entities, dependency, and a proposed sentence-structure feature), where the dependency feature is adjusted to better capture contextual dependency information. We then propose two novel syntactic rules using opinionated_trigger words to identify the opinion_holder directly from parse trees. The results show that the precision of the CRF is much higher than that of the syntactic rules, while its recall is lower. We therefore combine the CRF with the syntactic rules, whose outputs serve as three additional features (HolderNode, ChunkPosition, and Paths) for training the CRF model. The combined system achieves higher recall and higher F-measure at almost the same high precision.
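One way to realize the combination is to expose the rule outputs as extra token features for the CRF. The `holder_node` flag below mirrors the paper's HolderNode idea, but the exact feature definitions are assumptions for illustration:

```python
def token_features(tokens, idx, rule_holder_candidates):
    """Feature dict for one token of a CRF opinion-holder tagger.
    `rule_holder_candidates` is the set of tokens the syntactic rules
    propose as holders; the holder_node flag injects that prediction
    into the statistical model as a feature."""
    tok = tokens[idx]
    return {
        "word": tok.lower(),
        "is_capitalized": tok[0].isupper(),
        "holder_node": tok in rule_holder_candidates,
        "relative_position": idx / len(tokens),
    }

sent = ["John", "criticized", "the", "plan"]
print(token_features(sent, 0, {"John"}))
```

A CRF toolkit would consume one such dict per token per sentence at training and tagging time.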
Title: "A combination method of CRF with syntactic rules to identify opinion_holder"
Published in: Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering (NLPKE-2010)
Pub Date: 2010-09-30 | DOI: 10.1109/NLPKE.2010.5587770
Yu Zou, Jiyuan Wu, W. He, Min Hou, Yonglin Teng
The interrelation between prosody and syntax is becoming more and more important in speech processing. This paper analyzes the syntactic correlations of the prosodic phrase in broadcast news speech. The research yields the following results. First, the C-PP, which exhibits a stable prosodic pitch-contour pattern within its rhythmic chunking, has a flexible syntactic structure and stable semantic expression. Second, we find that the syntactic structure is more complex than the prosodic structure, and that some conjunctions and particles tend to attach to the end of the left structure or the beginning of the right one to form a prosodic word; if a chunk has just four lexical words including the conjunction or particle, they form a prosodic word by themselves. That is to say, conjunctions and particles show great flexibility in prosodic structure.
Title: "Syntactic correlations of prosodic phrase in broadcasting news speech"
Published in: Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering (NLPKE-2010)
Pub Date: 2010-09-30 | DOI: 10.1109/NLPKE.2010.5587784
Song Gao, Yiyi Zhao, Haitao Liu, Zhiwei Feng
Following Potential Ambiguity Theory, we analyze the "prep+n1+de+n2" phrase in this paper. We focus on how to make a computer automatically detect and process this kind of syntactically ambiguous structure. The purpose is to improve the accuracy of automatic identification and analysis of natural language. At the same time, we take this structure as an example to aid the study of other potentially ambiguous structures.
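Detection of the structure can be sketched as a window match over POS-tagged tokens. The toy tagset (p = preposition, n = noun, u = the particle 的) is an assumption; disambiguating the detected spans is the harder problem the paper addresses:

```python
def find_pattern(tagged):
    """Flag potentially ambiguous prep+n1+de+n2 spans in a POS-tagged
    sentence. `tagged` is a list of (word, tag) pairs."""
    hits = []
    for i in range(len(tagged) - 3):
        tags = [t for _, t in tagged[i:i + 4]]
        if tags == ["p", "n", "u", "n"]:
            hits.append([w for w, _ in tagged[i:i + 4]])
    return hits

# 对学生的帮助: "help for the students" or "the students' help"?
sent = [("对", "p"), ("学生", "n"), ("的", "u"), ("帮助", "n")]
print(find_pattern(sent))  # → [['对', '学生', '的', '帮助']]
```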
Title: "A study on disambiguation of structure "prep+n1+de+n2" for Chinese information processing"
Published in: Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering (NLPKE-2010)
Pub Date: 2010-09-30 | DOI: 10.1109/NLPKE.2010.5587777
Dipankar Das, Sivaji Bandyopadhyay
This paper describes an unsupervised hybrid approach to identifying emotion topic(s) in English blog sentences. The baseline system is based on object-related dependency relations from parsed constituents. The inclusion of the topic-related thematic roles present in verb-based syntactic argument structure improves the performance of the baseline system; the argument structures are extracted using VerbNet. The unsupervised hybrid approach consists of two phases. First, Rhetorical Structure (RS) information is extracted to identify the target span corresponding to the emotional expression in each sentence. Second, since an individual target span may contain one or more topics corresponding to an emotional expression, a Heuristic Classifier (HC) is designed to identify each of the topic spans associated with the target span. The classifier uses information on the Emotion Holder (EH), Named Entities (NE), and four types of similarity features to identify the phrase-level components of the topic spans. The system achieves an average recall, precision, and F-score of 60.37%, 57.49%, and 58.88%, respectively, across all emotion classes on 500 annotated sentences containing single or multiple emotion topics.
Title: "Identifying emotion topic — An unsupervised hybrid approach with Rhetorical Structure and Heuristic Classifier"
Published in: Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering (NLPKE-2010)
Pub Date: 2010-09-30 | DOI: 10.1109/NLPKE.2010.5587859
Xinsheng Li, Si Li, Weiran Xu, Guang Chen, Jun Guo
Relevance feedback, which traditionally uses terms from the relevant documents to enrich the user's initial query, is an effective method for improving retrieval performance. However, this approach has a problem: relevance feedback assumes that the most frequent terms in the feedback documents are useful for retrieval. In fact, experimental reports show that this assumption does not hold in practice; many expansion terms identified by traditional approaches are unrelated to the query and harmful to retrieval. In this paper, we propose to select better and more relevant documents with a clustering algorithm, and we then present an improved language model that helps identify the good terms in those relevant documents. Our experiments on the 2008 TREC collection show that retrieval effectiveness improves considerably when the improved language model is used.
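The term-selection idea can be illustrated with a generic language-model heuristic that ranks terms by how much more probable they are in the feedback documents than in the collection; this is a stand-in sketch, not necessarily the paper's exact formulation:

```python
from collections import Counter

def expansion_terms(feedback_docs, collection_docs, k=3):
    """Rank candidate expansion terms by the ratio of their probability in
    the feedback documents to their (add-one smoothed) probability in the
    whole collection, and return the top k."""
    fb = Counter(w for d in feedback_docs for w in d.split())
    coll = Counter(w for d in collection_docs for w in d.split())
    fb_total, coll_total = sum(fb.values()), sum(coll.values())
    scored = {
        w: (fb[w] / fb_total) / ((coll[w] + 1) / (coll_total + len(coll)))
        for w in fb
    }
    return [w for w, _ in sorted(scored.items(), key=lambda x: -x[1])[:k]]

feedback = ["solar panel efficiency", "solar cell efficiency"]
collection = feedback + ["the cat sat", "the dog ran"]
print(expansion_terms(feedback, collection, k=2))
```

Terms frequent in the feedback set but also common collection-wide are demoted, which is exactly the failure mode of frequency-only selection that the paper criticizes.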
Title: "Weakly supervised relevance feedback based on an improved language model"
Published in: Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering (NLPKE-2010)