
Latest publications: Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering (NLPKE-2010)

Maximum entropy based emotion classification of Chinese blog sentences
Cheng Wang, Changqin Quan, F. Ren
At present there is a growing body of research on the classification of textual emotion. In particular, with the rapid development of Internet technology, classifying blog emotions has become a new research field. In this paper, we classify sentence emotion using a machine learning method based on the maximum entropy model and the Chinese emotion corpus Ren-CECps. Ren-CECps contains eight basic emotion categories (expect, joy, love, surprise, anxiety, sorrow, hate and anger), which gives us the opportunity to systematically analyze complex human emotions. Three features (keywords, POS and intensity) were considered for sentence emotion classification, and three sets of experiments were carried out: 1) classification of any two emotions, 2) classification of all eight emotions, and 3) classification of positive and negative emotions. The highest classification accuracies in the three experiments were 90.62%, 35.66% and 73.96%, respectively.
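The setup lends itself to a compact illustration. Below is a minimal sketch of maximum entropy classification over keyword, POS and intensity features, using scikit-learn's LogisticRegression as the maxent learner; the toy sentences, feature encoding and labels are assumptions, since Ren-CECps itself is not reproduced here.

```python
# Minimal sketch: maxent emotion classification over keyword/POS/intensity
# features. Toy data stands in for Ren-CECps; the feature encoding is assumed.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression  # multinomial logistic regression = maxent

def features(analysed):
    """analysed: list of (emotion keyword, POS tag, intensity) triples for one sentence."""
    feats = {}
    for word, pos, intensity in analysed:
        feats["kw=" + word] = 1.0                                   # keyword feature
        feats["pos=" + pos] = feats.get("pos=" + pos, 0.0) + 1.0    # POS feature
        feats["intensity"] = feats.get("intensity", 0.0) + intensity  # intensity feature
    return feats

train = [  # hypothetical annotated sentences, one emotion keyword each
    ([("高兴", "a", 0.8)], "joy"),
    ([("担心", "v", 0.6)], "anxiety"),
    ([("讨厌", "v", 0.7)], "hate"),
    ([("期待", "v", 0.5)], "expect"),
]
vec = DictVectorizer()
X = vec.fit_transform([features(s) for s, _ in train])
y = [label for _, label in train]
clf = LogisticRegression(max_iter=1000).fit(X, y)
print(clf.predict(vec.transform([features([("高兴", "a", 0.9)])])))  # expected: ['joy']
```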
Citations: 5
Graph-based service quality evaluation through mining Web reviews
Suke Li, Jinmei Hao, Zhong Chen
This work seeks a solution to a basic research problem: how can service quality be evaluated by mining Web reviews? To address this problem, it proposes a novel approach to service quality evaluation with two essential subtasks: 1) finding the most important service aspects, and 2) measuring service quality using the ranked service aspects. We propose three graph-based ranking models to rank service aspects, and a simple linear method for measuring service quality. Empirical results show that all three methods outperform the Noun Frequency baseline. We also demonstrate the effectiveness of our service quality evaluation method through extensive regression experiments.
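As an illustration of the two subtasks, here is a minimal sketch that ranks aspects with PageRank over a review co-occurrence graph and then computes a linear quality score; the abstract does not specify the paper's three ranking models, so PageRank, the toy reviews, the edge weighting and the scoring formula are all assumptions.

```python
# Minimal sketch: rank service aspects on a co-occurrence graph, then combine
# aspect importance with average review sentiment in a simple linear score.
import networkx as nx

reviews = [  # hypothetical (aspect mentions, review sentiment in [-1, 1]) pairs
    (["service", "staff"], 0.8),
    (["service", "price"], -0.2),
    (["staff", "room"], 0.5),
]

G = nx.Graph()
for aspects, _ in reviews:
    for i, a in enumerate(aspects):            # link aspects co-occurring in one review
        for b in aspects[i + 1:]:
            w = G.get_edge_data(a, b, default={"weight": 0})["weight"]
            G.add_edge(a, b, weight=w + 1)

rank = nx.pagerank(G, weight="weight")         # importance of each service aspect

sent = {}                                      # collect sentiment per aspect
for aspects, s in reviews:
    for a in aspects:
        sent.setdefault(a, []).append(s)

# Linear measure: importance-weighted average of per-aspect sentiment.
quality = sum(rank[a] * sum(v) / len(v) for a, v in sent.items())
print(sorted(rank, key=rank.get, reverse=True), round(quality, 3))
```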
Citations: 0
A MAP-based sentence ranking approach to automatic summarization
Xiaofeng Wu, Chengqing Zong
The current mainstream approach to automatic summarization is to extract sentences: various machine learning methods give each sentence of a document a score, and the highest-scoring sentences are selected according to a ratio. This is quite similar to an increasingly active field, learning to rank. A few pairwise learning-to-rank approaches have been tested for query summarization. In this paper we pioneer a new general summarization approach based on learning to rank, adopting MAP, a widely used evaluation measure in information retrieval (IR), as a listwise optimization objective for extracting sentences from documents. Specifically, we use the SVMMAP toolkit, which gives a globally optimal solution, to train the model and score each sentence. Our experimental results show that our approach greatly outperforms the state-of-the-art pairwise approach using the same features, and is even slightly better than the best reported result, which is based on the sequence labeling approach CRF.
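For reference, the listwise objective itself is easy to state in code. The sketch below computes average precision over a ranked sentence list, counting a sentence as relevant if it appears in the gold summary, and averages across documents; the binary labels are toy assumptions.

```python
# Minimal sketch of the MAP objective: mean average precision over ranked
# sentences, with relevance = membership in the reference summary.
def average_precision(ranked_labels):
    """ranked_labels[i] is 1 if the sentence at rank i+1 is in the gold summary."""
    hits, precisions = 0, []
    for i, rel in enumerate(ranked_labels, start=1):
        if rel:
            hits += 1
            precisions.append(hits / i)   # precision at each relevant rank
    return sum(precisions) / hits if hits else 0.0

def mean_average_precision(docs):
    return sum(average_precision(d) for d in docs) / len(docs)

# Two toy documents: rankings produced by a scorer, judged against gold summaries.
print(mean_average_precision([[1, 0, 1, 0], [0, 1, 1]]))  # ~= 0.708
```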
Citations: 0
Minimum edit distance-based text matching algorithm
Yu Zhao, Huixing Jiang, Xiaojie Wang
This paper proposes a measure, based on Minimum Edit Distance (MED), of the similarity between two sets of MultiWord Expressions (MWEs), which we use to calculate the matching degree between two documents. We test the matching algorithm in a position-search system. Experiments show that the new measure outperforms the cosine distance.
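A minimal sketch of the idea follows: classic MED between two strings, lifted to a set-to-set similarity over MWEs. The greedy best-match aggregation and length normalization are assumptions, since the abstract does not give the paper's exact set-level formulation.

```python
# Minimal sketch: Levenshtein MED plus an assumed set-to-set MWE similarity.
def med(a, b):
    """Classic minimum edit distance via single-row dynamic programming."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,          # delete ca
                                     dp[j - 1] + 1,      # insert cb
                                     prev + (ca != cb))  # substitute
    return dp[len(b)]

def mwe_similarity(set_a, set_b):
    """Average, over both directions, of each MWE's best length-normalized match."""
    def best(x, others):
        return max(1 - med(x, y) / max(len(x), len(y)) for y in others)
    a_side = sum(best(x, set_b) for x in set_a) / len(set_a)
    b_side = sum(best(y, set_a) for y in set_b) / len(set_b)
    return (a_side + b_side) / 2

print(mwe_similarity({"machine learning", "edit distance"},
                     {"minimum edit distance", "machine learning"}))
```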
Citations: 2
Passive and active contribution to multilingual lexical resources through online cultural activities
Mohammad Daoud, K. Kageura, C. Boitet, A. Kitamoto, Daoud M. Daoud
In this paper we propose a contribution scheme for multilingual preterminology, a lexical resource of unconfirmed terminology. We explain the difficulties of building lexical resources collaboratively, and we suggest a scheme of active and passive contributions that eases the process and satisfies the Contribution Factors. We experimented with our passive and active approaches on the Digital Silk Road Project, where we analyzed visitors' behavior to find interesting trends and terminology candidates, and we built a contribution gateway that interacts with visitors in order to motivate them to contribute while carrying out their usual activities.
Citations: 5
Automatic classification of documents by formality
Fadi Abu Sheikha, D. Inkpen
This paper addresses the task of classifying documents into formal or informal style. We studied the main characteristics of each style in order to choose features that allowed us to train classifiers that can distinguish between the two styles. We built our data set by collecting documents for both styles, from different sources. We tested several classification algorithms, namely Decision Trees, Naïve Bayes, and Support Vector Machines, to choose the classifier that leads to the best classification results. We performed attribute selection in order to determine the contribution of each feature to our model.
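A skeleton of that classifier comparison might look like the following; the six mini-documents and the plain bag-of-words features are toy assumptions standing in for the paper's style features and larger corpora.

```python
# Minimal sketch: compare Decision Trees, Naive Bayes and SVM on a toy
# formal-vs-informal data set, picking the model with the best CV accuracy.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score

docs = ["We hereby request your response.", "gonna grab lunch, u in?",
        "The committee shall convene.", "lol that was awesome",
        "Please find the report attached.", "btw see ya tmrw"]
y = ["formal", "informal", "formal", "informal", "formal", "informal"]

X = CountVectorizer().fit_transform(docs)      # toy bag-of-words features
for clf in (DecisionTreeClassifier(), MultinomialNB(), LinearSVC()):
    scores = cross_val_score(clf, X, y, cv=3)  # 3-fold cross-validated accuracy
    print(type(clf).__name__, scores.mean())
```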
Citations: 30
Research on automatic recognition of Tibetan personal names based on multi-features
Yuan Sun, Xiaodong Yan, Xiaobing Zhao, Guosheng Yang
Tibetan names have strong religious and cultural connotations, and their construction differs from that of Chinese names. So far there has been limited research on Tibetan name recognition, and current methods for recognizing Chinese names do not work on Tibetan names. Therefore, through an analysis of the characteristics of Tibetan names, this paper proposes an automatic recognition method for Tibetan names based on multiple features. The method uses the internal features of names, contextual features, and boundary features, and establishes a dictionary and feature base of Tibetan names. Finally, an experiment is conducted, and the results prove the algorithm effective.
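To make the feature types concrete, here is a minimal sketch that combines a name dictionary (internal feature) with contextual and boundary cues in a toy recognizer; the romanized lexicon entries, cue words, weights and threshold are all hypothetical, and a real system would work on Tibetan script with weights learned from the paper's dictionary and feature base.

```python
# Minimal sketch of multi-feature name spotting; lexicon, cues, weights and
# threshold are hypothetical stand-ins for the paper's learned resources.
NAME_DICT = {"Tashi", "Drolma"}       # internal feature: known-name lexicon
BOUNDARY_CUES = {"lags", "said"}      # boundary feature: words flanking names

def token_features(tokens, i):
    return {
        "in_dict": tokens[i] in NAME_DICT,
        "prev_cue": i > 0 and tokens[i - 1] in BOUNDARY_CUES,
        "next_cue": i + 1 < len(tokens) and tokens[i + 1] in BOUNDARY_CUES,
        "prev_word": tokens[i - 1] if i > 0 else "<S>",   # contextual feature
    }

def recognize(tokens):
    names = []
    for i in range(len(tokens)):
        f = token_features(tokens, i)
        score = 2 * f["in_dict"] + f["prev_cue"] + f["next_cue"]  # toy weights
        if score >= 2:                                            # toy threshold
            names.append(tokens[i])
    return names

print(recognize(["Tashi", "lags", "arrived", "with", "Drolma"]))  # ['Tashi', 'Drolma']
```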
Citations: 11
A lightweight Chinese semantic dependency parsing model based on sentence compression
Xin Wang, Weiwei Sun, Zhifang Sui
This paper is concerned with lightweight semantic dependency parsing for Chinese. We propose a novel sentence compression based model for semantic dependency parsing that uses no syntactic dependency information. Our model divides semantic dependency parsing into two sequential subtasks: sentence compression and semantic dependency recognition. Sentence compression extracts the backbone of the sentence, passing candidate argument heads to the next step; the bilexical semantic relations between the words of the compressed sentence and the predicates are then recognized in a pairwise fashion. We present encouraging results on the Chinese data set from the CoNLL 2009 shared task: without any syntactic information, our semantic dependency parsing model still outperforms the best system reported in the literature.
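The two-stage pipeline can be sketched as follows; the stopword-based compressor and the position-based role rule are trivial stand-ins for the paper's learned models, and the example sentence and role labels are assumptions.

```python
# Minimal sketch of the two-stage pipeline: compress to candidate argument
# heads, then label each (head, predicate) pair with a semantic role.
def compress(tokens):
    """Stage 1: keep backbone words (here: drop an assumed function-word list)."""
    stop = {"的", "了", "很"}
    return [t for t in tokens if t not in stop]

def label_pairs(backbone, predicate):
    """Stage 2: assign a role to each (candidate head, predicate) pair."""
    roles = {}
    for cand in backbone:
        if cand == predicate:
            continue
        # stand-in rule: words before the predicate -> A0, after -> A1
        roles[cand] = "A0" if backbone.index(cand) < backbone.index(predicate) else "A1"
    return roles

tokens = ["警察", "很", "快", "逮捕", "了", "小偷"]
backbone = compress(tokens)
print(label_pairs(backbone, "逮捕"))  # {'警察': 'A0', '快': 'A0', '小偷': 'A1'}
```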
Citations: 5
Comparative evaluation of two Arabic speech corpora
Y. Alotaibi, A. Meftah
The aim of this paper is to conduct a constructive, comparative evaluation of two important corpora for two different Arabic dialects: a Saudi dialect corpus collected by King Abdulaziz City for Science and Technology (KACST), and a Levantine Arabic dialect corpus. Levantine is spoken by ordinary Lebanese, Jordanian, Syrian, and Palestinian people; the latter corpus was produced by the Linguistic Data Consortium (LDC). The advantages and disadvantages of the two corpora are presented and discussed. This discussion aims to help digital speech processing researchers identify the strengths and weaknesses of these important corpora before using them in experiments. Moreover, this paper may motivate the design, maintenance, distribution, and upgrading of Arabic corpora in support of the Arabic speech research community.
Citations: 12
Augmenting the automated extracted tree adjoining grammars by semantic representation
Heshaam Faili, A. Basirat
MICA [1] is a fast and accurate dependency parser for English that uses an LTAG automatically extracted from the Penn Treebank (PTB) with Chen's approach [7]. However, no semantic representation is attached to its grammar. The XTAG grammar [20], on the other hand, is a hand-crafted LTAG whose elementary trees have been enriched with semantic representations by experts. The linguistic knowledge embedded in the XTAG grammar has led to its use in a wide variety of natural language applications, but the current XTAG parser is neither as fast nor as accurate as the MICA parser.
Citations: 3