Tunisian Arabic aeb Wordnet: Current State and Future Extensions

Nadia Karmani Ben Moussa, Hsan Soussou, A. Alimi
DOI: 10.1109/ACLING.2015.7 (https://doi.org/10.1109/ACLING.2015.7)
Published in: 2015 First International Conference on Arabic Computational Linguistics (ACLing)
Publication date: 2015-04-17
Citations: 6

Abstract

Nowadays, Internet communication, especially informal communication through social networks, blogs, and similar channels, shapes political, economic, financial, and social environments all over the world. Consequently, Internet monitoring is growing in scale, particularly in Tunisia, which has suffered from instability since the political revolution of 2011. In the Tunisian context, Internet communication is characterized by the increasing use of the aeb language (i.e., the Arabic dialect known as Tunisian Arabic). Tunisian Internet monitoring therefore needs, first of all, aeb language processing tools, especially an aeb lexicon. However, few aeb lexicons have been developed, given the lack of written resources. Some of these lexicons are derived from Arabic lexicons: they cover the part of the aeb lexicon that is originally Arabic and ignore the large borrowed part. Others are built from the informal Web and still need rigorous linguistic verification, correction, and validation. To address this, we propose building a standard, large, and robust Wordnet that takes phonetics into account. Our Wordnet is created with the expand approach used for building EuroWordnet, as in [12], based on the bilingual English-Tunisian Arabic Peace Corps dictionary prepared by the linguists R. Ben Abdelkader, A. Ayed, and A. Naouar [13], and on the latest version of Princeton Wordnet, PWN 3.1. Moreover, it is modeled according to ISO-LMF using a suitable Wordnet-LMF model for the aeb language. In this paper, we present the aeb Wordnet building approach, describe its current state, and propose extensions.
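
The expand approach mentioned in the abstract can be illustrated with a minimal sketch: keep the synset structure of an existing English wordnet and attach target-language words to each synset through a bilingual dictionary. Everything below is an assumption made for illustration and not the authors' implementation: the synset identifiers are invented (they are not PWN 3.1 offsets), the dictionary entries are toy transliterations rather than data from the Peace Corps dictionary, and the function name `expand` is hypothetical.

```python
# Toy stand-in for an English wordnet: synset id -> English lemmas.
# The ids are invented for this sketch; they are NOT real PWN 3.1 identifiers.
TOY_PWN = {
    "n-0001": ["dog", "domestic dog"],
    "n-0002": ["house"],
    "n-0003": ["car", "automobile"],
    "n-0004": ["quark"],  # no dictionary entry: stays uncovered
}

# Toy English -> aeb dictionary (rough transliterations, illustrative only;
# not taken from the Peace Corps dictionary).
TOY_DICT = {
    "dog": ["kalb"],
    "house": ["dar"],
    "car": ["karhba"],
    "automobile": ["karhba"],
}


def expand(pwn, bilingual):
    """Build target-language synsets by translating each English synset.

    For every source synset, look up each English lemma in the bilingual
    dictionary and collect the translations (deduplicated, in first-seen
    order). Synsets with no translation are left out of the result.
    """
    target_synsets = {}
    for synset_id, lemmas in pwn.items():
        translations = []
        for lemma in lemmas:
            for word in bilingual.get(lemma, []):
                if word not in translations:
                    translations.append(word)
        if translations:
            target_synsets[synset_id] = translations
    return target_synsets


aeb_synsets = expand(TOY_PWN, TOY_DICT)
uncovered = sorted(set(TOY_PWN) - set(aeb_synsets))
print(aeb_synsets)
print("uncovered synsets:", uncovered)
```

Because the synset identifiers are inherited from the source wordnet, the semantic relations defined between English synsets carry over to the new wordnet without extra work. The cost is that coverage depends on the bilingual dictionary, and a real pipeline would still need manual validation of each mapping.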

Recent papers from the same venue
Which Configuration Works Best? An Experimental Study on Supervised Arabic Twitter Sentiment Analysis
Increasing the Accuracy of Opinion Mining in Arabic
Tunisian Arabic aeb Wordnet: Current State and Future Extensions
A Named Entities Recognition System for Modern Standard Arabic using Rule-Based Approach
Transducers Cascades for an Automatic Recognition of Arabic Named Entities in Order to Establish Links to Free Resources