ALIF editor for generating Arabic normalized lexicons

Samia Ben Ismail, Hajer Maraoui, K. Haddar, Laurent Romary
Published in: 2017 8th International Conference on Information and Communication Systems (ICICS), 2017-04-01
DOI: 10.1109/IACS.2017.7921948 (https://doi.org/10.1109/IACS.2017.7921948)
Citations: 8

Abstract

Developing a normalized morpho-syntactic Arabic lexicon is not an easy task. Many norms exist for structuring and representing lexical data, and the adoption of a stable standard will guarantee the interoperability and interchangeability of lexical resources. Still, research on normalization for Arabic lexical resources is not yet well developed, especially for some standards such as the TEI (Text Encoding Initiative). In this context, we aim to create an Arabic lexicon editor with a constraint checker based on both the ISO standard LMF (Lexical Markup Framework) and the TEI guidelines. To develop this editor, we use a linguistic approach composed of several steps. The editor's prototype, named ALIF, can generate two types of output lexicon files: one in LMF and the other in TEI. The system is evaluated on a lexical database that contains all the derived and inflected forms generated from a lexicon of 10,000 canonical verbs. The results obtained were encouraging despite some flaws related to exceptional cases of difficult words.
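
To make the idea of one lexical entry serialized in two formats more concrete, here is a minimal Python sketch. It is not ALIF's implementation, which the abstract does not describe. The entry dictionary, the `check` function, and the choice of required fields are hypothetical; the element names (`LexicalEntry`, `Lemma`, `WordForm`, `feat` on the LMF side, `entry`, `form`, `orth`, `gramGrp`, `pos` on the TEI side) are assumed to follow the usual LMF and TEI dictionary vocabularies, and the output is a fragment, not a complete valid file.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory entry; the field names are illustrative assumptions.
ENTRY = {"lemma": "كتب", "pos": "verb", "forms": ["يكتب"]}

# Assumed constraint: every entry needs a lemma and a part of speech.
REQUIRED = ("lemma", "pos")


def check(entry):
    """Toy constraint check: list required fields that are missing or empty."""
    return [key for key in REQUIRED if not entry.get(key)]


def feat(parent, att, val):
    """Attach an LMF-style attribute/value pair to an element."""
    ET.SubElement(parent, "feat", att=att, val=val)


def to_lmf(entry):
    """Build an LMF-style LexicalEntry fragment."""
    lexical_entry = ET.Element("LexicalEntry")
    feat(lexical_entry, "partOfSpeech", entry["pos"])
    lemma = ET.SubElement(lexical_entry, "Lemma")
    feat(lemma, "writtenForm", entry["lemma"])
    for written in entry.get("forms", []):
        word_form = ET.SubElement(lexical_entry, "WordForm")
        feat(word_form, "writtenForm", written)
    return lexical_entry


def to_tei(entry):
    """Build a TEI-style dictionary entry fragment."""
    tei_entry = ET.Element("entry")
    lemma = ET.SubElement(tei_entry, "form", type="lemma")
    ET.SubElement(lemma, "orth").text = entry["lemma"]
    gram = ET.SubElement(tei_entry, "gramGrp")
    ET.SubElement(gram, "pos").text = entry["pos"]
    for written in entry.get("forms", []):
        inflected = ET.SubElement(tei_entry, "form", type="inflected")
        ET.SubElement(inflected, "orth").text = written
    return tei_entry


if __name__ == "__main__":
    problems = check(ENTRY)
    if problems:
        print("Missing fields:", problems)
    else:
        print(ET.tostring(to_lmf(ENTRY), encoding="unicode"))
        print(ET.tostring(to_tei(ENTRY), encoding="unicode"))
```

Keeping the entry in one neutral structure and writing a separate serializer per format is only one possible design; it means the constraint check runs once, before either output is produced.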