Design and Development of Enhanced Morphological Analyzer for Ge’ez Verbs Using Memory Based Learning Algorithms

Gebremeskel Hagos Gebremedhin, F. Wang
The International Journal of Science & Technoledge, published 2020-07-31. DOI: 10.24940/theijst/2020/v8/i7/st2007-001

Abstract

This paper presents the design and development of a morphological analyzer for Ge’ez verbs. Ge’ez is the classical language of Ethiopia and is still used as the liturgical language of the Ethiopian Orthodox Tewahedo Church. Much ancient literature, including religious texts and secular writings, was written in Ge’ez, and it records the ancient philosophy, traditions, history and knowledge of Ethiopia. A morphological analyzer is one of the most important basic tools in the automatic processing of any human language: it analyzes the naturally occurring word forms in a sentence and identifies the root word and its features. In this paper, memory-based learning (MBL) is used to analyze the morphology of Ge’ez verbs automatically, with machine learning applied to both training and analysis. TiMBL’s IB2 and TRIBL2 algorithms were used for the implementation. The performance of the system was evaluated using 10-fold cross-validation with both default and optimized parameter settings. With optimized parameters, the overall accuracy of IB2 and TRIBL2 was 94.24% and 93.31%, respectively. The overall precision, recall and F-score with optimized parameters using IB2 were 55.6%, 56.3% and 59.95%, respectively; for TRIBL2 they were 58.8%, 60.3% and 59.54%, respectively. A learning curve was also drawn, and it showed that accuracy on unseen data increases as the size of the training dataset increases. The results therefore indicate that the IB2 algorithm performs better than the TRIBL2 algorithm for Ge’ez verb morphology.
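The F-scores reported above can be checked against the usual harmonic-mean definition. The sketch below assumes the paper uses the balanced F-score (F1); the abstract does not say so, and the function name `f_score` is purely illustrative. Under that assumption, the TRIBL2 precision and recall reproduce the reported 59.54%, while the IB2 precision and recall give about 55.95% rather than the reported 59.95%, so one of the IB2 figures may have been misprinted in the source.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score (F1): the harmonic mean of precision and recall.

    Both inputs and the result use the same scale (here, percentages).
    Assumption: the paper reports F1; the abstract does not specify beta.
    """
    if precision + recall == 0:
        # Avoid division by zero when both inputs are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract (optimized parameters).
reported = {
    "IB2": (55.6, 56.3),
    "TRIBL2": (58.8, 60.3),
}

for algorithm, (precision, recall) in reported.items():
    print(f"{algorithm}: F1 = {f_score(precision, recall):.2f}%")
```

The harmonic mean always lies between the precision and the recall, which is why an F1 value above both inputs cannot arise from this definition.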