Comparative Analysis of Text Mining Classification Algorithms for English and Indonesian Qur’an Translation

R. Hidayat, Sekar Minati
Journal: IJID International Journal on Informatics for Development
Published: 2019-06-22
DOI: 10.14421/IJID.2019.08108 (https://doi.org/10.14421/IJID.2019.08108)
Citations: 6

Abstract

The Qur'an, As-Sunnah, and classical Islamic books are the sources of knowledge, wisdom, and law for followers of Islam. In daily life, however, many Muslims still do not understand the meaning of the sentences in the Qur'an even though they read it every day. This poses a challenge for academics in science and engineering, especially informatics: to explore and represent knowledge through intelligent computing systems that can answer questions based on knowledge from the Qur'an. This research creates an enabling computational environment for text mining the Qur'an, with the purpose of helping people understand each verse. The classification experiment uses the Support Vector Machine (SVM), Naïve Bayes, k-Nearest Neighbor (kNN), and J48 Decision Tree classifiers on the verses of Al-Baqarah translated into English and Indonesian. The dataset was labeled according to the three most fundamental aspects of Islam: 'Iman' (faith), 'Ibadah' (worship), and 'Akhlaq' (virtues). The Indonesian translation was pre-processed with the Sastrawi package in Python, and the algorithms were run in WEKA using the StringToWordVector filter with the TF-IDF method. The experiments measure accuracy and F-measure using a percentage split, with 66% of the data used for training and the rest for testing. The results show that the SVM classifier has the best overall accuracy for the Indonesian translation, at 81.443%, while the Naïve Bayes classifier has the best accuracy for the English translation, at 78.35%.
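
The evaluation setup described in the abstract (TF-IDF weighting, a 66% percentage split, accuracy, and F-measure) can be illustrated with a minimal Python sketch. This is not the authors' WEKA workflow: the function names, the toy tokens, and the example labels are hypothetical, the TF-IDF formula is a common textbook variant that may differ from the one WEKA applies, and the split keeps the original order for simplicity, whereas the tool used in the study may shuffle the data first.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Weight each term of each tokenized document by tf * log(N / df).

    Assumption: plain term count times log inverse document frequency;
    the exact weighting used in the study is not stated in the abstract.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    return [
        {term: count * math.log(n_docs / doc_freq[term])
         for term, count in Counter(doc).items()}
        for doc in docs
    ]


def percentage_split(items, train_fraction=0.66):
    """Split items into a training part and a testing part, keeping order."""
    cut = round(len(items) * train_fraction)
    return items[:cut], items[cut:]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f_measure(y_true, y_pred, label):
    """Harmonic mean of precision and recall for a single class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    if tp == 0:
        return 0.0  # no correct positives: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy tokens, not taken from the study's dataset.
docs = [["faith", "prayer"], ["prayer", "charity"], ["faith", "faith"]]
weights = tf_idf(docs)

# 286 is the number of verses in Al-Baqarah; whether the study used
# every verse is an assumption made only for this illustration.
train, test = percentage_split(list(range(286)))

# Hypothetical true and predicted labels for four verses.
y_true = ["Iman", "Ibadah", "Akhlaq", "Iman"]
y_pred = ["Iman", "Ibadah", "Iman", "Iman"]
acc = accuracy(y_true, y_pred)
f_iman = f_measure(y_true, y_pred, "Iman")
```

In this sketch the F-measure is computed per class; a single figure such as a weighted average over 'Iman', 'Ibadah', and 'Akhlaq' would need an extra aggregation step, and the abstract does not say which variant was reported.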