Feature Selection Technique to Improve the Instances Classification Framework Performance for Quran Ontology

Y. Purwati, F. S. Utomo, Nikmah Trinarsih, Hanif Hidayatulloh
DOI: 10.30630/joiv.7.2.1195 (https://doi.org/10.30630/joiv.7.2.1195)
Journal: JOIV International Journal on Informatics Visualization
Published: 2023-07-01
Publication type: Journal Article
Citations: 0

Abstract

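The Quran is the sacred book of Muslims. It conveys God's word in the form of orders, instructions, and guidelines for people to follow so that they may live happily both in this life and in the afterlife. Several earlier studies have used ontologies to store the knowledge found in the Quran. The previous study focused on extracting the relationship between classes and instances, the "is-a relation", by classifying instances according to the class they refer to. Classifying instances by the thematic topics of the Quran connects verses (instances) to topics (classes), which gives users an overall picture of each topic and a better understanding of it.

Performance testing of that instances classification framework showed that the accuracy of the Support Vector Machine (SVM) with Term Frequency-Inverse Document Frequency (TF-IDF) and stemming dropped to 65.41% when the test data size was increased to 30%. The Backpropagation Neural Network (BPNN) with TF-IDF and stemming behaved similarly: on the Indonesian Quran translation dataset with a test data size of 30%, its accuracy dropped to 57.86%.

This study applies a feature selection technique to the instances classification framework for the Quran ontology and analyzes its impact when the dataset and the training data are small. The framework consists of four stages: text preprocessing, feature extraction, feature selection, and instances classification. Chi-Square is used for feature selection, and SVM and BPNN are used as classifiers.

The experimental results indicate that feature selection with Chi-Square increases precision, F-measure, and accuracy for test data sizes from 40% to 60% on all datasets. Chi-Square feature selection with the SVM classifier gives the highest precision, 64.36%, at a test data size of 60% on the Tafsir Quran dataset from the Ministry of Religious Affairs of the Republic of Indonesia. With the BPNN classifier, feature selection likewise gives the highest accuracy, 63.09%, at a test data size of 60% on the same Tafsir dataset.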
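The Chi-Square feature selection stage named in the abstract can be illustrated with a small sketch. The abstract does not say which formulation or library the authors used, so this is only an assumption: it uses the common 2x2 contingency form from text classification, chi2 = N(AD - BC)^2 / ((A+B)(C+D)(A+C)(B+D)), where A counts documents in the class that contain the term, B documents outside the class that contain it, C documents in the class without it, and D documents outside the class without it. The function names, the toy tokens, and the class labels below are hypothetical and are not taken from the paper's datasets.

```python
def chi_square(docs, labels, term, target):
    """Chi-square score between presence of `term` and membership in class `target`.

    `docs` is a list of token sets, `labels` the class of each document.
    This is the 2x2 presence/absence form (an assumption, see the text above).
    """
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        in_class = label == target
        if has_term and in_class:
            a += 1          # term present, document in class
        elif has_term:
            b += 1          # term present, document not in class
        elif in_class:
            c += 1          # term absent, document in class
        else:
            d += 1          # term absent, document not in class
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def select_top_k(docs, labels, k):
    """Rank terms by their best chi-square score over all classes and keep k."""
    vocab = sorted({t for tokens in docs for t in tokens})
    classes = sorted(set(labels))
    scores = {
        t: max(chi_square(docs, labels, t, cls) for cls in classes)
        for t in vocab
    }
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(vocab, key=lambda t: (-scores[t], t))
    return ranked[:k], scores


# Hypothetical toy corpus: four "verses" as token sets, two topic classes.
DOCS = [
    {"prayer", "night"},
    {"prayer", "mosque"},
    {"charity", "poor"},
    {"charity", "wealth"},
]
LABELS = ["worship", "worship", "alms", "alms"]

if __name__ == "__main__":
    selected, scores = select_top_k(DOCS, LABELS, k=2)
    print(selected)
    print(scores["prayer"], scores["night"])
```

In this toy corpus the terms that appear in every document of exactly one class ("prayer", "charity") score highest, while terms that appear in a single document score lower. In a full pipeline, only the selected terms would be kept as TF-IDF features before training the classifier.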