Feature Selection Technique to Improve the Instances Classification Framework Performance for Quran Ontology

JOIV International Journal on Informatics Visualization (Q3, Decision Sciences) | Pub Date: 2023-07-01 | DOI: 10.30630/joiv.7.2.1195

Y. Purwati, F. S. Utomo, Nikmah Trinarsih, Hanif Hidayatulloh


Citations: 0

Abstract


Feature Selection Technique to Improve the Instances Classification Framework Performance for Quran Ontology

The Al-Quran is the sacred book of Muslims; it conveys God's word in the form of orders, instructions, and guidelines for people to follow in order to have happy lives both here and in the afterlife. Several earlier studies have used ontologies to store the knowledge found in the Quran. The previous study focused on extracting the relationship between classes and instances, the "is-a relation", by classifying instances according to the class they reference. Performance testing of that instances classification framework showed that the accuracy of the Support Vector Machine (SVM) with Term Frequency-Inverse Document Frequency (TF-IDF) and stemming dropped to 65.41% when the test data size was increased to 30%. Likewise, for BPNN with TF-IDF and stemming, the accuracy on the Indonesian Quran translation dataset dropped to 57.86% at a test data size of 30%. Instances classification based on the thematic topics of the Qur'an aims to connect verses (instances) to topics (classes), giving users an overall picture of each topic and a better understanding of it. This study applies a feature selection technique to the instances classification framework for the Al-Quran ontology and analyzes its impact on the framework when the dataset and the training data are small. The framework consists of four stages: text preprocessing, feature extraction, feature selection, and instances classification. We applied Chi-Square for feature selection and used SVM and BPNN as classifiers. The experimental results show that feature selection with Chi-Square increases precision, f-measure, and accuracy for test data sizes from 40% to 60% on all datasets. Chi-Square feature selection with the SVM classifier gives the highest precision, 64.36%, at a test data size of 60% on the Quranic Tafsir dataset from the Ministry of Religious Affairs of the Republic of Indonesia. Chi-Square feature selection with the BPNN classifier gives the highest accuracy, 63.09%, at a test data size of 60% on the same ministry's Quranic Tafsir dataset.
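The feature selection stage named in the abstract can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything below is an assumption: it uses the classical 2x2 chi-square statistic on term presence per class, χ² = N(AD − BC)² / ((A + B)(C + D)(A + C)(B + D)), rather than whatever variant the authors ran on their TF-IDF features. The function names, the whitespace tokenizer, and the toy English documents and labels are all hypothetical and are not the paper's data.

```python
from collections import Counter


def chi_square_scores(docs, labels):
    """Score each term by its largest 2x2 chi-square statistic over all classes.

    Hypothetical helper: term presence/absence per document, whitespace tokens.
    """
    n = len(docs)
    doc_terms = [set(doc.lower().split()) for doc in docs]
    vocab = set().union(*doc_terms)
    class_sizes = Counter(labels)
    scores = {}
    for term in vocab:
        best = 0.0
        for cls, n_cls in class_sizes.items():
            # Contingency table: a = in class with term, b = out of class with term,
            # c = in class without term, d = out of class without term.
            a = sum(1 for t, y in zip(doc_terms, labels) if term in t and y == cls)
            b = sum(1 for t, y in zip(doc_terms, labels) if term in t and y != cls)
            c = n_cls - a
            d = n - n_cls - b
            denom = (a + b) * (c + d) * (a + c) * (b + d)
            if denom:  # a term present in every document carries no signal
                best = max(best, n * (a * d - b * c) ** 2 / denom)
        scores[term] = best
    return scores


def select_top_k(scores, k):
    """Keep the k highest-scoring terms; ties are broken alphabetically."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Toy, made-up documents and topic labels (not taken from the paper's datasets).
docs = [
    "prayer charity prayer the",
    "charity the fasting",
    "trade contract the",
    "contract debt the trade",
]
labels = ["worship", "worship", "commerce", "commerce"]

scores = chi_square_scores(docs, labels)
selected = select_top_k(scores, 3)
print(selected)
```

In this toy setting the word that occurs in every document scores zero and is dropped, while words confined to one topic rank highest. In a pipeline like the one the abstract describes, only the selected terms would be passed on to the classifier.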


Source journal