Pub Date: 2018-03-01 DOI: 10.1109/ICOIACT.2018.8350755
Allantutra Guslawa, Endroyono, S. M. S. Nugroho
Previous research on financial statement audits has mostly used single-label classification, for tasks such as opinion prediction, opinion identification, and opinion detection. We propose multi-label classification to predict the “opinion and exceptions” using data from financial statement audit reports in Central Kalimantan province. We use financial ratios as attributes and the opinion and exceptions as labels. We use three problem transformation methods, namely Binary Relevance (BR), Classifier Chains (CC), and Random k-labelsets (RAkEL), each combined with three base classifiers: J48, SMO, and Random Forest. The best results are 0.19 for Hamming Loss, 0.253 for One-Error, 0.16 for Rank Loss, and 0.793 for Average Precision.
Title: Problem transformation methods for prediction of opinion and exceptions in financial statements audit reports: Case for financial statements audit in central Kalimantan province. In: 2018 International Conference on Information and Communications Technology (ICOIACT), pp. 747-752.
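Two of the metrics reported in the abstract above, Hamming Loss and One-Error, can be illustrated with a minimal sketch. The label matrices and scores below are hypothetical toy data, not the paper's audit data, and the function names are chosen here for illustration only.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of label slots predicted incorrectly, over all instances."""
    total = sum(len(row) for row in y_true)
    wrong = sum(t != p
                for true_row, pred_row in zip(y_true, y_pred)
                for t, p in zip(true_row, pred_row))
    return wrong / total


def one_error(y_true, scores):
    """Fraction of instances whose top-scored label is not a true label."""
    misses = 0
    for true_row, score_row in zip(y_true, scores):
        top = max(range(len(score_row)), key=score_row.__getitem__)
        if true_row[top] == 0:
            misses += 1
    return misses / len(y_true)


# Hypothetical toy data: 4 reports, 3 labels each (1 = label applies).
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 1, 1]]
scores = [[0.9, 0.2, 0.4], [0.1, 0.8, 0.3], [0.3, 0.2, 0.6], [0.2, 0.6, 0.7]]

print(hamming_loss(y_true, y_pred))  # 3 wrong slots out of 12
print(one_error(y_true, scores))     # 1 of 4 top-ranked labels is wrong
```

Lower is better for both metrics, which is why the abstract reports its smallest values as the best results.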
Pub Date: 2018-03-01 DOI: 10.1109/ICOIACT.2018.8350701
Herlina, S. Wibirama, I. Ardiyanto
Gaze-based interaction in digital technologies is a rapidly growing research area. Eye tracking provides an alternative input modality for controlling interactive content on computers. Today, eye tracking is expected to serve not only as a personal assistive technology but also as a controller for interactive content on public displays. Instead of fixational eye movement, smooth pursuit eye movement has been used for object selection in gaze-based interactive applications. However, previous work did not consider different similarity measures for spontaneous object selection, so there is no information on how the choice of similarity measure affects object selection performance. To fill this gap, we compared two similarity measures for object selection: Euclidean distance and Pearson's product moment coefficient. We presented simple interactive applications containing four dynamic objects, each of which was presented subsequently or simultaneously. The participants were asked to select the objects by gazing at and following the trajectory of the moving objects. Our results show that object selection with Euclidean distance achieved superior accuracy (78.65%) compared with object selection with Pearson's product moment coefficient (57.38%). In the future, our results may be used as a guideline for developing spontaneous gaze-based interactive applications.
Title: Similarity measures of object selection in interactive applications based on smooth pursuit eye movements. In: 2018 International Conference on Information and Communications Technology (ICOIACT), pp. 639-644.
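The two similarity measures compared in the abstract above can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: the trajectories are hypothetical, the distance is averaged over samples, and the x and y correlations are combined by a simple mean, which the abstract does not specify.

```python
import math


def mean_euclidean(gaze, target):
    """Mean point-wise Euclidean distance between two 2-D trajectories."""
    return sum(math.dist(g, t) for g, t in zip(gaze, target)) / len(gaze)


def pearson(xs, ys):
    """Pearson's product moment coefficient; 0.0 if either series is constant."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx * sy else 0.0


def trajectory_correlation(gaze, target):
    """Assumed combination: mean of the x-axis and y-axis correlations."""
    gx, gy = zip(*gaze)
    tx, ty = zip(*target)
    return (pearson(gx, tx) + pearson(gy, ty)) / 2


def select_by_distance(gaze, objects):
    """Pick the object whose trajectory is closest to the gaze."""
    return min(objects, key=lambda name: mean_euclidean(gaze, objects[name]))


def select_by_correlation(gaze, objects):
    """Pick the object whose trajectory correlates best with the gaze."""
    return max(objects, key=lambda name: trajectory_correlation(gaze, objects[name]))


# Hypothetical trajectories: object A moves up-right, object B moves down-right.
objects = {
    "A": [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)],
    "B": [(0, 4), (1, 3), (2, 2), (3, 1), (4, 0)],
}
gaze = [(0.2, 0.1), (1.1, 0.9), (2.0, 2.2), (3.1, 2.9), (3.9, 4.1)]

print(select_by_distance(gaze, objects))
print(select_by_correlation(gaze, objects))
```

Distance depends on where the gaze is relative to the object, while correlation depends only on whether the two move together, so the measures can disagree when several objects follow similar paths.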
Pub Date: 2018-03-01 DOI: 10.1109/ICOIACT.2018.8350816
Ratna Sulistyowati, Suhartono, H. Kuswanto, Setiawan, Erni Tri Astuti
Forecasts of air passenger and cargo volumes have a major influence on airport infrastructure master plans and on investment by civil airlines. This research aims to obtain the most accurate forecasts of air passengers and cargo at three international airports in Indonesia: Soekarno Hatta, I Gusti Ngurah Rai, and Juanda. These airports are the three largest contributors to air passenger numbers and cargo volumes in Indonesia. The research uses a hybrid forecasting method that combines linear and nonlinear models, a combination that can yield accurate predictions. The first phase is linear modeling with a time series regression model (TSR) and Autoregressive Integrated Moving Average with Exogenous Factor (ARIMAX). In the second phase, the error of the linear model is analyzed with machine learning methods, namely Neural Network (NN) and Support Vector Regression (SVR), to capture nonlinear patterns. Four hybrid models are applied and compared based on the Mean Absolute Percentage Error (MAPE): TSR-NN, TSR-SVR, ARIMAX-NN, and ARIMAX-SVR. The results show that hybrid ARIMAX-NN and TSR-NN give more accurate predictions than hybrid TSR-SVR and ARIMAX-SVR.
Title: Hybrid forecasting model to predict air passenger and cargo in Indonesia. In: 2018 International Conference on Information and Communications Technology (ICOIACT), pp. 442-447.
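The two-phase idea in the abstract above, and the MAPE used to compare the models, can be sketched as follows. The additive combination (linear forecast plus predicted residual) is an assumption about how the two phases are joined, and all numbers are hypothetical rather than taken from the paper.

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent (actual values must be nonzero)."""
    return 100 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def hybrid_forecast(linear_forecast, predicted_residual):
    """Assumed additive hybrid: linear forecast plus the nonlinear model's residual estimate."""
    return [lin + res for lin, res in zip(linear_forecast, predicted_residual)]


# Hypothetical monthly passenger counts (thousands).
actual = [100, 200, 400]
linear = [90, 220, 380]        # stand-in for phase 1 (e.g. a TSR or ARIMAX output)
residual_hat = [8, -15, 10]    # stand-in for phase 2 (e.g. an NN or SVR fitted on errors)

hybrid = hybrid_forecast(linear, residual_hat)
print(round(mape(actual, linear), 3))
print(round(mape(actual, hybrid), 3))
```

If the second-phase model captures part of the linear model's error, the hybrid MAPE drops below the linear MAPE, which is the comparison the abstract describes across its four hybrid variants.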
Pub Date: 2018-03-01 DOI: 10.1109/ICOIACT.2018.8350723
Made Agus Putra Subali, R. Sarno, Yutika Amelia Effendi
Industry often faces the problem of optimizing several goals at once, such as maximizing sales, maximizing total production, and optimizing production costs. Multi-objective linear programming can be applied effectively in production planning because it can address many different aspects of the planning problem. In this research, we apply the fuzzy goal programming method to optimize time and cost in port container handling. Experiments with three variants, each covering a different case, produced results that match those desired by the decision maker. Of the three variants, the second gives the best results.
Title: Time and cost optimization using fuzzy goal programming. In: 2018 International Conference on Information and Communications Technology (ICOIACT), pp. 471-476.
Pub Date: 2018-03-01 DOI: 10.1109/ICOIACT.2018.8350735
Rahardyan Bisma Setya Putra, Ema Utami
The Nazief & Andriani stemming algorithm has been developed further in terms of speed and accuracy. One such development is Flexible Affix Classification, which improves the accuracy of confix stripping for reduplicated words. As it has evolved, the Indonesian language has come to be used in two ways: formal and non-formal. Non-formal language is common in casual settings such as conversations and social media posts (Facebook, Twitter, Instagram, etc.). To obtain the root of a word from a casual conversation or a social media post, a stemming algorithm that can process non-formal affixed words is required. Stemming of non-formal words is useful in information retrieval tasks such as sentiment analysis of Twitter posts. This study therefore modifies Flexible Affix Classification so that it can stem non-formal words, by adding a non-formal affix rule. The results show that the modified algorithm achieves 73.3% accuracy, while the original Flexible Affix Classification algorithm achieves 35% accuracy, on 60 non-formal affixed words.
Title: Non-formal affixed word stemming in Indonesian language. In: 2018 International Conference on Information and Communications Technology (ICOIACT), pp. 531-536.
Pub Date: 2018-03-01 DOI: 10.1109/ICOIACT.2018.8350698
Syamsul, R. Syahputra, Suherman
This research applies control to the distillation process of nutmeg oil. The control system is based on fuzzy logic, with two main parameters: temperature and vapor pressure. The temperature is set in the range 90–120 °C and the steam pressure in the range 1–2.5 atmospheres. The optimal value from the fuzzy logic simulation is embedded in the microcontroller to regulate the position of the gas flow valve. The experiments were carried out on a distillation system using the water and steam distillation boiler method. The boiler is fueled by gas. The capacity of the distillation system is 25 kg of dry nutmeg. Tests were run with and without fuzzy logic control. Over 16 hours of testing, the distillation system with fuzzy logic based control optimized gas fuel energy by 20.3%.
Title: Control system based on fuzzy logic in nutmeg oil distillation process for energy optimization. In: 2018 International Conference on Information and Communications Technology (ICOIACT), pp. 679-683.
Pub Date: 2018-03-01 DOI: 10.1109/ICOIACT.2018.8350782
Syahrul Mustafa, A. Arief, M. B. Nappu
A good electric power system must have good power quality, including small power losses and bus voltages that stay within the tolerance limits. The allowable voltage is between 0.95 and 1.05 per unit. In this research, the power quality of the power system at Bosowa Cement Industry, Maros, is improved using capacitor banks. The research focuses on determining the optimum capacitor location and size with a Genetic Algorithm (GA) in order to raise low voltages and minimize power losses. It presents a model for simultaneously allocating capacitor banks for reactive power compensation in a power system with a significant amount of dynamic rotating machine load. The research runs several power flow simulations before and after installation, optimizes capacitor placement with different bus candidates to determine the location, number, and capacity of the capacitors, and then performs an economic analysis. Selecting the buses with voltage drops as candidates requires only 7,400 kVar from 73 capacitor bank units, each rated at 100 kVar and installed on several buses, to improve power quality at Bosowa Cement Industry, Maros. Installing the capacitors reduces power losses in the system from 901 kW to 801 kW.
Title: Optimal capacitor placement and economic analysis for reactive power compensation to improve system's efficiency at Bosowa Cement Industry, Maros. In: 2018 International Conference on Information and Communications Technology (ICOIACT), pp. 778-783.
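The two figures of merit in the abstract above, the 0.95 to 1.05 per-unit voltage band and the drop in losses from 901 kW to 801 kW, can be checked with a short sketch. The bus names and voltages are hypothetical, since the abstract gives no per-bus data; only the loss figures and the voltage limits come from the text.

```python
def loss_reduction_percent(before_kw, after_kw):
    """Relative reduction in power losses, in percent of the original losses."""
    return 100 * (before_kw - after_kw) / before_kw


def buses_out_of_limits(voltages_pu, low=0.95, high=1.05):
    """Names of buses whose per-unit voltage falls outside the allowed band."""
    return [bus for bus, v in voltages_pu.items() if not low <= v <= high]


# Loss figures from the abstract: 901 kW before, 801 kW after installation.
print(round(loss_reduction_percent(901, 801), 1))  # about 11.1 percent

# Hypothetical bus voltages in per unit, for illustration only.
voltages = {"bus_1": 1.00, "bus_2": 0.93, "bus_3": 1.04}
print(buses_out_of_limits(voltages))
```

A check like `buses_out_of_limits` is one plausible way to pick candidate buses for capacitor placement, in the spirit of the "voltage dropped bus" selection the abstract mentions.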
Pub Date: 2018-03-01 DOI: 10.1109/ICOIACT.2018.8350719
Shabrina Choirunnisa, R. Sarno, A. Fauzan
A company may have several goals to achieve at once (multi-objective), such as maximizing total sales, production, and production capacity, while also taking time into account. The method used in this research is goal programming, applied to the imported goods activities and messages recorded at container terminals. With goal programming, the calculation can determine the number of imported goods documents issued by customs while minimizing processing time and operating costs. The container event log is first analyzed and forecast using Excel Solver. The goal programming model is then entered into LINGO for optimization.
Title: Optimization of forecasted port container terminal performance using goal programming. In: 2018 International Conference on Information and Communications Technology (ICOIACT), pp. 332-336.
Pub Date: 2018-03-01 DOI: 10.1109/ICOIACT.2018.8350824
Izhan Fakhruzi
The class imbalance problem occurs quite often in real-life data, particularly in clinical datasets, where the two classes of a binary classification task are not equally represented. This has a negative effect on the performance of neural networks: it can lead the algorithm to overfit the data and yield poor accuracy. Bagging is a popular ensemble method that can address the class imbalance problem. Furthermore, bagging performs well with unstable classifiers such as neural networks. The experimental results show that the proposed method, bagging neural networks, successfully addresses the class imbalance problem in clinical diagnosis prediction.
Title: An artificial neural network with bagging to address imbalance datasets on clinical prediction. In: 2018 International Conference on Information and Communications Technology (ICOIACT), pp. 895-898.
Pub Date: 2018-03-01 DOI: 10.1109/ICOIACT.2018.8350792
Yoga Pristyanto, Irfan Pratama, A. F. Nugraha
In Educational Data Mining (EDM), researchers usually overlook the balance of the class distribution in a dataset, which can seriously affect classification results. In theory, most classifiers assume that the data distribution is relatively balanced. On imbalanced data the classification algorithm therefore becomes less effective, and the imbalance needs to be handled. This study describes a mechanism for handling class imbalance in a multiclass EDM dataset using a combination of SMOTE and OSS. SMOTE and OSS balance the dataset's distribution so that classification performance improves. The results show that combining SMOTE and OSS enhances the performance of SVM, the classifier used in this study. The combination yields accuracy, sensitivity, specificity, and g-mean scores of 88.637%, 92.292%, 95.554%, and 93.796%, respectively. Hence, the SMOTE and OSS combination can be a viable solution for class imbalance in multiclass EDM datasets.
Title: Data level approach for imbalanced class handling on educational data mining multiclass classification. In: 2018 International Conference on Information and Communications Technology (ICOIACT), pp. 310-314.
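The oversampling half of the approach in the abstract above can be illustrated with a minimal sketch of the SMOTE interpolation step: a synthetic minority sample is placed at a random position on the line between a minority point and one of its nearest minority neighbours. The data points, the function name, and the choice of k are hypothetical, and the sketch does not cover OSS (the undersampling half) or the paper's actual pipeline.

```python
import math
import random


def smote_sample(x, minority, k, rng):
    """Create one synthetic point between x and one of its k nearest minority neighbours."""
    # Rank the other minority points by Euclidean distance to x.
    neighbours = sorted((p for p in minority if p is not x),
                        key=lambda p: math.dist(x, p))[:k]
    neighbour = rng.choice(neighbours)
    gap = rng.random()  # position along the segment, in [0, 1)
    return tuple(xi + gap * (ni - xi) for xi, ni in zip(x, neighbour))


# Hypothetical minority-class points in a 2-D feature space.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
rng = random.Random(0)  # seeded so that runs are repeatable

synthetic = [smote_sample(minority[0], minority, k=2, rng=rng) for _ in range(3)]
print(synthetic)
```

Because each synthetic point lies between two existing minority samples, the minority class grows without exact duplication of existing records.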