Pub Date : 2022-06-16 DOI: 10.1109/CyberneticsCom55287.2022.9865489
Alifya Febriana, K. Muchtar, R. Dawood, Chih-Yang Lin
Coffee is a plantation commodity that plays a major role in the world economy. Each class of coffee has its own characteristic shapes and textures. Traditional visual sorting of coffee beans by humans is time-consuming and labor-intensive, and may result in low-quality coffee due to work stress and exhaustion. The contribution of this paper is twofold. First, a new dataset called USK-Coffee, which contains a total of 8,000 images divided into 4 classes, is created and made publicly available. To the best of our knowledge, the USK-Coffee dataset is currently the most comprehensive green coffee bean dataset. Second, this study aims to offer a lightweight and understandable intelligent system that uses deep learning (DL) to sort coffee beans accurately, assisting farmers in sorting green arabica beans by variety. Specifically, this paper presents a baseline for classification performance on the dataset using two benchmark deep learning models, MobileNetV2 and ResNet-18. These models achieved average classification accuracies of 81.31% and 81.12%, respectively. The dataset is available at: http://comvis.unsyiah.ac.id/usk-coffee/
Title : USK-COFFEE Dataset: A Multi-Class Green Arabica Coffee Bean Dataset for Deep Learning
Published in : 2022 IEEE International Conference on Cybernetics and Computational Intelligence (CyberneticsCom)
Pub Date : 2022-06-16 DOI: 10.1109/CyberneticsCom55287.2022.9865623
C. Dewi, E. Arisoesilaningsih, W. Mahmudy, Solimun
Susu banana fruit is unusual in that slightly ripe and ripe fruit at the ripening stage are very difficult to distinguish visually, because both have almost the same yellow color. Therefore, this study performed identification using fruit image-based computer vision in place of human visual inspection. Because susu bananas at the slightly ripe, ripe, and riper stages have nearly identical characteristics, feature selection was applied to find the dominant features with the highest influence. Information gain (IG), principal component analysis (PCA), and combined IG-PCA feature selection were evaluated to determine the influence of the correlation and probability of each feature on each class. Tests were conducted on clean-peeled and spotted-peel susu bananas with 3 levels of ripeness at the ripening stage to determine the impact of IG, PCA, and combined IG-PCA on classification using extreme learning machines. The test results showed that PCA performed better than IG on clean-peeled bananas with natural curing (group1) and on spotted-peel bananas with chemical curing (group3). In group1, PCA also outperformed combined IG-PCA, but in group3 combined IG-PCA was better than either IG or PCA alone. Although feature selection resulted in lower accuracy on spotted-peel bananas with natural curing (group2), overall the tests showed that selecting dominant features could increase recognition accuracy. The proposed method could also be used as an alternative way to determine the ripeness of susu bananas.
Title : Performance of Information Gain and PCA Feature Selection for Determining Ripen Susu Banana Fruits
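The information gain criterion mentioned in the abstract can be illustrated with a minimal sketch. It assumes discretised features and uses made-up toy data; the feature names `hue_bin` and `spot_bin` are hypothetical and are not taken from the paper, whose actual features and discretisation are not described here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Label entropy minus the weighted label entropy after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(group) / total * entropy(group) for group in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data: six fruit images, three ripeness classes.
ripeness = ["slightly_ripe", "slightly_ripe", "ripe", "ripe", "riper", "riper"]
hue_bin = ["low", "low", "mid", "mid", "high", "high"]    # separates the classes perfectly
spot_bin = ["few", "many", "few", "many", "few", "many"]  # carries no class information

ig_hue = information_gain(hue_bin, ripeness)
ig_spot = information_gain(spot_bin, ripeness)
print(f"IG(hue_bin)={ig_hue:.3f}  IG(spot_bin)={ig_spot:.3f}")
```

A selector built on this idea would rank features by their gain and keep the top ones; how many to keep is a tuning choice that this sketch does not address.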
Pub Date : 2022-06-16 DOI: 10.1109/CyberneticsCom55287.2022.9865411
Rafli A. Nugraha, H. Nuha
Each person has characteristics that distinguish them from others, such as the shape of the face, fingerprints, corneas, and the sound of footsteps. These differences can be used in security systems, an approach known as biometrics. This research tests the accuracy of two classification methods for a footstep recognition system that can detect more than one person, using Mel Frequency Cepstral Coefficients (MFCC) for feature extraction and an Artificial Neural Network (ANN) and a Recurrent Neural Network (RNN) as footstep classifiers. The authors built one footstep recognition system with the ANN classifier and a second with the RNN classifier. For the first system, using the ANN, the accuracy is 93.59, val_accuracy is 88.74, and the loss value is 44.18. For the second system, the RNN obtained an accuracy of 96.66, val_accuracy of 87, and a loss value of 0.84. The RNN's accuracy is thus 3.07 higher than the ANN's (96.66 versus 93.59). In this study, therefore, the footstep recognition system using the RNN classifier achieves a higher accuracy than the one using the ANN classifier.
Title : Footstep Recognition Using Feedforward Neural Network
Pub Date : 2022-06-16 DOI: 10.1109/CyberneticsCom55287.2022.9865390
Ridlho Khoirul Fachri, M. Z. Romdlony, M. R. Rosa
We implemented the Control Lyapunov-Barrier Function (CLBF) method on Autonomous Mobile Robot (AMR) hardware using the inverse kinematics of four mecanum wheels. The CLBF method is used to obtain both stability and safety. The system is considered stable when the AMR reaches the specified equilibrium point, and safe when the AMR avoids the existing unsafe state. Waypoint navigation provides several equilibrium points so that the robot can move to the desired coordinates. In this paper, we do not use a local sensor such as an encoder; instead we use a global sensor, namely a camera, to read the coordinates of the AMR position. A microcontroller receives the $x$ and $y$ position coordinates from BLOB detection. The test was carried out three times, each time passing through three waypoints and one predetermined unsafe state. The implementation success rate was 76.47%, obtained by comparing the path generated by the Matlab simulation with the path of the real AMR plant.
Title : Multiple Waypoint Navigation for Mobile Robot Using Control Lyapunov-Barrier Function (CLBF)
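For readers unfamiliar with the inverse kinematics of a four-mecanum-wheel base, the following sketch shows the commonly used textbook form. The axis convention (x forward, y to the left, counter-clockwise rotation positive), the 45-degree roller layout, and all dimensions are assumptions made for illustration; the paper's own convention and robot geometry are not given in the abstract.

```python
def mecanum_inverse_kinematics(vx, vy, wz, wheel_radius=0.05, half_length=0.10, half_width=0.10):
    """Map a body velocity command to four wheel angular velocities (rad/s).

    vx: forward speed (m/s), vy: leftward speed (m/s), wz: yaw rate (rad/s).
    The geometry defaults are hypothetical values, not the paper's robot.
    Returns (front_left, front_right, rear_left, rear_right).
    """
    k = half_length + half_width  # lever arm that couples yaw rate into wheel speed
    front_left = (vx - vy - k * wz) / wheel_radius
    front_right = (vx + vy + k * wz) / wheel_radius
    rear_left = (vx + vy - k * wz) / wheel_radius
    rear_right = (vx - vy + k * wz) / wheel_radius
    return front_left, front_right, rear_left, rear_right


# Pure forward motion drives all wheels equally; pure sideways motion
# drives diagonal wheel pairs in opposite directions.
forward = mecanum_inverse_kinematics(0.5, 0.0, 0.0)
strafe_left = mecanum_inverse_kinematics(0.0, 0.5, 0.0)
print(forward)
print(strafe_left)
```

In a setup like the one described, a controller such as the CLBF law would supply the velocity command and this mapping would turn it into wheel speeds; that coupling is an interpretation of the abstract, not a detail it states.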
Pub Date : 2022-06-16 DOI: 10.1109/CyberneticsCom55287.2022.9865291
Chiva Olivia Bilah, T. B. Adji, N. A. Setiawan
NLP (Natural Language Processing) has become a focus of research in recent years, and NLP tasks have been implemented in various sectors and fields. A chatbot system is one such application; it communicates with humans using natural language. Many researchers build models to represent the chatbot. To make a chatbot more powerful, the intent of the conversation, that is, the specific user intention that a set of sentences represents when interacting with the chatbot, must be classified. This classification makes the chatbot system more focused, which leads to appropriate answers. Humans can easily understand that different sentences carry the same intent. However, a chatbot system requires a complex technique to do so. Therefore, our work uses a CNN (Convolutional Neural Network) for intent detection in Indonesian-language text using the ATIS (Airline Travel Information System) dataset. The CNN was selected because it can extract important features from input data, which makes it more efficient than other deep learning algorithms in terms of memory and complexity. In our work, we also used GloVe (Global Vectors) embeddings to generate an optimal intent classification model. The results show that the combination of GloVe and CNN produces the best accuracy, 95.84%.
Title : Intent Detection on Indonesian Text Using Convolutional Neural Network
Pub Date : 2022-06-16 DOI: 10.1109/CyberneticsCom55287.2022.9865634
Kien Trang, Hoang An Nguyen, Long TonThat, Hung Ngoc Do, B. Vuong
Millions of confirmed cancer cases have been reported worldwide as a result of the development of skin disease. Early diagnosis and treatment is one of the most essential steps in preventing disease development. Nevertheless, because lesions are similar in appearance, location, color, and size, diagnosing them is challenging and requires highly skilled medical staff. To address this problem, machine-based skin disease diagnosis is introduced as a first step to aid in patient classification. Deep learning in medical imaging has recently become a cutting-edge research trend in a variety of applications. In this research, an ensemble network built from the pre-trained models ResNet50, MobileNetV3, and EfficientNet is proposed to classify skin diseases. A majority voting step combines the advantages of the distinct models to improve classification. The observations and results are based on experiments performed with the HAM10000 dataset, which includes 7 different forms of skin disease. The proposed model achieves a 98.3% average accuracy, and it improves on the individual pre-trained models on the other assessment metrics as well.
Title : An Ensemble Voting Method of Pre-Trained Deep Learning Models for Skin Disease Identification
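The voting step can be sketched as hard majority voting over the per-model class predictions. The predictions below are invented for illustration, and the tie-breaking rule (fall back to the first model's label) is an assumption; the abstract does not say how the authors resolve ties or whether they vote on labels or on probabilities.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most common label; on a tie, prefer the earliest model's label."""
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:  # iterate in model order so ties are deterministic
        if counts[label] == best:
            return label


# Hypothetical predictions for four images, one list per model.
per_model = {
    "ResNet50": ["nv", "mel", "bkl", "bcc"],
    "MobileNetV3": ["nv", "nv", "bkl", "df"],
    "EfficientNet": ["mel", "mel", "akiec", "vasc"],
}

# zip(*...) regroups the lists so that each tuple holds one image's three votes.
ensemble = [majority_vote(votes) for votes in zip(*per_model.values())]
print(ensemble)
```

For the last image all three models disagree, so the result depends entirely on the tie-breaking rule; a probability-averaging (soft voting) scheme would be one alternative design.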
Pub Date : 2022-06-16 DOI: 10.1109/CyberneticsCom55287.2022.9865666
A. Ma’arif, Iswanto Suwarno, Wahyu Rahmaniar, H. Maghfiroh, Nia Maharani Raharja, Aninditya Anggari Nuryono
This study discusses control of the liquid level of a tank system using Proportional Integral Derivative (PID) control and Full State Feedback (FSB) control. Tank systems are widely used in industrial processes and require a controller so that the liquid level follows the desired reference. The PID controller parameters were determined using Matlab's PID tuning feature, while the FSB parameters were determined by trial and error. Simulation results in Matlab Simulink showed that both the PID and FSB controllers could control the liquid level of the tank system and reach the reference value. However, the system's response with FSB control was better than with PID control, with a faster settling time and smaller overshoot.
Title : Liquid Tank Level Control with Proportional Integral Derivative (PID) and Full State Feedback (FSB)
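As a rough illustration of PID level control, the sketch below simulates a single tank with a linear outflow using forward-Euler integration. The plant model, its parameters, and the gains are all hand-picked assumptions for the example; they are not the model or the Matlab-tuned values used in the paper, and the FSB controller is not shown.

```python
def simulate_pid_tank(kp, ki, kd, setpoint=1.0, duration=30.0, dt=0.01,
                      area=1.0, resistance=2.0):
    """Simulate tank level under PID control and return the level history.

    Assumed plant: area * d(level)/dt = inflow - level / resistance.
    """
    level = 0.0
    previous_level = level
    integral = 0.0
    history = []
    for _ in range(round(duration / dt)):
        error = setpoint - level
        integral += error * dt
        # Derivative acts on the measurement to avoid a spike at the setpoint step.
        derivative = -(level - previous_level) / dt
        # The inflow valve cannot remove liquid, so clamp the command at zero.
        inflow = max(0.0, kp * error + ki * integral + kd * derivative)
        previous_level = level
        level += dt * (inflow - level / resistance) / area
        history.append(level)
    return history


levels = simulate_pid_tank(kp=4.0, ki=2.0, kd=0.2)
print(f"final level: {levels[-1]:.4f}, peak level: {max(levels):.4f}")
```

Settling time and overshoot, the two criteria the abstract uses to compare controllers, can be read off the returned history.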
Pub Date : 2022-06-16 DOI: 10.1109/CyberneticsCom55287.2022.9865463
M. Riyadi, Oltfaz Rabakhir Rane, Afandi Amir, T. Prakoso, I. Setiawan
Brain-computer interface (BCI) technology is commonly used to describe brain signal activity non-invasively. EEG devices, which record brain activity, continue to be developed in terms of accuracy, suitability, computation, and cost. However, complexity grows as the number of electrodes increases. Some studies show that minimizing the number of electrodes can reduce the computational time, cost, and shape of the EEG device without compromising accuracy. Electrode usage can be reduced by choosing the particular electrodes that are most closely related to the activity. This study proposes a correlation coefficient method to determine the best electrode pairs; electrodes with similar features are eliminated. The experimental results showed that average accuracy and F1 score increased by 2% and 4%, respectively, compared to using all electrodes, while computation time decreased, with an average reduction in debugging time of 35%.
Title : Method of Electroencephalography Electrode Selection for Motor Imagery Application
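One plausible reading of "electrodes with similar features are eliminated" is a greedy filter that drops any channel whose Pearson correlation with an already kept channel is too high. The sketch below follows that reading; the threshold, the greedy order, and the synthetic signals are assumptions, and the paper's actual selection procedure may differ.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(var_x * var_y)


def drop_redundant_channels(signals, threshold=0.95):
    """Keep a channel only if it is not strongly correlated with any kept channel."""
    kept = []
    for name, samples in signals.items():
        if all(abs(pearson(samples, signals[other])) < threshold for other in kept):
            kept.append(name)
    return kept


# Synthetic example: Cz is a scaled and offset copy of C3, so it adds nothing new.
signals = {
    "C3": [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0],
    "C4": [1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0, 0.0],
    "Cz": [0.1, 2.1, 0.1, -1.9, 0.1, 2.1, 0.1, -1.9],
}

kept_channels = drop_redundant_channels(signals)
print(kept_channels)
```

Because the filter is greedy, the result depends on channel order; ranking channels by task relevance before filtering would be one way to make the choice less arbitrary.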
Pub Date : 2022-06-16 DOI: 10.1109/CyberneticsCom55287.2022.9865339
S. Hasan, Wang Ruiqin, Md Gulzar Hussain
Document clustering is the grouping of documents into classes or clusters according to their textual content. The primary objective is to form groups of documents that are internally coherent but substantially different from each other. It is a vital method in information retrieval, information extraction, and the organization of records. Around 210 million people worldwide speak Bangla as a first or second language. Over time, these computer-assisted approaches have also been applied to the Bangla language. However, few papers have presented the current state of research in Bangla document clustering. The aim of this work is to test the K-Means and Mini-Batch K-Means clustering algorithms and to analyse their performance on Bangla news text data using the silhouette score and homogeneity score. The findings show that with TF-IDF, the K-Means and Mini-Batch K-Means algorithms give silhouette scores of 0.031 and 0.015 and homogeneity scores of 0.33 and 0.27, respectively, for 11 clusters, which is better than the results with CountVectorizer.
Title : Clustering Analysis of Bangla News Articles with TF-IDF & CV Using Mini-Batch K-Means and K-Means
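To help interpret the reported silhouette scores, here is a small from-scratch sketch of the metric on toy 2-D points. It assumes Euclidean distance and scores singleton clusters as 0; the points and labels are invented and have nothing to do with the Bangla news data. The silhouette score ranges from -1 to 1, and values near 0, such as the 0.031 and 0.015 reported above, generally indicate overlapping clusters.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient, using Euclidean distance between points."""
    scores = []
    for i, (point, label) in enumerate(zip(points, labels)):
        distances = {}
        for j, (other_point, other_label) in enumerate(zip(points, labels)):
            if i != j:
                distances.setdefault(other_label, []).append(math.dist(point, other_point))
        own = distances.pop(label, [])
        if not own or not distances:
            scores.append(0.0)  # singleton cluster or only one cluster overall
            continue
        a = sum(own) / len(own)                                 # mean intra-cluster distance
        b = min(sum(d) / len(d) for d in distances.values())    # nearest other cluster
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
good = silhouette_score(points, [0, 0, 1, 1])  # labels match the two visible groups
bad = silhouette_score(points, [0, 1, 0, 1])   # labels cut across the groups
print(f"good={good:.3f}  bad={bad:.3f}")
```

In practice a library implementation would be used on the TF-IDF vectors; this version only shows what the number measures.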
Pub Date : 2022-06-16 DOI: 10.1109/CyberneticsCom55287.2022.9865282
Lay Acheadeth, N. N. Qomariyah, Misa M. Xirinda
E-commerce is growing at a breakneck pace. As a result, online shopping has increased, and with it the number of online product reviews. Amazon products often have thousands of reviews, and a close look reveals that some of them are completely unrelated to the product. In this study, we investigated how product review classification can help resolve the issue of comments posted on the wrong items. The method consists of 4 steps: data acquisition, data pre-processing, topic modeling, and text classification. Latent Dirichlet Allocation (LDA) was used as the topic modeling technique, and Support Vector Machine (SVM), Logistic Regression, and Multi-Layer Perceptron (MLP) classifiers were used for text classification. We found that combining topic modeling and text classification yields a powerful tool for handling this kind of problem. Adding topic modeling improved the model's accuracy from 0.61 to 0.78. We therefore conclude that topic modeling was useful in classifying the product reviews.
Title : Utilizing Topic Modelling in Customer Product Review for Classifying Baby Product