Authors: I. B. Suban, A. Emanuel
Published in: 2020 3rd International Seminar on Research of Information Technology and Intelligent Systems (ISRITI), 2020-12-10
DOI: 10.1109/ISRITI51436.2020.9315413 (https://doi.org/10.1109/ISRITI51436.2020.9315413)
Influence Distribution Training Data on Performance Supervised Machine Learning Algorithms
Banknotes are needed in almost every area of life, and some sectors, such as banks, transportation companies, and casinos, require them in large quantities. Banknotes are therefore an essential component of everyday activities, especially those related to finance. Technological advances such as scanners and copy machines give anyone the opportunity to commit crimes such as counterfeiting banknotes. Many people still find it difficult to distinguish a genuine banknote from a counterfeit one, because counterfeit banknotes closely resemble genuine ones. Against that background, the authors classify banknotes as genuine or counterfeit. The classification uses supervised learning methods and compares their accuracy across different proportions of training data. The supervised learning methods used are Support Vector Machine (SVM), K-Nearest Neighbor (K-NN), and Naïve Bayes. K-NN achieves the highest specificity, sensitivity, and accuracy of the three methods at all three training proportions: 30%, 50%, and 80%. With 30% and 50% training data, K-NN reaches specificity 0.99, sensitivity 1.00, and accuracy 0.99; with 80% training data, it reaches specificity 1.00, sensitivity 1.00, and accuracy 1.00. This indicates that the distribution of training data influences the performance of supervised machine learning algorithms. For K-NN, a larger share of training data gave better accuracy.
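The evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic Gaussian blobs rather than banknote measurements, the K-NN classifier is a plain standard-library version, and the choices of k = 3, two features, and "counterfeit" as the positive class are all assumptions made for illustration. The function names (`make_synthetic_data`, `knn_predict`, `evaluate`, and so on) are hypothetical. The sketch shows how sensitivity, specificity, and accuracy are derived from the confusion counts at training proportions of 30%, 50%, and 80%.

```python
import math
import random


def make_synthetic_data(n_per_class=100, seed=0):
    """Two Gaussian blobs standing in for genuine (0) and counterfeit (1) notes.

    Synthetic placeholder data; it does not reproduce the paper's dataset.
    """
    rng = random.Random(seed)
    data = []
    for label, center in ((0, (0.0, 0.0)), (1, (3.0, 3.0))):
        for _ in range(n_per_class):
            point = (rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0))
            data.append((point, label))
    rng.shuffle(data)
    return data


def knn_predict(train, point, k=3):
    """Majority vote among the k training points closest to `point`."""
    neighbors = sorted(train, key=lambda item: math.dist(item[0], point))[:k]
    votes = sum(label for _, label in neighbors)
    return 1 if votes * 2 > k else 0


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn), treating label 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def metrics(tp, tn, fp, fn):
    """Return (sensitivity, specificity, accuracy) from confusion counts."""
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return sensitivity, specificity, accuracy


def evaluate(data, train_fraction, k=3):
    """Train on the first `train_fraction` of the data, test on the rest."""
    split = int(len(data) * train_fraction)
    train, test = data[:split], data[split:]
    y_true = [label for _, label in test]
    y_pred = [knn_predict(train, point, k) for point, _ in test]
    return metrics(*confusion_counts(y_true, y_pred))


if __name__ == "__main__":
    data = make_synthetic_data()
    for fraction in (0.3, 0.5, 0.8):
        sens, spec, acc = evaluate(data, fraction)
        print(f"train={fraction:.0%} sensitivity={sens:.2f} "
              f"specificity={spec:.2f} accuracy={acc:.2f}")
```

Running the script prints one line of metrics per training proportion. On synthetic data the figures will not match those reported in the abstract; the point is only the mechanics of splitting the data and computing the three measures.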