Title: Data Labeling with Novel Decision Module of Tri-training
Authors: Chuan-Mu Tseng, Tzu-Wei Huang, Tzong-Jye Liu
Venue: 2020 2nd International Conference on Computer Communication and the Internet (ICCCI)
Published: 2020-06-01
DOI: 10.1109/ICCCI49374.2020.9145968 (https://doi.org/10.1109/ICCCI49374.2020.9145968)
Citation count: 2
Abstract
In machine learning, supervised learning methods for classifiers require sufficient labeled training data. However, manually labeling a large amount of training data is labor-intensive and expensive. Current research on data labeling falls into two types: co-training and tri-training. Co-training primarily uses a voting system based on two algorithms to obtain the labeled results, and tri-training was introduced to improve on co-training. When the two algorithms produce inconsistent results, co-training cannot label the data correctly. Tri-training therefore uses a third algorithm to help judge. Our proposed method, Novel Decision Module of Tri-training (NDMTT), takes the output of the tri-training architecture and filters the results with threshold conditions, which improves the validity of the labeling. When labeling validity improves, one of the classifications can be relatively increased.
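The decision step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the actual threshold conditions used by NDMTT, so everything here is an assumption made for illustration: the function name `decide_label`, the `(label, confidence)` input format, the rule that at least two of three classifiers must agree, and the use of the agreeing classifiers' mean confidence compared against a hypothetical `threshold` of 0.8. It is not the authors' implementation.

```python
from collections import Counter
from typing import Optional, Sequence, Tuple

# Each classifier reports (predicted label, confidence in [0, 1]).
Prediction = Tuple[str, float]


def decide_label(predictions: Sequence[Prediction],
                 threshold: float = 0.8) -> Optional[str]:
    """Majority vote followed by a confidence filter.

    Assumed rule (not taken from the paper): accept the majority label
    only if at least two classifiers agree on it AND the mean confidence
    of the agreeing classifiers is at least `threshold`. Otherwise return
    None, meaning the sample is left unlabeled.
    """
    votes = Counter(label for label, _ in predictions)
    label, count = votes.most_common(1)[0]
    if count < 2:
        # No two classifiers agree, so there is no majority to trust.
        return None

    agreeing = [conf for lab, conf in predictions if lab == label]
    mean_conf = sum(agreeing) / len(agreeing)
    return label if mean_conf >= threshold else None


if __name__ == "__main__":
    # Two confident classifiers agree; the third dissents.
    print(decide_label([("spam", 0.9), ("spam", 0.85), ("ham", 0.6)]))  # spam
    # Two classifiers agree, but with low confidence: filtered out.
    print(decide_label([("spam", 0.6), ("spam", 0.7), ("ham", 0.9)]))   # None
```

Returning `None` rather than forcing a label reflects the idea of filtering: a sample that fails the threshold is simply not added to the labeled set, which trades coverage for labeling validity.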