{"title":"Disagreement-Based Co-training","authors":"J. Tanha, M. Someren, H. Afsarmanesh","doi":"10.1109/ICTAI.2011.126","DOIUrl":null,"url":null,"abstract":"Recently, Semi-Supervised learning algorithms such as co-training are used in many domains. In co-training, two classifiers based on different subsets of the features or on different learning algorithms are trained in parallel and unlabeled data that are classified differently by the classifiers but for which one classifier has large confidence are labeled and used as training data for the other. In this paper, a new form of co-training, called Ensemble-Co-Training, is proposed that uses an ensemble of different learning algorithms. Based on a theorem by Angluin and Laird that relates noise in the data to the error of hypotheses learned from these data, we propose a criterion for finding a subset of high-confidence predictions and error rate for a classifier in each iteration of the training process. Experiments show that the new method in almost all domains gives better results than the state-of-the-art methods.","PeriodicalId":332661,"journal":{"name":"2011 IEEE 23rd International Conference on Tools with Artificial Intelligence","volume":"287 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"24","resultStr":"{\"title\":\"Disagreement-Based Co-training\",\"authors\":\"J. Tanha, M. Someren, H. Afsarmanesh\",\"doi\":\"10.1109/ICTAI.2011.126\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Recently, Semi-Supervised learning algorithms such as co-training are used in many domains. In co-training, two classifiers based on different subsets of the features or on different learning algorithms are trained in parallel and unlabeled data that are classified differently by the classifiers but for which one classifier has large confidence are labeled and used as training data for the other. 
In this paper, a new form of co-training, called Ensemble-Co-Training, is proposed that uses an ensemble of different learning algorithms. Based on a theorem by Angluin and Laird that relates noise in the data to the error of hypotheses learned from these data, we propose a criterion for finding a subset of high-confidence predictions and error rate for a classifier in each iteration of the training process. Experiments show that the new method in almost all domains gives better results than the state-of-the-art methods.\",\"PeriodicalId\":332661,\"journal\":{\"name\":\"2011 IEEE 23rd International Conference on Tools with Artificial Intelligence\",\"volume\":\"287 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2011-11-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"24\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2011 IEEE 23rd International Conference on Tools with Artificial Intelligence\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICTAI.2011.126\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 IEEE 23rd International Conference on Tools with Artificial Intelligence","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICTAI.2011.126","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
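The Angluin–Laird result bounds the sample size m needed to learn from noisy labels by m >= c / (eps^2 * (1 - 2*eta)^2), where eta is the label-noise rate; holding the bound fixed, enlarging a noisily self-labeled training set only helps if the product of error rate and set size shrinks. The paper itself gives the exact criterion; the sketch below is only an illustrative simplification of that idea (the function name, arguments, and demo numbers are assumptions, not taken from the paper):

```python
def accept_new_labels(err_prev: float, n_prev: int,
                      err_curr: float, n_curr: int) -> bool:
    """Simplified acceptance test for newly self-labeled data, in the spirit
    of the Angluin-Laird bound used by disagreement-based co-training.

    err_prev / n_prev: estimated error rate and size of the training set
                       accepted in the previous iteration.
    err_curr / n_curr: the same quantities for the candidate set in the
                       current iteration.

    The noise rate must stay below 1/2 for learning to be possible at all,
    and the product (error rate * set size) must decrease, so that the
    larger-but-noisier set still tightens the bound.
    """
    return 0.0 <= err_curr < 0.5 and err_curr * n_curr < err_prev * n_prev

# e.g. accept_new_labels(0.2, 100, 0.1, 150) -> True  (15.0 < 20.0)
#      accept_new_labels(0.1, 100, 0.2, 100) -> False (20.0 >= 10.0)
```

In an iterative scheme, a classifier would only retrain on the enlarged self-labeled set when this test passes; otherwise it keeps its previous training set for that round.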