Enhancing Adaboost performance in the presence of class-label noise: A comparative study on EEG-based classification of schizophrenic patients and benchmark datasets
Authors: O. R. Pouya, Reza Boostani, M. Sabeti
Journal: Intelligent Data Analysis (impact factor 0.9; JCR Q4, Computer Science, Artificial Intelligence)
Volume: 32 1
Publication date: 2023-11-30
DOI: 10.3233/ida-227125 (https://doi.org/10.3233/ida-227125)
Citations: 0
Abstract
The performance of Adaboost is highly sensitive to noisy and outlier samples, because the weights of these samples increase exponentially over successive rounds. In this paper, three novel schemes are proposed to hunt for the corrupted samples and eliminate them during the training process. The methods are: I) a hybrid method based on K-means clustering and K-nearest neighbor, II) a two-layer Adaboost, and III) soft-margin support vector machines. All of these solutions are compared to the standard Adaboost on thirteen of Gunnar Raetsch’s datasets under three levels of class-label noise. To test the proposed methods on a real application, electroencephalography (EEG) signals of 20 schizophrenic patients and 20 age-matched control subjects were recorded via 20 channels in the idle state. Several features, including autoregressive coefficients, band power, and fractal dimension, are extracted from the EEG signals of all participants. A sequential feature subset selection technique is adopted to select the discriminative EEG features. Experimental results imply that exploiting the proposed hunting techniques enhances Adaboost performance and improves its robustness against unconfident and noisy samples on both the Raetsch benchmark and the EEG features of the two groups.
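The sensitivity to label noise described in the abstract can be seen in a small sketch of the standard discrete Adaboost weight update. This is not the authors' code or any of their three proposed schemes. The toy one-dimensional dataset, the decision-stump weak learner, and the names `stump_predict`, `best_stump`, and `adaboost_weight_trace` are assumptions made only for illustration.

```python
import numpy as np


def stump_predict(x, threshold, polarity):
    """Predict +1/-1 with a one-dimensional decision stump."""
    return polarity * np.where(x > threshold, 1, -1)


def best_stump(x, y, w):
    """Return (weighted_error, threshold, polarity) of the best stump."""
    xs = np.sort(x)
    # Candidate thresholds: one below all points, then midpoints between neighbors.
    thresholds = np.concatenate(([xs[0] - 1.0], (xs[:-1] + xs[1:]) / 2.0))
    best = (np.inf, 0.0, 1)
    for threshold in thresholds:
        for polarity in (1, -1):
            pred = stump_predict(x, threshold, polarity)
            error = w[pred != y].sum()
            if error < best[0]:
                best = (error, threshold, polarity)
    return best


def adaboost_weight_trace(x, y, rounds):
    """Run discrete Adaboost and return the sample weights after each round.

    Row 0 holds the initial uniform weights; row t holds the weights
    after t boosting rounds.
    """
    n = len(x)
    w = np.full(n, 1.0 / n)
    history = [w.copy()]
    for _ in range(rounds):
        error, threshold, polarity = best_stump(x, y, w)
        # Guard against log(0) or division by zero for a perfect or useless stump.
        error = np.clip(error, 1e-10, 1.0 - 1e-10)
        alpha = 0.5 * np.log((1.0 - error) / error)
        pred = stump_predict(x, threshold, polarity)
        # Misclassified samples (y * pred == -1) are scaled up by exp(alpha),
        # correctly classified ones are scaled down by exp(-alpha).
        w = w * np.exp(-alpha * y * pred)
        w = w / w.sum()
        history.append(w.copy())
    return np.array(history)


# Toy data (hypothetical): ten points, true boundary at x = 5,
# with the label of the point at x = 2 flipped to simulate class-label noise.
x = np.arange(10, dtype=float)
y = np.where(x >= 5, 1, -1)
y[2] = 1

trace = adaboost_weight_trace(x, y, rounds=3)
print(trace[:, 2])  # weight of the mislabeled sample across rounds
```

In this toy setting the first stump splits at the true boundary and misclassifies only the flipped sample, so after one round that single sample carries half of the total weight (up from 0.1). Later weak learners are then pushed to fit the corrupted label, which is the behavior that noise-filtering schemes such as those proposed in the paper aim to prevent.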
About the journal:
Intelligent Data Analysis provides a forum for the examination of issues related to the research and applications of Artificial Intelligence techniques in data analysis across a variety of disciplines. These techniques include (but are not limited to): all areas of data visualization, data pre-processing (fusion, editing, transformation, filtering, sampling), data engineering, database mining techniques, tools and applications, use of domain knowledge in data analysis, big data applications, evolutionary algorithms, machine learning, neural nets, fuzzy logic, statistical pattern recognition, knowledge filtering, and post-processing. In particular, papers are preferred that discuss development of new AI related data analysis architectures, methodologies, and techniques and their applications to various domains.