{"title":"在类标签噪声情况下提高 Adaboost 性能:基于脑电图的精神分裂症患者分类与基准数据集比较研究","authors":"O. R. Pouya, Reza Boostani, M. Sabeti","doi":"10.3233/ida-227125","DOIUrl":null,"url":null,"abstract":"The performance of Adaboost is highly sensitive to noisy and outlier samples. This is therefore the weights of these samples are exponentially increased in successive rounds. In this paper, three novel schemes are proposed to hunt the corrupted samples and eliminate them through the training process. The methods are: I) a hybrid method based on K-means clustering and K-nearest neighbor, II) a two-layer Adaboost, and III) soft margin support vector machines. All of these solutions are compared to the standard Adaboost on thirteen Gunnar Raetsch’s datasets under three levels of class-label noise. To test the proposed method on a real application, electroencephalography (EEG) signals of 20 schizophrenic patients and 20 age-matched control subjects, are recorded via 20 channels in the idle state. Several features including autoregressive coefficients, band power and fractal dimension are extracted from EEG signals of all participants. Sequential feature subset selection technique is adopted to select the discriminative EEG features. Experimental results imply that exploiting the proposed hunting techniques enhance the Adaboost performance as well as alleviating its robustness against unconfident and noisy samples over Raetsch benchmark and EEG features of the two groups.","PeriodicalId":50355,"journal":{"name":"Intelligent Data Analysis","volume":"32 1","pages":""},"PeriodicalIF":0.9000,"publicationDate":"2023-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Enhancing Adaboost performance in the presence of class-label noise: A comparative study on EEG-based classification of schizophrenic patients and benchmark datasets\",\"authors\":\"O. R. Pouya, Reza Boostani, M. Sabeti\",\"doi\":\"10.3233/ida-227125\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The performance of Adaboost is highly sensitive to noisy and outlier samples. This is therefore the weights of these samples are exponentially increased in successive rounds. In this paper, three novel schemes are proposed to hunt the corrupted samples and eliminate them through the training process. The methods are: I) a hybrid method based on K-means clustering and K-nearest neighbor, II) a two-layer Adaboost, and III) soft margin support vector machines. All of these solutions are compared to the standard Adaboost on thirteen Gunnar Raetsch’s datasets under three levels of class-label noise. To test the proposed method on a real application, electroencephalography (EEG) signals of 20 schizophrenic patients and 20 age-matched control subjects, are recorded via 20 channels in the idle state. Several features including autoregressive coefficients, band power and fractal dimension are extracted from EEG signals of all participants. Sequential feature subset selection technique is adopted to select the discriminative EEG features. Experimental results imply that exploiting the proposed hunting techniques enhance the Adaboost performance as well as alleviating its robustness against unconfident and noisy samples over Raetsch benchmark and EEG features of the two groups.\",\"PeriodicalId\":50355,\"journal\":{\"name\":\"Intelligent Data Analysis\",\"volume\":\"32 1\",\"pages\":\"\"},\"PeriodicalIF\":0.9000,\"publicationDate\":\"2023-11-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Intelligent Data Analysis\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.3233/ida-227125\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q4\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Intelligent Data Analysis","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.3233/ida-227125","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
摘要
Adaboost 的性能对噪声样本和离群样本非常敏感。因此,这些样本的权重会在连续几轮中呈指数增长。本文提出了三种新方案,通过训练过程猎取并消除损坏的样本。这些方法是I) 基于 K 均值聚类和 K 近邻的混合方法;II) 两层 Adaboost;III) 软边际支持向量机。所有这些解决方案都在 Gunnar Raetsch 的 13 个数据集上与标准 Adaboost 方法进行了比较,这些数据集具有三种等级的类标签噪声。为了在实际应用中测试所提出的方法,在空闲状态下通过 20 个通道记录了 20 名精神分裂症患者和 20 名年龄匹配的对照受试者的脑电图(EEG)信号。从所有参与者的脑电信号中提取了一些特征,包括自回归系数、频带功率和分形维度。采用序列特征子集选择技术来选择具有区分性的脑电图特征。实验结果表明,与 Raetsch 基准和两组脑电图特征相比,利用所提出的狩猎技术提高了 Adaboost 的性能,并减轻了其对无把握样本和噪声样本的鲁棒性。
Enhancing Adaboost performance in the presence of class-label noise: A comparative study on EEG-based classification of schizophrenic patients and benchmark datasets
The performance of Adaboost is highly sensitive to noisy and outlier samples. This is therefore the weights of these samples are exponentially increased in successive rounds. In this paper, three novel schemes are proposed to hunt the corrupted samples and eliminate them through the training process. The methods are: I) a hybrid method based on K-means clustering and K-nearest neighbor, II) a two-layer Adaboost, and III) soft margin support vector machines. All of these solutions are compared to the standard Adaboost on thirteen Gunnar Raetsch’s datasets under three levels of class-label noise. To test the proposed method on a real application, electroencephalography (EEG) signals of 20 schizophrenic patients and 20 age-matched control subjects, are recorded via 20 channels in the idle state. Several features including autoregressive coefficients, band power and fractal dimension are extracted from EEG signals of all participants. Sequential feature subset selection technique is adopted to select the discriminative EEG features. Experimental results imply that exploiting the proposed hunting techniques enhance the Adaboost performance as well as alleviating its robustness against unconfident and noisy samples over Raetsch benchmark and EEG features of the two groups.
期刊介绍:
Intelligent Data Analysis provides a forum for the examination of issues related to the research and applications of Artificial Intelligence techniques in data analysis across a variety of disciplines. These techniques include (but are not limited to): all areas of data visualization, data pre-processing (fusion, editing, transformation, filtering, sampling), data engineering, database mining techniques, tools and applications, use of domain knowledge in data analysis, big data applications, evolutionary algorithms, machine learning, neural nets, fuzzy logic, statistical pattern recognition, knowledge filtering, and post-processing. In particular, papers are preferred that discuss development of new AI related data analysis architectures, methodologies, and techniques and their applications to various domains.