Instance Selection Techniques for Multiple Instance Classification
Efstathios Branikas, Thomas Papastergiou, E. Zacharaki, V. Megalooikonomou
2019 10th International Conference on Information, Intelligence, Systems and Applications (IISA), July 2019
DOI: 10.1109/IISA.2019.8900679
Citations: 2
Abstract
As the amount of data increases, fully supervised learning methods relying on dense annotations often become impractical and are substituted by weakly supervised methods that exploit data of variable size and semantic content. In such schemes the volume of irrelevant information may be critically high, negatively impacting modeling performance and considerably increasing memory and computational cost. Data reduction or selection is necessary to mitigate these effects. In this paper we propose and compare three different instance selection techniques for the Multiple Instance Learning (MIL) paradigm. The techniques are assessed on the problem of image classification using features from standard benchmark MIL datasets, as well as recently proposed features based on tensor decomposition. As implementation paradigm we exploit the widely accepted JC2MIL algorithm, which performs joint clustering and classification. Two of the proposed instance selection techniques are based on Shannon entropy in image and feature space respectively, while the third is based on a clustering evaluation metric, the silhouette score, which is introduced internally in the iterative joint clustering and classification algorithm. Enriching the MIL framework with the instance selection step was shown to outperform the original algorithm, providing state-of-the-art results in the vast majority of the performed experiments.
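The following is a minimal sketch of the two kinds of instance-selection ideas named in the abstract: keeping the most informative instances of a bag by their Shannon entropy computed in feature space, and discarding poorly clustered instances by their silhouette score. The function names, the retention ratio, the entropy binning, and the use of scikit-learn's KMeans are illustrative assumptions; this is not the paper's JC2MIL implementation, in which the silhouette-based step is applied inside the iterative joint clustering and classification loop.

```python
# Illustrative sketch only: hypothetical helpers for entropy- and
# silhouette-based instance selection, not the authors' JC2MIL code.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_samples


def entropy_of_instance(x, n_bins=16):
    """Shannon entropy of one instance's feature vector (histogram estimate)."""
    hist, _ = np.histogram(x, bins=n_bins)
    p = hist / max(hist.sum(), 1)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))


def select_by_entropy(bag, keep_ratio=0.5):
    """Keep the fraction of a bag's instances with the highest entropy."""
    scores = np.array([entropy_of_instance(x) for x in bag])
    k = max(1, int(np.ceil(keep_ratio * len(bag))))
    idx = np.argsort(scores)[::-1][:k]
    return bag[idx]


def select_by_silhouette(instances, n_clusters=4, threshold=0.0):
    """Cluster instances and drop those with a silhouette score below threshold."""
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(instances)
    sil = silhouette_samples(instances, labels)
    return instances[sil >= threshold]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    bag = rng.normal(size=(20, 8))             # one bag: 20 instances, 8 features
    print(select_by_entropy(bag).shape)        # reduced bag
    pooled = rng.normal(size=(200, 8))         # instances pooled across bags
    print(select_by_silhouette(pooled).shape)  # instances surviving the filter
```

In this toy setup, entropy-based selection acts per bag before learning, while the silhouette-based filter operates on the pooled instance set after a clustering step, mirroring at a high level where the abstract says each criterion is applied.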