{"title":"Comprehensive Analysis of Few-shot Image Classification Method Using Triplet Loss","authors":"Mykola Baranov, Y. Shcherbyna","doi":"10.23939/sisn2022.11.103","DOIUrl":null,"url":null,"abstract":"Image classification task is a very important problem of a computer vision area. The first approaches to image classification tasks belong to a classic straightforward algorithm. Despite the successful applications of such algorithms a lot of image classification tasks had not been solved until machine learning approaches were involved in a computer vision area. An early successful result of machine learning applications helps researchers with extracted features classification which was not available without machine learning models. But handcrafter features were required which left the most complicated classification task impossible to solve. Recent success in deep learning allows researchers to implement automatic trainable feature extraction. This gave significant progress in the computer vision area last but not least. Processing large-scale datasets bring researchers great progress in automatic feature extraction thus combining such features with precious approaches led to groundbreaking in computer vision. But a new limitation has come - dependency on large amounts of data. Deep learning approaches to image classification task usually requires large-scale datasets. Moreover, modern models lead to unexpected behavior in distribution datasets. A few-shot learning approach of deep learning models allows us to dramatically reduce the amount of required data while keeping the same promising results. Despite reduced datasets, there is still a tradeoff between the amount of available data and trained model performance. In this paper, we implemented a siamese network based on triplet loss. Then, we investigate a relationship between the amount of available data and few-shot model performances. We compare the models obtained by metric-learning with baselines models trained using large-scale datasets.","PeriodicalId":444399,"journal":{"name":"Vìsnik Nacìonalʹnogo unìversitetu \"Lʹvìvsʹka polìtehnìka\". Serìâ Ìnformacìjnì sistemi ta merežì","volume":"35 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Vìsnik Nacìonalʹnogo unìversitetu \"Lʹvìvsʹka polìtehnìka\". Serìâ Ìnformacìjnì sistemi ta merežì","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.23939/sisn2022.11.103","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Image classification is one of the central problems in computer vision. The earliest approaches relied on classic, handcrafted algorithms. Despite successful applications of such algorithms, many image classification tasks remained unsolved until machine learning methods were introduced into computer vision. Early machine learning results enabled classification of extracted features, which was not feasible without trainable models; however, handcrafted features were still required, which left the most complex classification tasks out of reach. Recent advances in deep learning allow researchers to implement automatic, trainable feature extraction, which has driven significant progress in computer vision. Training on large-scale datasets brought major improvements in automatic feature extraction, and combining such features with previous approaches led to breakthroughs across the field. But a new limitation emerged: dependency on large amounts of data. Deep learning approaches to image classification usually require large-scale datasets; moreover, modern models can behave unexpectedly on data outside the training distribution. Few-shot learning allows us to dramatically reduce the amount of required data while keeping comparable results. Even with reduced datasets, there is still a tradeoff between the amount of available data and trained model performance. In this paper, we implement a siamese network based on triplet loss. We then investigate the relationship between the amount of available data and few-shot model performance, and compare the models obtained by metric learning with baseline models trained on large-scale datasets.
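To make the method concrete: a siamese network maps images into an embedding space, and the triplet loss L(a, p, n) = max(d(a, p) − d(a, n) + m, 0) pulls an anchor a towards a positive p of the same class while pushing it away from a negative n of a different class by at least a margin m. Below is a minimal sketch of this setup, not the authors' exact implementation; the backbone architecture, embedding size, margin, and input shapes are illustrative assumptions.

```python
# Sketch of a siamese embedding network trained with triplet loss.
# Architecture, embedding_dim, margin, and tensor shapes are assumptions
# for illustration, not the configuration used in the paper.
import torch
import torch.nn as nn

class EmbeddingNet(nn.Module):
    """Maps an input image to a fixed-size embedding vector."""
    def __init__(self, embedding_dim: int = 128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(64, embedding_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.features(x).flatten(1)
        # L2-normalise so distances are comparable across batches
        return nn.functional.normalize(self.fc(h), dim=1)

net = EmbeddingNet()
criterion = nn.TripletMarginLoss(margin=0.2)  # max(d(a,p) - d(a,n) + m, 0)
optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)

# One training step on a batch of (anchor, positive, negative) triplets;
# random tensors stand in for real image batches here.
anchor = torch.randn(16, 3, 64, 64)
positive = torch.randn(16, 3, 64, 64)
negative = torch.randn(16, 3, 64, 64)

loss = criterion(net(anchor), net(positive), net(negative))
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

At inference time, such a network classifies a query image by comparing its embedding against the embeddings of the few labelled support examples (e.g. nearest-neighbour in the embedding space), which is what allows the approach to work with far less data than a conventionally trained classifier.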