Understanding Episode Hardness in Few-Shot Learning

Yurong Guo, Ruoyi Du, Aneeshan Sain, Kongming Liang, Yuan Dong, Yi-Zhe Song, Zhanyu Ma
{"title":"Understanding Episode Hardness in Few-Shot Learning","authors":"Yurong Guo;Ruoyi Du;Aneeshan Sain;Kongming Liang;Yuan Dong;Yi-Zhe Song;Zhanyu Ma","doi":"10.1109/TPAMI.2024.3476075","DOIUrl":null,"url":null,"abstract":"Achieving generalization for deep learning models has usually suffered from the bottleneck of annotated sample scarcity. As a common way of tackling this issue, few-shot learning focuses on “episodes”, i.e., sampled tasks that help the model acquire generalizable knowledge onto unseen categories – better the episodes, the higher a model's generalisability. Despite extensive research, the characteristics of episodes and their potential effects are relatively less explored. A recent paper discussed that different episodes exhibit different prediction difficulties, and coined a new metric “hardness” to quantify episodes, which however is too wide-range for an arbitrary dataset and thus remains impractical for realistic applications. In this paper therefore, we for the first time conduct an algebraic analysis of the critical factors influencing episode hardness supported by experimental demonstrations, that reveal episode hardness to largely depend on classes within an episode, and importantly propose an efficient pre-sampling hardness assessment technique named Inverse-Fisher Discriminant Ratio (IFDR). This enables sampling hard episodes at the class level via class-level (CL) sampling scheme that drastically decreases quantification cost. Delving deeper, we also develop a variant called class-pair-level (CPL) sampling, which further reduces the sampling cost while guaranteeing the sampled distribution. Finally, comprehensive experiments conducted on benchmark datasets verify the efficacy of our proposed method.","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":"47 1","pages":"616-633"},"PeriodicalIF":0.0000,"publicationDate":"2024-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on pattern analysis and machine intelligence","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10707331/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Achieving generalization for deep learning models usually suffers from the bottleneck of annotated sample scarcity. As a common way of tackling this issue, few-shot learning focuses on “episodes”, i.e., sampled tasks that help the model acquire knowledge that generalizes to unseen categories – the better the episodes, the higher a model's generalizability. Despite extensive research, the characteristics of episodes and their potential effects remain relatively unexplored. A recent paper observed that different episodes exhibit different prediction difficulties and coined a new metric, “hardness”, to quantify episodes; its range, however, varies too widely across datasets, leaving it impractical for realistic applications. In this paper we therefore, for the first time, conduct an algebraic analysis of the critical factors influencing episode hardness, supported by experimental demonstrations, which reveals that episode hardness largely depends on the classes within an episode. Building on this, we propose an efficient pre-sampling hardness assessment technique named the Inverse-Fisher Discriminant Ratio (IFDR). It enables sampling hard episodes at the class level via a class-level (CL) sampling scheme that drastically decreases quantification cost. Delving deeper, we also develop a variant called class-pair-level (CPL) sampling, which further reduces the sampling cost while preserving the sampled distribution. Finally, comprehensive experiments on benchmark datasets verify the efficacy of the proposed method.
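To make the abstract's method concrete, the sketch below illustrates one way a pre-sampling, class-level hardness assessment could look. The paper's exact IFDR definition and its CL/CPL sampling procedures are not reproduced here; this is a minimal sketch assuming that the IFDR of a class pair is the reciprocal of a classic two-class Fisher discriminant ratio computed on pre-extracted features, and that an episode's classes are drawn with probability proportional to their aggregate pairwise IFDR. All function names are illustrative, not taken from the paper.

```python
import numpy as np
from itertools import combinations

def fisher_discriminant_ratio(feat_a, feat_b, eps=1e-8):
    """Classic two-class Fisher discriminant ratio on feature matrices of
    shape (n_samples, dim): squared mean gap over summed within-class
    variance, averaged across dimensions. Higher = easier to separate."""
    mu_a, mu_b = feat_a.mean(axis=0), feat_b.mean(axis=0)
    var_a, var_b = feat_a.var(axis=0), feat_b.var(axis=0)
    return float(np.mean((mu_a - mu_b) ** 2 / (var_a + var_b + eps)))

def pairwise_ifdr(features_by_class):
    """Assumed IFDR proxy for every class pair: the inverse of the FDR,
    so larger values mark more confusable (harder) pairs."""
    return {
        (a, b): 1.0 / fisher_discriminant_ratio(features_by_class[a],
                                                features_by_class[b])
        for a, b in combinations(sorted(features_by_class), 2)
    }

def sample_hard_episode(features_by_class, n_way, rng=None):
    """Class-level (CL) sampling sketch: score each class by its summed
    pairwise IFDR, then draw n_way classes without replacement with
    probability proportional to that score."""
    rng = rng if rng is not None else np.random.default_rng()
    ifdr = pairwise_ifdr(features_by_class)
    classes = sorted(features_by_class)
    scores = np.array([sum(v for pair, v in ifdr.items() if c in pair)
                       for c in classes])
    probs = scores / scores.sum()
    idx = rng.choice(len(classes), size=n_way, replace=False, p=probs)
    return [classes[i] for i in idx]

# Hypothetical usage with synthetic 64-d features, 20 samples per class.
rng = np.random.default_rng(0)
feats = {c: rng.normal(loc=rng.normal(size=64), size=(20, 64))
         for c in ["cat", "dog", "car", "plane", "ship", "truck"]}
print(sample_hard_episode(feats, n_way=5, rng=rng))
```

Because the pairwise statistics are computed once, before any episode is drawn, hardness is assessed per class (or per class pair) rather than per candidate episode, which is where the quantification-cost savings described in the abstract would come from.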