Distance-based loss function for deep feature space learning of convolutional neural networks

IF 4.3 · Region 3, Computer Science · Q2, Computer Science, Artificial Intelligence · Computer Vision and Image Understanding · Pub Date: 2024-09-28 · DOI: 10.1016/j.cviu.2024.104184
Eduardo S. Ribeiro, Lourenço R.G. Araújo, Gabriel T.L. Chaves, Antônio P. Braga
Computer Vision and Image Understanding, Volume 249, Article 104184. Not open access. Full text: https://www.sciencedirect.com/science/article/pii/S1077314224002650
Citations: 0

Abstract

Convolutional Neural Networks (CNNs) have been at the forefront of neural network research in recent years. Their breakthrough performance in fields such as image classification has spurred the development of new CNN-based architectures, but recently more attention has been directed to the study of new loss functions. Softmax loss remains the most popular loss function, mainly because of its efficiency in class separation, but it is unsatisfactory in terms of intra-class compactness. While some studies have addressed this problem, most solutions attempt to refine softmax loss or combine it with other approaches. We present a novel, softmax-independent loss function based on distance matrices (LDMAT) that maximizes inter-class distance and minimizes intra-class distance. The loss function operates directly on deep features, so the learned features can be used with arbitrary classifiers. LDMAT minimizes the distance between two distance matrices, one constructed from the model’s deep features and the other calculated from the labels. The use of a distance matrix in the loss function allows a two-dimensional representation of features and imposes a fixed distance between classes, while improving intra-class compactness. A regularization method applied to the distance matrix of labels is also presented; it allows a degree of relaxation of the solution and leads to a better spreading of features in the separation space. Efficient feature extraction was observed on datasets such as MNIST, CIFAR10 and CIFAR100.
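The core idea in the abstract, comparing a distance matrix built from deep features with one built from labels, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not give the formulas, so the Euclidean feature distance, the 0/`margin` form of the label matrix, the mean-squared comparison, and all function names below are assumptions made for illustration. The regularization of the label matrix mentioned in the abstract is not modeled.

```python
import numpy as np


def pairwise_distances(features):
    """Euclidean distance between every pair of rows of `features`, shape (n, d)."""
    diff = features[:, None, :] - features[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def label_distance_matrix(labels, margin=1.0):
    """Assumed target: 0 for same-class pairs, `margin` for different-class pairs."""
    labels = np.asarray(labels)
    return margin * (labels[:, None] != labels[None, :]).astype(float)


def distance_matrix_loss(features, labels, margin=1.0):
    """Mean squared difference between the feature and label distance matrices.

    Hypothetical stand-in for the 'distance between two distance matrices'
    described in the abstract.
    """
    d_feat = pairwise_distances(np.asarray(features, dtype=float))
    d_label = label_distance_matrix(labels, margin)
    return float(np.mean((d_feat - d_label) ** 2))


# Toy batch: two classes, two samples each, in a 2-D feature space.
labels = [0, 0, 1, 1]
# Each class collapsed to a point, classes exactly `margin` apart.
compact = np.array([[0.0, 0.0], [0.0, 0.0], [1.0, 0.0], [1.0, 0.0]])
# Same classes, but scattered within and between classes.
spread = np.array([[0.0, 0.0], [0.5, 0.0], [1.0, 0.0], [3.0, 0.0]])

loss_compact = distance_matrix_loss(compact, labels)
loss_spread = distance_matrix_loss(spread, labels)
```

Under these assumptions the loss is zero only when same-class features coincide and different-class features sit exactly `margin` apart, which mirrors the abstract's description of a fixed inter-class distance combined with intra-class compactness. In the toy batch, `loss_compact` is zero and `loss_spread` is strictly positive.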
Source journal: Computer Vision and Image Understanding (Engineering & Technology: Engineering, Electrical & Electronic)
CiteScore: 7.80
Self-citation rate: 4.40%
Articles published: 112
Review time: 79 days
Journal description: The central focus of this journal is the computer analysis of pictorial information. Computer Vision and Image Understanding publishes papers covering all aspects of image analysis, from the low-level, iconic processes of early vision to the high-level, symbolic processes of recognition and interpretation. A wide range of topics in the image understanding area is covered, including papers offering insights that differ from predominant views. Research areas include: theory; early vision; data structures and representations; shape; range; motion; matching and recognition; architecture and languages; vision systems.