Generalized zero-shot learning via discriminative and transferable disentangled representations.

IF 6 | Zone 1 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Neural Networks | Pub Date: 2025-03-01 | Epub Date: 2024-11-30 | DOI: 10.1016/j.neunet.2024.106964
Chunyu Zhang, Zhanshan Li
Neural Networks, volume 183, article 106964. Journal article. Citation count: 0

Abstract

Generalized zero-shot learning (GZSL) requires a model to recognize samples from both seen and unseen classes when only seen classes are available during training. Recent methods use disentanglement to make the information in visual features semantically related; ensuring the semantic consistency and independence of the disentangled representations is key to better performance. However, we think some limitations remain. First, because only seen classes are available during training, recognition of unseen samples is poor. Second, the distribution relations of the representation space and the semantic space differ, and ignoring this discrepancy may hurt the model's generalization. In addition, instances are associated with one another, and modeling their interactions can yield more discriminative information. Third, synthesized visual features may not match their corresponding semantic descriptions well, which compromises the learning of semantic consistency. To overcome these challenges, we propose learning discriminative and transferable disentangled representations (DTDR) for GZSL. First, we use estimated class similarities to supervise the relations between seen semantic-matched representations and unseen semantic descriptions, thereby gaining better insight into the unseen domain. Second, we use cosine similarities between semantic descriptions to constrain the similarities between semantic-matched representations, so that the distribution relation of the semantic-matched representation space approximates that of the semantic space; this process also takes instance-level correlation into account. Third, we reconstruct the synthesized visual features into their corresponding semantic descriptions to better establish the associations between them. Experimental results on four datasets verify the effectiveness of our method.
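
The second step, constraining similarities between semantic-matched representations with cosine similarities between semantic descriptions, can be illustrated with a small sketch. The abstract does not give the loss, so everything here is an assumption made for illustration: the function names (`cosine`, `similarity_matrix`, `relation_alignment_loss`) are hypothetical, and the mean squared difference between the two similarity matrices is only one plausible way to express such a constraint, not necessarily the one used in DTDR.

```python
import math


def cosine(u, v, eps=1e-12):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # eps guards against division by zero for all-zero vectors.
    return dot / max(norm_u * norm_v, eps)


def similarity_matrix(rows):
    """Pairwise cosine similarities between all rows."""
    return [[cosine(u, v) for v in rows] for u in rows]


def relation_alignment_loss(representations, semantics):
    """Mean squared gap between the two pairwise-similarity matrices.

    Hypothetical loss (an assumption, not taken from the paper):
    row i of `representations` is the semantic-matched representation of
    instance i, and row i of `semantics` is the semantic description of
    that instance's class. The two spaces may have different dimensions.
    """
    if len(representations) != len(semantics):
        raise ValueError("need one semantic description per representation")
    rep_sim = similarity_matrix(representations)
    sem_sim = similarity_matrix(semantics)
    n = len(semantics)
    total = sum(
        (rep_sim[i][j] - sem_sim[i][j]) ** 2
        for i in range(n)
        for j in range(n)
    )
    return total / (n * n)


# Two instances whose class descriptions are orthogonal.
semantics = [[1.0, 0.0], [0.0, 1.0]]

# Representations that keep the same pairwise relations: zero loss.
aligned = [[3.0, 0.0, 0.0], [0.0, 3.0, 0.0]]
print(relation_alignment_loss(aligned, semantics))    # 0.0

# Representations that collapse both instances together: penalized.
collapsed = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
print(relation_alignment_loss(collapsed, semantics))  # 0.5
```

Because the comparison is made between every pair of instances in a batch rather than between each instance and its own class description, a term of this shape also reflects instance-level correlation in the sense the abstract mentions.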

Source journal
Neural Networks
Category: Engineering and Technology, Computer Science: Artificial Intelligence
CiteScore: 13.90
Self-citation rate: 7.70%
Articles published: 425
Review time: 67 days
Journal description: Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically inspired artificial intelligence.