Task-Augmented Cross-View Imputation Network for Partial Multi-View Incomplete Multi-Label Classification

Xiaohuan Lu, Lian Zhao, Wai Keung Wong, Jie Wen, Jiang Long, Wulin Xie
arXiv - CS - Computer Vision and Pattern Recognition, published 2024-09-12. DOI: arxiv-2409.07931 (https://doi.org/arxiv-2409.07931)
Citations: 0

Abstract

In real-world scenarios, multi-view multi-label learning often faces incomplete training data because of limitations in data collection and unreliable annotation. Missing view features prevent a comprehensive understanding of each sample and omit details that are essential for classification. To address this issue, we present a task-augmented cross-view imputation network (TACVI-Net) for partial multi-view incomplete multi-label classification. Specifically, we employ a two-stage network that derives highly task-relevant features and uses them to recover the missing views. In the first stage, we leverage information bottleneck theory to obtain a discriminative representation of each view by extracting task-relevant information through a view-specific encoder-classifier architecture. In the second stage, an autoencoder-based multi-view reconstruction network extracts high-level semantic representations of the augmented features and recovers the missing data, thereby aiding the final classification task. Extensive experiments on five datasets demonstrate that TACVI-Net outperforms other state-of-the-art methods.
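To make the problem setting concrete, the following is a minimal sketch of how missing views and missing labels are commonly represented with binary masks. It is not the TACVI-Net implementation. The function names `masked_bce` and `impute_missing_views` are hypothetical, the per-view features are assumed to share one dimension (as they might after stage-one encoding), and the mean-based fill-in is only a naive stand-in for the learned autoencoder-based reconstruction network described in the abstract.

```python
import math


def masked_bce(probs, labels, label_mask):
    """Binary cross-entropy averaged over observed labels only.

    label_mask[i] == 1 means label i was annotated; unobserved labels
    contribute nothing to the loss (an assumed, commonly used convention).
    """
    eps = 1e-12
    total, observed = 0.0, 0
    for p, y, m in zip(probs, labels, label_mask):
        if not m:
            continue
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
        observed += 1
    return total / observed if observed else 0.0


def impute_missing_views(features, view_mask):
    """Fill each missing view with the mean of the available views' features.

    A naive placeholder for cross-view imputation: the paper uses a learned
    reconstruction network instead. Missing views may be passed as None.
    """
    available = [f for f, m in zip(features, view_mask) if m]
    if not available:
        raise ValueError("at least one view must be observed")
    dim = len(available[0])
    mean = [sum(f[j] for f in available) / len(available) for j in range(dim)]
    return [list(f) if m else list(mean) for f, m in zip(features, view_mask)]


# One sample with three views; the second view is missing.
sample_features = [[1.0, 2.0], None, [3.0, 4.0]]
view_mask = [1, 0, 1]
completed = impute_missing_views(sample_features, view_mask)

# Two labels; only the first was annotated, so only it is scored.
loss = masked_bce([0.5, 0.9], [1, 0], [1, 0])
```

In this sketch the masks keep missing entries from influencing either the imputed features or the classification loss, which is the general idea behind training on partial views and incomplete labels.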