{"title":"MDSC-Net: Multi-Modal Discriminative Sparse Coding Driven RGB-D Classification Network","authors":"Jingyi Xu;Xin Deng;Yibing Fu;Mai Xu;Shengxi Li","doi":"10.1109/TMM.2024.3521720","DOIUrl":null,"url":null,"abstract":"In this paper, we propose a novel sparsity-driven deep neural network to solve the RGB-D image classification problem. Different from existing classification networks, our network architecture is designed by drawing inspirations from a new proposed multi-modal discriminative sparse coding (MDSC) model. The key feature of this model is that it can gradually separate the discriminative and non-discriminative features in RGB-D images in a coarse-to-fine manner. Only the discriminative features are integrated and refined for classification, while the non-discriminative features are discarded, to improve the classification accuracy and efficiency. Derived from the MDSC model, the proposed network is composed of three modules, i.e., the shared feature extraction (SFE) module, discriminative feature refinement (DFR) module, and classification module. The architecture of each module is derived from the optimization solution in the MDSC model. To the best of our knowledge, this is the first time a fully sparsity-driven network has been proposed for RGB-D image classification. Extensive results verify the effectiveness of our method on different RGB-D image datasets.","PeriodicalId":13273,"journal":{"name":"IEEE Transactions on Multimedia","volume":"27 ","pages":"442-454"},"PeriodicalIF":8.4000,"publicationDate":"2024-12-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Multimedia","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10812785/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0
Abstract
In this paper, we propose a novel sparsity-driven deep neural network to solve the RGB-D image classification problem. Different from existing classification networks, our network architecture is designed by drawing inspirations from a new proposed multi-modal discriminative sparse coding (MDSC) model. The key feature of this model is that it can gradually separate the discriminative and non-discriminative features in RGB-D images in a coarse-to-fine manner. Only the discriminative features are integrated and refined for classification, while the non-discriminative features are discarded, to improve the classification accuracy and efficiency. Derived from the MDSC model, the proposed network is composed of three modules, i.e., the shared feature extraction (SFE) module, discriminative feature refinement (DFR) module, and classification module. The architecture of each module is derived from the optimization solution in the MDSC model. To the best of our knowledge, this is the first time a fully sparsity-driven network has been proposed for RGB-D image classification. Extensive results verify the effectiveness of our method on different RGB-D image datasets.
期刊介绍:
The IEEE Transactions on Multimedia delves into diverse aspects of multimedia technology and applications, covering circuits, networking, signal processing, systems, software, and systems integration. The scope aligns with the Fields of Interest of the sponsors, ensuring a comprehensive exploration of research in multimedia.