Nice to meet images with Big Clusters and Features: A cluster-weighted multi-modal co-clustering method

IF 6.9 | Zone 1, Management | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Information Processing & Management | Pub Date: 2024-09-01 | Epub Date: 2024-05-19 | DOI: 10.1016/j.ipm.2024.103735

Chaoyang Zhang, Hang Xue, Kai Nie, Xihui Wu, Zhengzheng Lou, Shouyi Yang, Qinglei Zhou, Shizhe Hu

Information Processing & Management, vol. 61, no. 5, Article 103735. https://www.sciencedirect.com/science/article/pii/S0306457324000955

Citations: 0

Abstract

Multi-modal image clustering explores and exploits related information across the modalities of input images to obtain clear image cluster patterns. Recent multi-modal/multi-view clustering methods have shown promising performance on the image clustering problem. However, most existing methods cannot properly handle multi-modal image data with a massive number of clusters and high-dimensional features, as found in real-world applications such as image retrieval, multi-modal autonomous driving perception, and industrial automation. We call this problem “Big Clusters and Features” for short, by analogy with “Big Data” for a large number of samples. To address this challenging problem, we design a general multi-modal image clustering framework that integrates cluster-wise weight learning, feature learning, and clustering structure learning. Under this framework, we further propose a new Cluster-weighted Multi-modal Information Bottleneck Co-clustering (CMIBC) method, which effectively measures the importance of image clusters and the discriminative features of each modality to obtain satisfactory image clustering performance. Unlike existing cluster weight learning methods, which consider only intra-cluster similarity or only cross-cluster dissimilarity, we design a novel cluster-wise weight learning strategy that considers both jointly. Extensive experiments on various multi-modal image datasets with big clusters and features show the competitive advantages of the CMIBC algorithm over many single-modal and multi-modal clustering methods, notably improvements of 3.12% and 5.28% on the Leaves Plant Species dataset in terms of accuracy and normalized mutual information, respectively. Owing to this promising performance, the proposed CMIBC may be extended to many other practical applications, e.g., multi-modal medical analysis and video recognition.
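
The abstract reports results in terms of normalized mutual information (NMI). As a reading aid, the following is a minimal sketch of how NMI between ground-truth classes and predicted clusters can be computed. It is not the authors' evaluation code: the function names are hypothetical, and the normalization by the arithmetic mean of the two entropies is an assumption, since the abstract does not say which NMI variant the paper uses.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (in nats) of a sequence of labels."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(true_labels, pred_labels):
    """Mutual information (in nats) between two labelings of the same items."""
    n = len(true_labels)
    joint = Counter(zip(true_labels, pred_labels))
    true_counts = Counter(true_labels)
    pred_counts = Counter(pred_labels)
    mi = 0.0
    for (t, p), c in joint.items():
        # p(t, p) * log( p(t, p) / (p(t) * p(p)) ), written with raw counts
        mi += (c / n) * log(c * n / (true_counts[t] * pred_counts[p]))
    return mi


def nmi(true_labels, pred_labels):
    """NMI normalized by the arithmetic mean of the entropies (assumed variant)."""
    if len(true_labels) != len(pred_labels):
        raise ValueError("both labelings must cover the same items")
    h_true = entropy(true_labels)
    h_pred = entropy(pred_labels)
    if h_true == 0.0 and h_pred == 0.0:
        return 1.0  # both labelings put every item in a single cluster
    return 2.0 * mutual_information(true_labels, pred_labels) / (h_true + h_pred)


if __name__ == "__main__":
    # Toy labels, made up for illustration only.
    classes = [0, 0, 1, 1, 2, 2]
    clusters = ["a", "a", "b", "b", "b", "c"]
    print(round(nmi(classes, clusters), 4))
```

Because NMI depends only on how items are grouped, a clustering that matches the ground truth up to a renaming of cluster IDs scores 1 (up to floating-point rounding), while a clustering that is statistically independent of the classes scores 0.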

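Source journal

Information Processing & Management (Engineering & Technology: Computer Science, Information Systems)

CiteScore: 17.00

Self-citation rate: 11.60%

Articles published: 276

Review time: 39 days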

Journal description: Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Our scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology marketing, and social computing. We aim to cater to the interests of both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. The journal places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research. Join us in advancing knowledge and innovation at the intersection of computing and information science.