Nice to meet images with Big Clusters and Features: A cluster-weighted multi-modal co-clustering method

Information Processing & Management (IF 7.4; Zone 1, Management Science; JCR Q1, Computer Science, Information Systems). Pub Date: 2024-05-19. DOI: 10.1016/j.ipm.2024.103735
Chaoyang Zhang, Hang Xue, Kai Nie, Xihui Wu, Zhengzheng Lou, Shouyi Yang, Qinglei Zhou, Shizhe Hu
Citation count: 0

Nice to meet images with Big Clusters and Features: A cluster-weighted multi-modal co-clustering method

Multi-modal image clustering explores and exploits related information across the modalities of input images to obtain clear cluster patterns. Recent multi-modal/multi-view clustering methods have shown promising performance on the image clustering problem. However, most existing methods cannot properly handle multi-modal image data with a massive number of clusters and high-dimensional features, as found in real-world applications such as image retrieval, multi-modal autonomous driving perception, and industrial automation. We call this problem “Big Clusters and Features”, by analogy with “Big Data” for a large number of samples. To address it, this article designs a general multi-modal image clustering framework that integrates cluster-wise weight learning, feature learning, and clustering structure learning. Under this framework, we further propose a new Cluster-weighted Multi-modal Information Bottleneck Co-clustering (CMIBC) method, which measures the importance of each image cluster and the discriminative features of each modality to improve clustering performance. Unlike existing cluster weight learning methods, which consider only intra-cluster similarity or only cross-cluster dissimilarity, we design a novel cluster-wise weight learning strategy that considers both jointly. Experiments on various multi-modal image datasets with big clusters and features show that CMIBC has competitive advantages over the compared single-modal and multi-modal clustering methods, with notable improvements of 3.12% in accuracy and 5.28% in normalized mutual information on the Leaves Plant Species dataset. Given this performance, CMIBC may be extended to other practical applications, e.g., multi-modal medical analysis and video recognition.
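
The abstract reports gains in accuracy and normalized mutual information (NMI) without defining either score. The following is a minimal sketch of how the two are commonly computed for clustering results, not the authors' evaluation code. It assumes accuracy is taken under the best one-to-one matching between cluster ids and class ids, and that NMI is normalized by the arithmetic mean of the two entropies; the paper's exact conventions are not stated on this page. The function names are illustrative.

```python
import math
from collections import Counter
from itertools import permutations


def clustering_accuracy(true_labels, pred_labels):
    """Fraction of samples that agree with the ground truth under the best
    one-to-one matching between predicted cluster ids and true class ids.

    Brute-force search over matchings: only practical for a handful of
    clusters (assumption made to keep the sketch dependency-free).
    """
    classes = sorted(set(true_labels))
    clusters = sorted(set(pred_labels))
    overlap = Counter(zip(pred_labels, true_labels))  # (cluster, class) -> count
    if len(clusters) <= len(classes):
        best = max(
            sum(overlap[(c, k)] for c, k in zip(clusters, perm))
            for perm in permutations(classes, len(clusters))
        )
    else:
        best = max(
            sum(overlap[(c, k)] for c, k in zip(perm, classes))
            for perm in permutations(clusters, len(classes))
        )
    return best / len(true_labels)


def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def normalized_mutual_information(true_labels, pred_labels):
    """Mutual information divided by the mean of the two label entropies."""
    n = len(true_labels)
    joint = Counter(zip(true_labels, pred_labels))
    n_true = Counter(true_labels)
    n_pred = Counter(pred_labels)
    mi = sum(
        (c / n) * math.log(c * n / (n_true[t] * n_pred[p]))
        for (t, p), c in joint.items()
    )
    denom = (entropy(true_labels) + entropy(pred_labels)) / 2
    # Both partitions consist of a single group: treat them as identical.
    return mi / denom if denom > 0 else 1.0


truth = [0, 0, 1, 1, 2, 2]
found = [1, 1, 0, 0, 2, 2]  # same grouping, different cluster ids
print(clustering_accuracy(truth, found))
print(normalized_mutual_information(truth, found))
```

Both scores ignore how cluster ids are named, so a prediction that groups the samples exactly as the ground truth does scores 1 in accuracy and approximately 1 in NMI. With a large number of clusters, which is the setting the abstract targets, the permutation search above is infeasible; an assignment solver such as `scipy.optimize.linear_sum_assignment` is the usual replacement.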

Source journal: Information Processing & Management (Engineering & Technology; Computer Science: Information Systems)
CiteScore: 17.00
Self-citation rate: 11.60%
Articles published: 276
Review time: 39 days
Journal description: Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Our scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology marketing, and social computing. We aim to cater to the interests of both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. The journal places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research. Join us in advancing knowledge and innovation at the intersection of computing and information science.