LSPC: Exploring contrastive clustering based on local semantic information and prototype

Information Systems, volume 121, Article 102336. Pub Date: 2023-12-13. DOI: 10.1016/j.is.2023.102336
Impact factor: 3.0. Region 2 (Computer Science). JCR Q2 (COMPUTER SCIENCE, INFORMATION SYSTEMS)
Jun-Fen Chen, Lang Sun, Bo-Jun Xie
Full text: https://www.sciencedirect.com/science/article/pii/S0306437923001722
Citations: 0

Abstract


In recent years, several prominent contrastive learning algorithms, a family of self-supervised learning methods, have been extensively studied; they efficiently extract useful feature representations from input images by means of data augmentation. How to further partition these representations into meaningful clusters is the problem that deep clustering addresses. In this work, we propose a deep clustering algorithm based on local semantic information and prototypes, referred to as LSPC, which aims to learn a group of representative prototypes. Rather than learning the characteristics that distinguish individual images from one another, LSPC pays more attention to the essential characteristics shared by images that may belong to the same latent category. In the training framework, contrastive learning is combined with the k-means clustering algorithm, and the predictions are transformed into soft assignments so that the model can be trained end to end. To enable the model to capture the semantic relations between images accurately, we mine samples similar to each training sample in the embedding space and use them as local semantic information, which increases the similarity between samples belonging to the same cluster. Experimental results show that our algorithm achieves state-of-the-art performance on several commonly used public datasets, and additional experiments show that this clustering performance also extends to large datasets such as ImageNet.
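The abstract names two mechanisms without giving formulas: soft assignment of samples to prototypes, and mining of similar samples in the embedding space. The sketch below illustrates one common way to realize each. It is not the authors' implementation. The cosine similarity, the softmax temperature of 0.1, the function names `soft_assignments` and `mine_neighbors`, and the toy vectors are all assumptions made for illustration.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (assumes v is not the zero vector)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return sum(a * b for a, b in zip(l2_normalize(u), l2_normalize(v)))


def soft_assignments(embedding, prototypes, temperature=0.1):
    """Softmax over cosine similarities between one embedding and each prototype.

    Hypothetical helper: the temperature value is an assumption, not taken
    from the paper.
    """
    logits = [cosine(embedding, p) / temperature for p in prototypes]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mine_neighbors(embeddings, k):
    """For each sample, return the indices of its k most similar other samples.

    Hypothetical helper: brute-force search, suitable only for small inputs.
    """
    neighbors = []
    for i, e in enumerate(embeddings):
        scored = [(cosine(e, other), j)
                  for j, other in enumerate(embeddings) if j != i]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        neighbors.append([j for _, j in scored[:k]])
    return neighbors


# Toy data: two samples near each of two prototypes (made-up values).
embeddings = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
prototypes = [[1.0, 0.0], [0.0, 1.0]]

assignments = [soft_assignments(e, prototypes) for e in embeddings]
neighbors = mine_neighbors(embeddings, k=1)
```

In this toy setting each sample's soft assignment concentrates on the prototype it lies closest to, and each mined neighbor comes from the same group. A training objective could then encourage a sample and its mined neighbors to receive similar assignments, which is one reading of how local semantic information increases within-cluster similarity.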

Source journal: Information Systems
Category: Engineering & Technology, Computer Science: Information Systems
CiteScore: 9.40
Self-citation rate: 2.70%
Articles published: 112
Review time: 53 days
About the journal: Information systems are the software and hardware systems that support data-intensive applications. The journal Information Systems publishes articles concerning the design and implementation of languages, data models, process models, algorithms, software and hardware for information systems. Subject areas include data management issues as presented in the principal international database conferences (e.g., ACM SIGMOD/PODS, VLDB, ICDE and ICDT/EDBT) as well as data-related issues from the fields of data mining/machine learning, information retrieval coordinated with structured data, internet and cloud data management, business process management, web semantics, visual and audio information systems, scientific computing, and data science. Implementation papers having to do with massively parallel data management, fault tolerance in practice, and special purpose hardware for data-intensive systems are also welcome. Manuscripts from application domains, such as urban informatics, social and natural science, and Internet of Things, are also welcome. All papers should highlight innovative solutions to data management problems such as new data models, performance enhancements, and show how those innovations contribute to the goals of the application.