From superpixels to foundational models: An overview of unsupervised and generalizable image segmentation

Computers & Graphics-Uk, vol. 123, Article 104014 | Pub Date: 2024-07-29 | DOI: 10.1016/j.cag.2024.104014 | IF 2.8 | Zone 4 (Computer Science) | JCR Q2 (Computer Science, Software Engineering)

Cristiano N. Rodrigues, Ian M. Nunes, Matheus B. Pereira, Hugo Oliveira, Jefersson A. dos Santos



Abstract

Image segmentation is one of the most classical computer vision tasks. Unlike classification or detection, which yield sparse predictions for whole images or patches, segmentation assigns a class to each individual pixel. However, creating annotations for pixelwise tasks is very costly, often requiring hours to label a single image that contains multiple object classes. In this context, unsupervised learning can be leveraged to expedite the annotation procedure and/or to guide the segmentation algorithms altogether without manual annotations. Classical unsupervised segmentation methods leveraged techniques from areas such as graph theory, image processing, clustering, or supervised classifiers to achieve “shallow” pixelwise classification. These techniques usually aim to produce superpixel over-segmentations by grouping similar pixels that should belong to the same object. Modern deep unsupervised approaches for image segmentation aim to group pixels in a data-driven way, using the capacity of deep architectures to process unstructured data such as images. Later, self-supervised learning bypassed the need for labels via pretext tasks, compelling deep architectures to learn more generic features capable of enhancing downstream tasks, including segmentation. The generalized representations produced by unsupervised models have propelled recent progress in self-supervised, few-shot, and zero-shot learning, and even in general-purpose foundational models for computer vision, yielding state-of-the-art results across diverse tasks and datasets. This paper provides an overview of unsupervised and generalizable approaches for image segmentation, introduces key concepts and terminology, and discusses the main aspects of state-of-the-art methods. Additionally, we highlight prominent applications in various domains such as remote sensing, medical imaging, and geology. Finally, we discuss trends and future directions for state-of-the-art unsupervised image segmentation.
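To make the idea of superpixel over-segmentation by grouping similar pixels concrete, the following is a minimal sketch and not a method taken from the paper. It assumes a plain k-means clustering over per-pixel color and position features, loosely in the spirit of grid-initialized superpixel methods such as SLIC; the function name `grid_kmeans_superpixels` and its parameters `grid`, `spatial_weight`, and `n_iters` are hypothetical choices made for illustration.

```python
import numpy as np


def grid_kmeans_superpixels(image, grid=(4, 4), spatial_weight=0.5, n_iters=10):
    """Over-segment an (H, W, C) float image by k-means on (color, position).

    Hypothetical helper for illustration only. Every pixel is compared
    against every center, so this is practical only for small images.
    """
    h, w, c = image.shape
    rows, cols = grid

    # Per-pixel features: color channels plus weighted, normalized (y, x).
    ys, xs = np.mgrid[0:h, 0:w]
    pos = np.stack([ys / max(h - 1, 1), xs / max(w - 1, 1)], axis=-1)
    feats = np.concatenate([image.astype(np.float64), spatial_weight * pos], axis=-1)
    feats = feats.reshape(-1, c + 2)

    # Initialize one center at the middle pixel of each grid cell.
    cy = ((np.arange(rows) + 0.5) * h / rows).astype(int)
    cx = ((np.arange(cols) + 0.5) * w / cols).astype(int)
    init_idx = (cy[:, None] * w + cx[None, :]).ravel()
    centers = feats[init_idx].copy()

    def assign(current_centers):
        # Squared Euclidean distance from every pixel to every center: (N, K).
        d2 = ((feats[:, None, :] - current_centers[None, :, :]) ** 2).sum(axis=-1)
        return d2.argmin(axis=1)

    for _ in range(n_iters):
        labels = assign(centers)
        for k in range(len(centers)):
            members = feats[labels == k]
            if len(members):  # keep the old center if a cluster empties out
                centers[k] = members.mean(axis=0)

    return assign(centers).reshape(h, w)


# Toy two-tone image: dark left half, bright right half.
img = np.zeros((16, 16, 3))
img[:, 8:] = 1.0
labels = grid_kmeans_superpixels(img, grid=(2, 2), spatial_weight=0.1)
```

On this toy image the color difference dominates the weighted spatial term, so no segment should straddle the dark/bright boundary. A larger `spatial_weight` would favor compact, grid-like segments over color homogeneity.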


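Source journal

Computers & Graphics-Uk (Engineering & Technology; Computer Science: Software Engineering)

CiteScore: 5.30
Self-citation rate: 12.00%
Articles published: 173
Review time: 38 days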

About the journal: Computers & Graphics is dedicated to disseminating information on research and applications of computer graphics (CG) techniques. The journal encourages articles on: 1. Research and applications of interactive computer graphics, in particular novel interaction techniques and applications of CG to problem domains. 2. State-of-the-art papers on late-breaking, cutting-edge research on CG. 3. Information on innovative uses of graphics principles and technologies. 4. Tutorial papers on both teaching CG principles and innovative uses of CG in education.