Superpixel segmentations for thin sections: Evaluation of methods to enable the generation of machine learning training data sets

Comput. Geosci. Pub Date: 2021-10-16. DOI: 10.31223/x55s65

Jiaxin Yu, F. Wellmann, S. Virgo, Marven von Domarus, M. Jiang, J. Schmatz, B. Leibe


Citations: 5

Abstract

Training data is the backbone of developing Machine Learning (ML) models in general and Deep Learning (DL) algorithms in particular. The paucity of well-labeled training image data has significantly impeded the application of ML-based approaches to mineral identification in thin section images, especially the development of novel DL methods such as Convolutional Neural Networks (CNNs). However, image annotation, especially pixel-wise annotation, is always a costly process. Manually creating dense semantic labels for rock thin section images has long been considered a daunting challenge because of the great variety and complexity of minerals in thin sections. To speed up annotation, we propose a human-computer collaborative pipeline in which superpixel segmentation is used as a boundary extractor, so that instance boundaries do not have to be delineated by hand. The pipeline consists of two steps: superpixel segmentation using MultiSLIC, and superpixel labeling through a purpose-built tool. We use Virtual Petroscopy (ViP), a state-of-the-art method, for automatic image acquisition. A Bentheimer sandstone sample is used to test the performance of the pipeline. Three standard error metrics are used to evaluate the performance of MultiSLIC. The results indicate that MultiSLIC extracts compact superpixels with satisfactory boundary adherence when given multiple input images. According to our test results, the labeling tool allows large and complex thin section images to be annotated with pixel-accurate labels more efficiently than conventional, purely manual work, while producing high-quality data.
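The second step of the pipeline, turning labels assigned to whole superpixels into a dense pixel-wise label map, can be illustrated with a minimal sketch. This is not the authors' labeling tool: the function name `superpixels_to_dense_labels`, the toy 4x4 superpixel map, the class ids, and the `unlabeled` marker are all assumptions made for illustration. It only shows why annotating superpixels is cheaper than tracing boundaries: one click per superpixel labels every pixel inside it.

```python
import numpy as np


def superpixels_to_dense_labels(superpixels, annotations, unlabeled=-1):
    """Expand per-superpixel class ids into a pixel-wise label map.

    superpixels: 2-D integer array, one non-negative superpixel id per pixel
    annotations: dict mapping superpixel id -> class id (hypothetical format)
    unlabeled:   value given to pixels whose superpixel has no annotation
    """
    superpixels = np.asarray(superpixels)
    # Lookup table indexed by superpixel id; unannotated ids keep `unlabeled`.
    lut = np.full(superpixels.max() + 1, unlabeled, dtype=np.int64)
    for sp_id, class_id in annotations.items():
        lut[sp_id] = class_id
    # Integer-array indexing applies the lookup to every pixel at once.
    return lut[superpixels]


# Toy superpixel map with four superpixels (ids 0 to 3).
superpixels = np.array([
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 2, 3],
    [2, 2, 3, 3],
])

# Hypothetical class ids: 0 = pore, 1 = grain. Superpixel 3 is left unannotated.
annotations = {0: 1, 1: 1, 2: 0}

dense = superpixels_to_dense_labels(superpixels, annotations)
print(dense)
```

In this sketch, three annotations label 13 of the 16 pixels, and the remaining pixels are explicitly marked as unlabeled rather than silently assigned a class.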
