Myung Seok Shim, Christopher Thiele, Jeremy Vila, Nishank Saxena, Detlef Hohl
Content-based image retrieval for industrial material images with deep learning and encoded physical properties
Data-Centric Engineering, 2023. Journal article. DOI: 10.1017/dce.2023.16 (https://doi.org/10.1017/dce.2023.16)
Citations: 0
Abstract
Industrial materials images are an important application domain for content-based image retrieval. Users need to quickly search databases for images that exhibit similar appearance, properties, and/or features to reduce analysis turnaround time and cost. The images in this study are 2D images of millimeter-scale rock samples acquired at micrometer resolution with light microscopy or extracted from 3D micro-CT scans. Labeled rock images are expensive and time-consuming to acquire and thus are typically available only in the tens of thousands. Training a high-capacity deep learning (DL) model from scratch is therefore not practicable due to data paucity. To overcome this “few-shot learning” challenge, we propose leveraging common pretrained DL models in conjunction with transfer learning. The “similarity” of industrial materials images is subjective and is assessed by human experts based on both visual appearance and physical qualities. We have emulated this human-driven assessment process via a physics-informed neural network that includes metadata and physical measurements in the loss function. We present a novel DL architecture that combines Siamese neural networks with a loss function that integrates classification and regression terms. The networks are trained with both image and metadata similarity (classification), and with metadata prediction (regression). For efficient inference, we use a highly compressed image feature representation, computed offline once, to search the database for images similar to a query image. Numerical experiments demonstrate superior retrieval performance of our new architecture compared with other DL and custom-feature-based approaches.
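As a rough illustration of two ideas named in the abstract, a pairwise loss that sums a classification term and a regression term, and retrieval over feature vectors computed offline, the following Python sketch uses only the standard library. It is not the authors' implementation. The function names, the exp(-distance) similarity mapping, the weight `alpha`, and the toy data are all assumptions made for illustration; a real system would obtain the embeddings and metadata predictions from a pretrained network in a DL framework.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def similarity_probability(u, v):
    """Assumed mapping from distance to (0, 1]; identical vectors give 1.0."""
    return math.exp(-euclidean(u, v))


def binary_cross_entropy(p, label, eps=1e-12):
    """Classification term: label is 1 for a similar pair, 0 otherwise."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def mean_squared_error(pred, target):
    """Regression term for predicted metadata (e.g. physical measurements)."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def combined_pair_loss(emb_a, emb_b, is_similar,
                       pred_a, pred_b, meta_a, meta_b, alpha=0.5):
    """Pair loss = classification term + alpha * regression term.

    emb_*  : embeddings from the two Siamese branches (shared weights)
    pred_* : metadata predicted by each branch
    meta_* : measured metadata for each image
    alpha  : hypothetical weight balancing the two terms
    """
    cls = binary_cross_entropy(similarity_probability(emb_a, emb_b), is_similar)
    reg = 0.5 * (mean_squared_error(pred_a, meta_a)
                 + mean_squared_error(pred_b, meta_b))
    return cls + alpha * reg


def retrieve(query_emb, database, k=3):
    """Return the ids of the k database entries closest to the query.

    database is a list of (image_id, embedding) pairs computed offline once.
    """
    ranked = sorted(database, key=lambda item: euclidean(query_emb, item[1]))
    return [image_id for image_id, _ in ranked[:k]]


# Toy data: three stored images with 2D embeddings (hypothetical values).
database = [("a", [0.0, 0.0]), ("b", [1.0, 1.0]), ("c", [5.0, 5.0])]
print(retrieve([0.9, 0.9], database, k=2))
```

The linear scan in `retrieve` is enough for a small database; with many images, an approximate nearest-neighbor index over the same precomputed vectors would be the usual substitute.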