Design of deep convolution feature extraction for multimedia information retrieval

International Journal of Intelligent Unmanned Systems (IF 1.3, Q4, Robotics); Pub Date: 2022-01-26; DOI: 10.1108/ijius-11-2021-0126

K. Venkataravana Nayak, J. Arunalatha, Dr. Vasanthakumar G. U., K. Venugopal


Citations: 2

Abstract

Purpose: The analysis of multimedia content is applied in various real-time computer vision applications. Digital images constitute a significant part of multimedia content. Human interpretation of digital images is subjective and complex, so searching archives for relevant images is difficult. Electronic image analysis strategies have therefore become effective tools for image interpretation.

Design/methodology/approach: The traditional approach is text-based, i.e. searching images using textual annotations. Annotating images manually is time-consuming, and the dependency on textual annotations is difficult to reduce when the archive contains a large number of samples. Content-based image retrieval (CBIR) is therefore adopted, in which the high-level visual content of images is represented as feature vectors of numerical values. It is a commonly used approach to understanding the content of query images when retrieving relevant images. Still, its performance is less than optimal because of the semantic gap between the image content representation and human visual understanding, which arises from photometric and geometric variations and occlusions in the image content in search environments.

Findings: The authors propose an image retrieval framework that generates a semantic response through feature extraction with a convolutional network and optimization of the extracted features using the adaptive moment estimation (Adam) algorithm, with the aim of enhancing retrieval performance.

Originality/value: The proposed framework was tested on the Corel-1k and ImageNet datasets and achieved accuracies of 98% and 96%, respectively, compared with state-of-the-art approaches.
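The retrieval step that CBIR relies on, comparing a query image's feature vector against the feature vectors of archived images, can be sketched as follows. This is only an illustration under stated assumptions, not the authors' framework: the abstract does not name the similarity measure, so cosine similarity is assumed here, and the image identifiers and 4-dimensional vectors are hypothetical stand-ins for features that a convolutional network would produce.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat it as dissimilar
    return dot / (norm_a * norm_b)


def retrieve(query_vec, archive, top_k=3):
    """Return up to top_k (image_id, score) pairs, most similar first."""
    scored = [
        (image_id, cosine_similarity(query_vec, vec))
        for image_id, vec in archive.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical feature vectors (assumption: real ones would come from a
# convolutional network and have far more dimensions).
archive = {
    "beach_01": [0.9, 0.1, 0.0, 0.2],
    "beach_02": [0.8, 0.2, 0.1, 0.1],
    "bus_01": [0.1, 0.9, 0.3, 0.0],
    "horse_01": [0.0, 0.2, 0.9, 0.4],
}
query = [0.85, 0.15, 0.05, 0.15]

results = retrieve(query, archive, top_k=2)
for image_id, score in results:
    print(f"{image_id}: {score:.4f}")
```

With these toy vectors the two beach images rank above the bus and horse images, because their vectors point in nearly the same direction as the query. The semantic gap mentioned in the abstract corresponds to cases where such numerical closeness does not match what a human would judge as relevant.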




Source journal

International Journal of Intelligent Unmanned Systems (Robotics)

CiteScore

3.50

Self-citation rate

0.00%

Number of published articles