Dual Modality Reverse Reranking (DM-RR) Based Image Retrieval Framework

IEEE Open Journal of the Industrial Electronics Society (IF 5.2, Q1, ENGINEERING, ELECTRICAL & ELECTRONIC). Pub Date: 2024-07-30. DOI: 10.1109/OJIES.2024.3435956
Ikhlaq Ahmed;Naima Iltaf;Rabia Latif;Nor Shahida Mohd Jamail;Zafran Khan
IEEE Open Journal of the Industrial Electronics Society, vol. 5, pp. 886-897, 2024. https://ieeexplore.ieee.org/document/10614798/
Citations: 0

Abstract

Retrieving a product with desired modifications from the vast inventories of online industrial platforms is a task frequently encountered in daily life. This study presents a specialized framework that retrieves the user's queried product with the desired changes incorporated. To facilitate interaction between the end user and the agent in such scenarios, a multimodal content-based image retrieval system is essential. The system extracts textual and visual attributes and combines them through inductive learning into a unified representation. It is based on an in-depth understanding of the visual characteristics that are modified by textual semantics. Lastly, a novel reverse reranking (RR) algorithm arranges the joint representations of dual-modality queries and their corresponding target images for efficient retrieval. The proposed framework is novel compared to earlier methodologies. First, it achieves successful fusion of two different modalities. Second, it introduces an RR algorithm at the inference stage for efficient retrieval. The proposed framework's enhanced performance has been assessed on the Fashion-200K and MIT-States real-world benchmark datasets. The proposed system can be used in real-world applications subject to practical considerations such as generalization to diverse domains, availability of domain-specific data, the nature of the data and queries, and availability of computational resources.
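The pipeline outlined in the abstract (fuse image and text features into one query representation, rank gallery images, then rerank at inference time) can be illustrated with a minimal sketch. The abstract does not specify the fusion model or the RR algorithm, so everything below is an assumption made for illustration: `compose_query` uses a plain weighted sum as a stand-in for the learned fusion, and `reverse_rerank` uses a generic reverse-rank heuristic (checking how the query ranks from each candidate's point of view, relative to other queries). All function names and the `gate` and `top_k` parameters are hypothetical and are not taken from the paper.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returns a copy if the norm is zero)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def compose_query(image_feat, text_feat, gate=0.5):
    """Assumed fusion: weighted sum of image and text features.

    A stand-in for the learned fusion mentioned in the abstract.
    """
    fused = [(1 - gate) * i + gate * t for i, t in zip(image_feat, text_feat)]
    return l2_normalize(fused)


def forward_rank(query, gallery):
    """Gallery indices sorted by descending similarity to the query."""
    sims = [cosine(query, g) for g in gallery]
    return sorted(range(len(gallery)), key=lambda i: -sims[i])


def reverse_rerank(query, gallery, other_queries, top_k=3):
    """Assumed reverse-rank heuristic, not the paper's RR algorithm.

    For each of the top_k forward candidates, count how many other
    queries are more similar to that candidate than our query is.
    Candidates with a lower count move up; ties keep forward order.
    """
    initial = forward_rank(query, gallery)
    head, tail = initial[:top_k], initial[top_k:]

    def reverse_position(idx):
        own = cosine(gallery[idx], query)
        return sum(1 for q in other_queries if cosine(gallery[idx], q) > own)

    return sorted(head, key=reverse_position) + tail


# Toy example with 2-D features (all values are made up).
gallery = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = compose_query(image_feat=[1.0, 0.0], text_feat=[0.0, 1.0])
initial_order = forward_rank(query, gallery)
reranked_order = reverse_rerank(query, gallery, other_queries=[[1.0, 0.0]])
print(initial_order, reranked_order)
```

In this toy setup, the gallery item that another query matches more strongly is pushed down within the top candidates, which is the intuition behind reverse-rank style reranking in general; the actual DM-RR procedure may differ.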
Source journal
IEEE Open Journal of the Industrial Electronics Society (category: ENGINEERING, ELECTRICAL & ELECTRONIC)
CiteScore: 10.80
Self-citation rate: 2.40%
Articles published: 33
Review time: 12 weeks
About the journal: The IEEE Open Journal of the Industrial Electronics Society is dedicated to advancing information-intensive, knowledge-based automation, and digitalization, aiming to enhance various industrial and infrastructural ecosystems including energy, mobility, health, and home/building infrastructure. Encompassing a range of techniques leveraging data and information acquisition, analysis, manipulation, and distribution, the journal strives to achieve greater flexibility, efficiency, effectiveness, reliability, and security within digitalized and networked environments. Our scope provides a platform for discourse and dissemination of the latest developments in numerous research and innovation areas. These include electrical components and systems, smart grids, industrial cyber-physical systems, motion control, robotics and mechatronics, sensors and actuators, factory and building communication and automation, industrial digitalization, flexible and reconfigurable manufacturing, assistant systems, industrial applications of artificial intelligence and data science, as well as the implementation of machine learning, artificial neural networks, and fuzzy logic. Additionally, we explore human factors in digitalized and networked ecosystems. Join us in exploring and shaping the future of industrial electronics and digitalization.