Ikhlaq Ahmed;Naima Iltaf;Rabia Latif;Nor Shahida Mohd Jamail;Zafran Khan
{"title":"基于双模态反向重排(DM-RR)的图像检索框架","authors":"Ikhlaq Ahmed;Naima Iltaf;Rabia Latif;Nor Shahida Mohd Jamail;Zafran Khan","doi":"10.1109/OJIES.2024.3435956","DOIUrl":null,"url":null,"abstract":"Retrieval of a product with desired modifications from a vast inventory of online industrial platforms is frequently encountered in our daily life. This study presents a specialized framework to retrieve user's queried product with its desired changes incorporated. To facilitate interaction between the end-user and agent in such scenarios, a multimodal content-based image retrieval system is essential. The system extracts textual and visual attributes, combining them through inductive learning to a unified representation. It is based on an in-depth understanding of visual characteristics that are modified by textual semantics. Lastly, a novel reverse reranking (RR) algorithm arranges the joint representation of dual modality queries and their corresponding target images for efficient retrieval. The proposed framework is novel compared to earlier methodologies. First, it achieves successful fusion of two different modalities. Second, it introduces a RR algorithm in the inference stage for efficient retrieval. The proposed framework's enhanced performance has been assessed using the Fashion-200 K and MIT-States real-world benchmark datasets. The proposed system can be used in real-world applications subject to its practical implications, such as generalization to diverse domains, availability of domain specific data, nature of the data and queries, and availability of computational resources.","PeriodicalId":52675,"journal":{"name":"IEEE Open Journal of the Industrial Electronics Society","volume":"5 ","pages":"886-897"},"PeriodicalIF":5.2000,"publicationDate":"2024-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10614798","citationCount":"0","resultStr":"{\"title\":\"Dual Modality Reverse Reranking (DM-RR) Based Image Retrieval Framework\",\"authors\":\"Ikhlaq Ahmed;Naima Iltaf;Rabia Latif;Nor Shahida Mohd Jamail;Zafran Khan\",\"doi\":\"10.1109/OJIES.2024.3435956\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Retrieval of a product with desired modifications from a vast inventory of online industrial platforms is frequently encountered in our daily life. This study presents a specialized framework to retrieve user's queried product with its desired changes incorporated. To facilitate interaction between the end-user and agent in such scenarios, a multimodal content-based image retrieval system is essential. The system extracts textual and visual attributes, combining them through inductive learning to a unified representation. It is based on an in-depth understanding of visual characteristics that are modified by textual semantics. Lastly, a novel reverse reranking (RR) algorithm arranges the joint representation of dual modality queries and their corresponding target images for efficient retrieval. The proposed framework is novel compared to earlier methodologies. First, it achieves successful fusion of two different modalities. Second, it introduces a RR algorithm in the inference stage for efficient retrieval. The proposed framework's enhanced performance has been assessed using the Fashion-200 K and MIT-States real-world benchmark datasets. The proposed system can be used in real-world applications subject to its practical implications, such as generalization to diverse domains, availability of domain specific data, nature of the data and queries, and availability of computational resources.\",\"PeriodicalId\":52675,\"journal\":{\"name\":\"IEEE Open Journal of the Industrial Electronics Society\",\"volume\":\"5 \",\"pages\":\"886-897\"},\"PeriodicalIF\":5.2000,\"publicationDate\":\"2024-07-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10614798\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Open Journal of the Industrial Electronics Society\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10614798/\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Open Journal of the Industrial Electronics Society","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10614798/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
引用次数: 0
摘要
在我们的日常生活中,经常会遇到从大量的在线工业平台中检索带有所需修改的产品的情况。本研究提出了一个专门的框架,用于检索用户查询的产品,并将其所需的修改纳入其中。为了促进终端用户和代理在这种情况下的互动,一个基于多模态内容的图像检索系统是必不可少的。该系统提取文本和视觉属性,并通过归纳学习将它们组合成统一的表示形式。该系统基于对视觉特征的深入理解,而视觉特征是由文本语义修改而来的。最后,一种新颖的反向重新排序(RR)算法将双模态查询及其相应的目标图像进行联合表示,以实现高效检索。与早期的方法相比,所提出的框架具有新颖性。首先,它实现了两种不同模态的成功融合。其次,它在推理阶段引入了 RR 算法,以实现高效检索。我们使用 Fashion-200 K 和 MIT-States 真实世界基准数据集对拟议框架的增强性能进行了评估。提议的系统可用于现实世界的应用中,但需考虑其实际影响,如对不同领域的通用性、特定领域数据的可用性、数据和查询的性质以及计算资源的可用性。
Dual Modality Reverse Reranking (DM-RR) Based Image Retrieval Framework
Retrieval of a product with desired modifications from a vast inventory of online industrial platforms is frequently encountered in our daily life. This study presents a specialized framework to retrieve user's queried product with its desired changes incorporated. To facilitate interaction between the end-user and agent in such scenarios, a multimodal content-based image retrieval system is essential. The system extracts textual and visual attributes, combining them through inductive learning to a unified representation. It is based on an in-depth understanding of visual characteristics that are modified by textual semantics. Lastly, a novel reverse reranking (RR) algorithm arranges the joint representation of dual modality queries and their corresponding target images for efficient retrieval. The proposed framework is novel compared to earlier methodologies. First, it achieves successful fusion of two different modalities. Second, it introduces a RR algorithm in the inference stage for efficient retrieval. The proposed framework's enhanced performance has been assessed using the Fashion-200 K and MIT-States real-world benchmark datasets. The proposed system can be used in real-world applications subject to its practical implications, such as generalization to diverse domains, availability of domain specific data, nature of the data and queries, and availability of computational resources.
期刊介绍:
The IEEE Open Journal of the Industrial Electronics Society is dedicated to advancing information-intensive, knowledge-based automation, and digitalization, aiming to enhance various industrial and infrastructural ecosystems including energy, mobility, health, and home/building infrastructure. Encompassing a range of techniques leveraging data and information acquisition, analysis, manipulation, and distribution, the journal strives to achieve greater flexibility, efficiency, effectiveness, reliability, and security within digitalized and networked environments.
Our scope provides a platform for discourse and dissemination of the latest developments in numerous research and innovation areas. These include electrical components and systems, smart grids, industrial cyber-physical systems, motion control, robotics and mechatronics, sensors and actuators, factory and building communication and automation, industrial digitalization, flexible and reconfigurable manufacturing, assistant systems, industrial applications of artificial intelligence and data science, as well as the implementation of machine learning, artificial neural networks, and fuzzy logic. Additionally, we explore human factors in digitalized and networked ecosystems. Join us in exploring and shaping the future of industrial electronics and digitalization.