{"title":"Multimodal Retrieval through Relations between Subjects and Objects in Lifelog Images","authors":"Tai-Te Chu, Chia-Chun Chang, An-Zi Yen, Hen-Hsen Huang, Hsin-Hsi Chen","doi":"10.1145/3379172.3391723","DOIUrl":null,"url":null,"abstract":"With the development of wearable devices, people nowadays record their life experiences much easier than before. Lifelog retrieval becomes an emerging task. Because of the semantic gap between visual data and textual queries, retrieving lifelog images with text queries could be challenging. This paper proposes an interactive lifelog retrieval system that is aimed at retrieving more intuitive and accurate results. Our system is divided into the offline and the online parts. In the offline part, we aim to incorporate original visual and textual concepts from images into our system utilizing pre-trained word embedding. Moreover, we encode the information of relationships between subjects and objects in images by using a pre-trained relation graph generation model. In the online part, We provide an intuitive frontend with various metadata filters, which not only provides users with a convenient interface, but also a mechanism to exploit detail memory recall to users. In this case, users would clearly know the difference between the concepts in the clusters and efficiently browse the retrieved images clusters in a short time.","PeriodicalId":340585,"journal":{"name":"Proceedings of the Third Annual Workshop on Lifelog Search Challenge","volume":"30 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-06-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"13","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Third Annual Workshop on Lifelog Search Challenge","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3379172.3391723","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 13
Abstract
With the development of wearable devices, people nowadays record their life experiences much more easily than before, and lifelog retrieval has become an emerging task. Because of the semantic gap between visual data and textual queries, retrieving lifelog images with text queries can be challenging. This paper proposes an interactive lifelog retrieval system aimed at returning more intuitive and accurate results. Our system is divided into an offline part and an online part. In the offline part, we incorporate the visual and textual concepts extracted from images into our system by utilizing pre-trained word embeddings. Moreover, we encode the relationships between subjects and objects in images by using a pre-trained relation graph generation model. In the online part, we provide an intuitive frontend with various metadata filters, which offers users not only a convenient interface but also a mechanism that supports detailed memory recall. As a result, users can clearly distinguish the concepts across clusters and efficiently browse the retrieved image clusters in a short time.
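To make the offline matching idea concrete, the sketch below shows one plausible way to score lifelog images against a text query by embedding both the query terms and each image's detected concepts (including subject-object relation labels) with a pre-trained word embedding and ranking by cosine similarity. This is only an illustration under stated assumptions: the toy embedding values, image names, and helper functions are placeholders, not the authors' implementation, and a real system would load actual pre-trained vectors (e.g., GloVe or word2vec) and use the output of a relation graph generation model.

```python
import numpy as np

# Placeholder stand-in for a pre-trained word embedding (e.g., GloVe/word2vec).
# In a real offline stage, actual pre-trained vectors would be loaded instead.
EMBEDDINGS = {
    "person":   np.array([0.90, 0.10, 0.00]),
    "man":      np.array([0.85, 0.15, 0.05]),
    "bicycle":  np.array([0.10, 0.90, 0.20]),
    "riding":   np.array([0.40, 0.70, 0.30]),
    "coffee":   np.array([0.20, 0.10, 0.90]),
    "drinking": np.array([0.30, 0.20, 0.80]),
}

def embed(tokens):
    """Average the embeddings of known tokens; unknown tokens are skipped."""
    vecs = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    return np.mean(vecs, axis=0) if vecs else np.zeros(3)

def cosine(a, b):
    """Cosine similarity, returning 0.0 for zero vectors."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

# Hypothetical image index: each lifelog image is represented by detected
# concepts plus subject-relation-object labels from a relation graph generator.
IMAGES = {
    "img_001.jpg": ["man", "riding", "bicycle"],
    "img_002.jpg": ["person", "drinking", "coffee"],
}

def retrieve(query_tokens, top_k=2):
    """Rank images by similarity between the query and each image representation."""
    q = embed(query_tokens)
    scored = [(name, cosine(q, embed(concepts))) for name, concepts in IMAGES.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_k]

if __name__ == "__main__":
    # A textual query such as "person riding bicycle" should rank img_001.jpg first.
    print(retrieve(["person", "riding", "bicycle"]))
```

Representing both queries and image concepts in the same embedding space is one common way to bridge the semantic gap the abstract describes; the ranked results could then be grouped into concept clusters for browsing in the online frontend.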