Multimodal Dialog for Browsing Large Visual Catalogs using Exploration-Exploitation Paradigm in a Joint Embedding Space
Indrani Bhattacharya, Arkabandhu Chowdhury, V. Raykar
DOI: 10.1145/3323873.3325036
We present a multimodal dialog (MMD) system to assist online customers in visually browsing through large catalogs. Visual browsing allows customers to explore products beyond exact search results. We focus on a slightly asymmetric version of a complete MMD system, in that our agent can understand both text and image queries but responds only in images. We formulate our problem of "showing the k best images to a user", based on the dialog context so far, as sampling from a Gaussian Mixture Model (GMM) in a high-dimensional joint multimodal embedding space. The joint embedding space is learned by Common Representation Learning and embeds both the text and the image queries. Our system remembers the context of the dialog and uses an exploration-exploitation paradigm to assist in visual browsing. We train and evaluate the system on an MMD dataset that we synthesize from large catalog data. Our experiments and a preliminary human evaluation show that the system learns to display relevant products, with an average cosine similarity of 0.85 to the ground-truth results, and is capable of engaging human users.
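To make the sample-then-retrieve step concrete, the following is a minimal sketch, not the authors' implementation: fit a GMM to the dialog's query embeddings in the joint space, draw k points (exploration), and return the nearest catalog items by cosine similarity (exploitation). The embedding arrays, the component count, and the function name are assumptions introduced for illustration.

```python
# Minimal sketch, NOT the paper's implementation: sample k candidate points from
# a GMM fitted on the dialog's query embeddings, then show the nearest catalog items.
import numpy as np
from sklearn.mixture import GaussianMixture

def retrieve_k_images(query_embeddings, catalog_embeddings, k=5, n_components=3):
    # query_embeddings: (n_queries, d) joint embeddings of the dialog so far (assumed given)
    # catalog_embeddings: (n_items, d) precomputed joint embeddings of all products
    gmm = GaussianMixture(n_components=min(n_components, len(query_embeddings)))
    gmm.fit(query_embeddings)
    samples, _ = gmm.sample(k)  # exploration: draw k points from the fitted mixture
    # exploitation: map each sampled point to its nearest catalog item (cosine similarity)
    samples = samples / np.linalg.norm(samples, axis=1, keepdims=True)
    catalog = catalog_embeddings / np.linalg.norm(catalog_embeddings, axis=1, keepdims=True)
    return np.argmax(samples @ catalog.T, axis=1)  # indices of the k images to display
```

A real system would deduplicate the k results and fold user feedback from each dialog turn back into the mixture; the sketch shows only the core sampling-and-retrieval step described in the abstract.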
{"title":"Multimodal Dialog for Browsing Large Visual Catalogs using Exploration-Exploitation Paradigm in a Joint Embedding Space","authors":"Indrani Bhattacharya, Arkabandhu Chowdhury, V. Raykar","doi":"10.1145/3323873.3325036","DOIUrl":"https://doi.org/10.1145/3323873.3325036","url":null,"abstract":"We present a multimodal dialog (MMD) system to assist online customers in visually browsing through large catalogs. Visual browsing allows customers to explore products beyond exact search results. We focus on a slightly asymmetric version of a complete MMD system, in that our agent can understand both text and image queries, but responds only in images. We formulate our problem of \"showing the k best images to a user'', based on the dialog context so far, as sampling from a Gaussian Mixture Model (GMM) in a high dimensional joint multimodal embedding space. The joint embedding space is learned by Common Representation Learning and embeds both the text and the image queries. Our system remembers the context of the dialog, and uses an exploration-exploitation paradigm to assist in visual browsing. We train and evaluate the system on an MMD dataset that we synthesize from large catalog data. Our experiments and preliminary human evaluation show that the system is capable of learning and displaying relevant products with an average cosine similarity of 0.85 to the ground truth results, and is capable of engaging human users.","PeriodicalId":149041,"journal":{"name":"Proceedings of the 2019 on International Conference on Multimedia Retrieval","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-01-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127918851","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multi-shot Person Re-identification through Set Distance with Visual Distributional Representation
Ting-yao Hu, Alexander Hauptmann
DOI: 10.1145/3323873.3325030
Person re-identification aims to identify a specific person at distinct times and locations. It is challenging because of occlusion, illumination changes, and viewpoint changes across camera views. Recently, the multi-shot person re-id task has received more attention since it is closer to real-world applications. A key component of a good multi-shot person re-id algorithm is the temporal aggregation of person appearance features. While most current approaches apply pooling strategies to obtain a fixed-size vector representation, this may lose the matching evidence between examples. In this work, we propose the idea of visual distributional representation, which interprets an image set as samples drawn from an unknown distribution in an appearance feature space. Based on the supervision signal from a downstream task of interest, the method reshapes the appearance feature space and thereby learns the unknown distribution of each image set. In the context of multi-shot person re-id, we apply this novel concept along with the Wasserstein distance and jointly learn a distributional set distance function between two image sets. In this way, the proper alignment between two image sets can be discovered naturally in a non-parametric manner. Our experimental results on three public datasets show the advantages of the proposed method over other state-of-the-art approaches.
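As a concrete illustration of the set-distance idea, the sketch below is an assumption-laden toy, not the paper's learned metric: it computes an optimal-transport cost between two sets of appearance features. For equal-size sets with uniform weights, the Wasserstein distance reduces to an optimal one-to-one matching, solved exactly by the Hungarian algorithm; the feature extractor producing feats_a and feats_b is assumed to exist.

```python
# Toy sketch, not the paper's learned metric: a Wasserstein-style set distance
# between two image sets, via exact optimal matching of appearance features.
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

def set_distance(feats_a, feats_b):
    # feats_a, feats_b: (n, d) appearance features of two equal-size image sets
    cost = cdist(feats_a, feats_b)            # pairwise Euclidean ground costs
    rows, cols = linear_sum_assignment(cost)  # optimal alignment (Hungarian)
    return cost[rows, cols].mean()            # average cost over matched pairs
```

In the paper, the appearance space itself is trained end-to-end so that this distance separates identities; the sketch shows only the comparison step, and unequal-size or weighted sets would need a general transport solver rather than one-to-one matching.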
{"title":"Multi-shot Person Re-identification through Set Distance with Visual Distributional Representation","authors":"Ting-yao Hu, Alexander Hauptmann","doi":"10.1145/3323873.3325030","DOIUrl":"https://doi.org/10.1145/3323873.3325030","url":null,"abstract":"Person re-identification aims to identify a specific person at distinct times and locations. It is challenging because of occlusion, illumination, and viewpoint change in camera views. Recently, multi-shot person re-id task receives more attention since it is closer to real-world application. A key point of a good algorithm for multi-shot person re-id is the temporal aggregation of the person appearance features. While most of the current approaches apply pooling strategies and obtain a fixed-size vector representation, these may lose the matching evidence between examples. In this work, we propose the idea of visual distributional representation, which interprets an image set as samples drawn from an unknown distribution in appearance feature space. Based on the supervision signals from a downstream task of interest, the method reshapes the appearance feature space and further learns the unknown distribution of each image set. In the context of multi-shot person re-id, we apply this novel concept along with Wasserstein distance and jointly learn a distributional set distance function between two image sets. In this way, the proper alignment between two image sets can be discovered naturally in a non-parametric manner. Our experiment results on three public datasets show the advantages of our proposed method compared to other state-of-the-art approaches.","PeriodicalId":149041,"journal":{"name":"Proceedings of the 2019 on International Conference on Multimedia Retrieval","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127661563","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}