A large-scale image retrieval system for everyday scenes
Arun Zachariah, Mohamed Gharibi, P. Rao
Proceedings of the 2nd ACM International Conference on Multimedia in Asia, 2021-03-07
DOI: 10.1145/3444685.3446253 (https://doi.org/10.1145/3444685.3446253)
We present a system for large-scale image retrieval on everyday scenes with common objects. Our system leverages advances in deep learning and natural language processing (NLP) for improved understanding of images by capturing the relationships between the objects within an image. As a result, a user can retrieve highly relevant images and obtain suggestions for similar image queries to further explore the repository. Each image in the repository is processed (using deep learning) to obtain the most probable captions and objects in it. The captions are parsed into tree structures using NLP techniques, and stored and indexed in a database system. When a query image is posed, an optimized tree-pattern query is executed by the database system to obtain candidate matches, which are then ranked using tree-edit distance of the tree structures to output the top-k matches. Word embeddings and Bloom filters are used to obtain similar image queries. By clicking the suggested similar image queries, a user can intuitively explore the repository.
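The ranking step described above (candidates ordered by tree-edit distance, top-k returned) can be illustrated with a small sketch. This is not the authors' implementation: the tree encoding `(label, children)`, the unit edit costs, the example caption trees, and the names `forest_distance`, `tree_edit_distance`, and `top_k` are all assumptions made for illustration. It uses the textbook rightmost-root recursion for ordered trees with memoization, which is adequate only for small trees.

```python
from functools import lru_cache
import heapq

# Assumed encoding: a tree is (label, children), children being a tuple of trees.
# A forest is a tuple of trees. Everything is hashable so results can be cached.


def size(forest):
    """Number of nodes in a forest."""
    return sum(1 + size(children) for _, children in forest)


@lru_cache(maxsize=None)
def forest_distance(f, g):
    """Unit-cost edit distance between two ordered forests."""
    if not f:
        return size(g)  # insert every remaining node of g
    if not g:
        return size(f)  # delete every remaining node of f
    (v_label, v_children), (w_label, w_children) = f[-1], g[-1]
    # Delete the rightmost root of f; its children take its place.
    delete = forest_distance(f[:-1] + v_children, g) + 1
    # Insert the rightmost root of g; its children take its place.
    insert = forest_distance(f, g[:-1] + w_children) + 1
    # Match the two rightmost roots, relabeling if their labels differ.
    match = (forest_distance(v_children, w_children)
             + forest_distance(f[:-1], g[:-1])
             + (v_label != w_label))
    return min(delete, insert, match)


def tree_edit_distance(a, b):
    return forest_distance((a,), (b,))


def top_k(query_tree, candidates, k):
    """Return the k (image_id, tree) pairs closest to query_tree."""
    return heapq.nsmallest(
        k, candidates, key=lambda item: tree_edit_distance(query_tree, item[1])
    )


# Hypothetical caption trees, invented for this example.
query = ("sits", (("dog", ()), ("couch", ())))
candidates = [
    ("img1", ("sits", (("cat", ()), ("couch", ())))),
    ("img2", ("sits", (("dog", ()), ("couch", ())))),
    ("img3", ("rides", (("man", ()), ("horse", ())))),
]
ranked = top_k(query, candidates, 2)
print([image_id for image_id, _ in ranked])
```

In a system of the kind described, the candidate list would come from the tree-pattern query executed by the database, so the comparatively expensive distance computation would run only on a short list rather than on the whole repository.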