The Design and Research of Cross-media Retrieval
Chen Li, Jing Zhang, Chunhua Wang, Yaqiong Fan
2022 6th International Conference on Communication and Information Systems (ICCIS), published 2022-10-14
DOI: 10.1109/ICCIS56375.2022.9998137 (https://doi.org/10.1109/ICCIS56375.2022.9998137)
Citation count: 0
Abstract
The rapid development of Internet technology has ushered in the big data era, and the widespread use of smart devices and social media has driven explosive growth in multimedia data. As the needs of information exchange, collection, and storage become more complex and diverse, information has evolved from traditional text to varied forms such as images, video, and audio, bringing varying degrees of convenience to work, daily life, and other settings. However, the sheer volume of multimedia data also makes information storage and retrieval more cumbersome. Storing this data effectively and retrieving it efficiently, so that its value can be better exploited, is one of the challenges that academia and the information industry currently face. In this paper, we study cross-media retrieval between text and images using the Contrastive Language-Image Pre-training (CLIP) model, drawing on natural language processing methods. The cross-media pre-training idea proposed in this paper applies not only to text-image processing but also, in theory, to mutual retrieval across other modalities such as video and audio.
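As an illustration of the retrieval step that a contrastively pre-trained text-image model makes possible, the following is a minimal sketch and not the authors' implementation. It assumes that text and images have already been mapped into a shared embedding space, and it ranks images for a text query by cosine similarity. The three-dimensional embeddings, the file names, and the helper names `normalize`, `cosine_similarity`, and `retrieve` are hypothetical and chosen only for the example; a real system would obtain the vectors from the model's text and image encoders.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so that dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    a_n, b_n = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def retrieve(query_embedding, gallery, top_k=1):
    """Rank gallery items (name -> embedding) by similarity to the query."""
    scored = [
        (name, cosine_similarity(query_embedding, emb))
        for name, emb in gallery.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical, hand-made embeddings standing in for encoder outputs.
image_gallery = {
    "dog.jpg": [0.9, 0.1, 0.0],
    "cat.jpg": [0.1, 0.9, 0.0],
    "car.jpg": [0.0, 0.1, 0.9],
}
# Stands in for the text embedding of a query such as "a photo of a dog".
text_query = [0.8, 0.2, 0.1]

best = retrieve(text_query, image_gallery, top_k=2)
for name, score in best:
    print(f"{name}: {score:.3f}")
```

Because both modalities share one space, the same `retrieve` function would serve the reverse direction (an image embedding as the query against a gallery of text embeddings), which is the sense in which the retrieval is mutual.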