Research on cross-modal hashing method based on diffusion mode
Authors: Wenjiao Li, Zirui Zhong
Venue: International Conference on Electronic Technology and Information Science
Published: 2023-06-20
DOI: https://doi.org/10.1117/12.2682410
For fast and flexible retrieval across heterogeneous modalities, unsupervised methods are more flexible and easier to use than supervised ones, and among them GAN-based methods are the most popular. However, GANs suffer from a lack of diversity in generated samples, debugging difficulties, and training instability. This paper proposes a cross-modal hashing method based on a diffusion model. Specifically: (1) the diffusion model is applied to cross-modal retrieval for the first time, targeting mutual retrieval among three modalities; (2) combining the adversarial network (GAN) with the diffusion model improves sample quality and sample diversity, and mitigates the complex debugging and unstable training of GANs. Experiments on three datasets and comparisons with state-of-the-art methods demonstrate the effectiveness of the proposed method.
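The abstract does not describe how retrieval is carried out once hash codes exist, so the following is only a generic sketch of the retrieval step that cross-modal hashing methods typically share, not the authors' implementation. It assumes each modality has already been encoded into a real-valued feature vector of the same length; the function names (`binarize`, `hamming`, `retrieve`) and the toy feature values are hypothetical and chosen for illustration.

```python
from typing import List, Sequence


def binarize(features: Sequence[float]) -> List[int]:
    """Turn a real-valued feature vector into a binary hash code by sign thresholding."""
    return [1 if x >= 0 else 0 for x in features]


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions where two equal-length binary codes differ."""
    if len(a) != len(b):
        raise ValueError("hash codes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def retrieve(query_code: Sequence[int],
             database_codes: Sequence[Sequence[int]],
             top_k: int = 3) -> List[int]:
    """Return indices of the top_k database items closest to the query in Hamming distance."""
    ranked = sorted(range(len(database_codes)),
                    key=lambda i: hamming(query_code, database_codes[i]))
    return ranked[:top_k]


# Toy data (assumed, not from the paper): three image feature vectors and one text query.
image_features = [
    [0.9, -0.2, 0.4, -0.7],
    [-0.5, 0.3, -0.1, 0.8],
    [0.7, -0.6, -0.3, -0.2],
]
text_query_features = [0.6, -0.1, 0.5, -0.9]

database = [binarize(f) for f in image_features]
query = binarize(text_query_features)
result = retrieve(query, database, top_k=2)
print(result)  # indices of the two nearest images
```

Because both modalities map into the same binary code space, a text query can rank images (or any third modality) with cheap bitwise comparisons, which is the source of the fast retrieval the abstract refers to.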