Ekrem Yildiz, Ege Burak Safdil, Furkan Arslan, H. F. Alsan, Taner Arsan
Title: Multitype Learning via Multimodal Data Embedding
Published in: 2021 5th International Symposium on Multidisciplinary Studies and Innovative Technologies (ISMSIT)
Publication date: 2021-10-21
DOI: 10.1109/ISMSIT52890.2021.9604738 (https://doi.org/10.1109/ISMSIT52890.2021.9604738)
This paper presents a multimodal retrieval system for image and text data, built on a multi-type learning approach that enables text-to-image, image-to-text, text-to-text, and image-to-image retrieval. As a practical solution, a mobile application is developed in which users can upload their images to retrieve a descriptive sentence for them. The application is built with React Native and includes a user system with essential features such as e-mail authentication and password reset. A database is designed with PostgreSQL to store user information and to search for users. A multimodal embedding study is carried out, and a model that supports the multitype retrievals is built. The image-to-text retrieval model, which is the core idea of our application, is deployed in the mobile application.
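The abstract does not describe how the retrievals are computed. A common way to support all four retrieval types is to map images and text into one shared embedding space and rank candidates by similarity to the query. The following is a minimal sketch under that assumption, not the authors' implementation: the vectors are made-up stand-ins for encoder outputs, and the names `cosine`, `retrieve`, and `INDEX` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def retrieve(query_vec, index, target_modality=None, top_k=1):
    """Rank index entries by similarity to the query embedding.

    index: list of (item_id, modality, vector) tuples.
    target_modality: "text", "image", or None to search both.
    Returns up to top_k (item_id, score) pairs, best first.
    """
    scored = [
        (cosine(query_vec, vec), item_id)
        for item_id, modality, vec in index
        if target_modality is None or modality == target_modality
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(item_id, score) for score, item_id in scored[:top_k]]


# Toy shared-space index; the vectors are invented for illustration only.
INDEX = [
    ("caption: a dog on the beach", "text", [0.9, 0.1, 0.0]),
    ("caption: a plate of pasta", "text", [0.0, 0.2, 0.9]),
    ("img_dog.jpg", "image", [0.8, 0.2, 0.1]),
    ("img_pasta.jpg", "image", [0.1, 0.1, 0.95]),
]

# Image-to-text: pretend this is the embedding of an uploaded dog photo.
uploaded_image_vec = [0.85, 0.15, 0.05]
best_caption = retrieve(uploaded_image_vec, INDEX, target_modality="text")

# Text-to-image: pretend this is the embedding of a pasta-related query.
text_query_vec = [0.0, 0.2, 0.9]
best_image = retrieve(text_query_vec, INDEX, target_modality="image")
```

Because every item lives in the same space, switching between the four retrieval types only changes which modality the query comes from and which modality is kept by the `target_modality` filter.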