Shapley visual transformers for image-to-text generation
Asma Belhadi, Youcef Djenouri, Ahmed Nabil Belbachir, Tomasz Michalak, Gautam Srivastava
Applied Soft Computing (published 2 September 2024). DOI: 10.1016/j.asoc.2024.112205
In the contemporary web landscape, image-to-text generation stands out as a crucial information service. Recently, deep learning has emerged as the leading methodology for advancing image-to-text generation systems. However, these models are typically built on domain knowledge specific to the application at hand and on a very particular data distribution, so data scientists must be well versed in the relevant subject. In this work, we lay a new foundation for image-to-text generation systems by introducing a consensus method that supports self-adaptation and the flexibility to handle different learning tasks and diverse data distributions. This paper presents I2T-SP (Image-to-Text Generation for Shapley Pruning), a consensus method for general-purpose intelligence that requires no assistance from a domain expert. The trained model is developed with a general deep-learning approach that investigates the contribution of each model during training: multiple deep-learning models are trained on each set of historical data, the Shapley value is computed to quantify the contribution of each subset of models to training, and the models are then pruned according to their contribution to the learning process. We evaluate the generality of I2T-SP on datasets of varying shapes and complexities. The results demonstrate the effectiveness of I2T-SP compared with baseline image-to-text generation solutions. This research marks a significant step towards a more adaptable and broadly applicable foundation for image-to-text generation systems.
About the journal:
Applied Soft Computing is an international journal promoting an integrated view of soft computing to solve real-life problems. Its focus is to publish the highest-quality research on the application and convergence of fuzzy logic, neural networks, evolutionary computing, rough sets, and other similar techniques to address real-world complexities.
Applied Soft Computing is a rolling publication: articles are published as soon as the editor-in-chief has accepted them, so the website is updated continuously and publication times are kept short.