Naihua Ji , Yongqiang Sun , Fanyun Meng , Liping Pang , Yuzhu Tian
Pattern Recognition, Volume 162, Article 111423. Publication date: 2025-06-01 (Epub 2025-02-01). DOI: 10.1016/j.patcog.2025.111423. Journal impact factor: 9.1 (JCR Q1, Computer Science, Artificial Intelligence). Not open access. Article page: https://www.sciencedirect.com/science/article/pii/S0031320325000834
Variable multi-scale attention fusion network and adaptive correcting gradient optimization for multi-task learning
Network architecture and optimization are two indispensable parts of multi-task learning, which together improve its performance. Previous work has rarely addressed both aspects simultaneously. In this paper, we analyze multi-task learning from the perspectives of both network architecture and optimization. On the architecture side, we propose a variable multi-scale attention fusion network, which overcomes the feature loss that occurs when small-scale feature maps are upsampled and resolves the inadequate learning of conventional multi-scale models caused by large disparities in spatial size. On the optimization side, we put forward an adaptive correcting gradient scheme to address conflicts and dominance among tasks during training, which effectively alleviates the imbalance of multi-task training. Ablation and comparative experiments demonstrate that considering the network architecture and optimization together substantially improves the performance of multi-task learning. Our code is available at https://github.com/SyqxhSt/Net-Opt-MTL
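To make the terms "conflict" and "dominance" concrete, the following is a minimal sketch and not the paper's adaptive correcting gradient scheme, whose details the abstract does not give. It assumes a generic stand-in: two task gradients conflict when their dot product is negative, in which case the conflicting component is projected out (a PCGrad-style step), and dominance is reduced by rescaling each task gradient to unit norm before combining. All function names here are hypothetical.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(g: Vector) -> Vector:
    """Rescale g to unit norm so that no task dominates by magnitude alone."""
    norm = dot(g, g) ** 0.5
    return [x / norm for x in g] if norm > 0 else list(g)


def project_out_conflict(g_i: Vector, g_j: Vector) -> Vector:
    """If g_i conflicts with g_j (negative dot product), remove the
    component of g_i along g_j; otherwise return g_i unchanged."""
    d = dot(g_i, g_j)
    if d >= 0.0:
        return list(g_i)
    scale = d / dot(g_j, g_j)  # g_j is nonzero whenever d < 0
    return [x - scale * y for x, y in zip(g_i, g_j)]


def combine(grads: List[Vector]) -> Vector:
    """Illustrative combination rule (an assumption, not the paper's):
    equalize norms, resolve pairwise conflicts, then sum."""
    unit = [normalize(g) for g in grads]
    corrected = []
    for i, g_i in enumerate(unit):
        g = list(g_i)
        for j, g_j in enumerate(unit):
            if i != j:
                g = project_out_conflict(g, g_j)
        corrected.append(g)
    return [sum(col) for col in zip(*corrected)]


if __name__ == "__main__":
    # Two toy task gradients on a shared two-parameter model.
    print(project_out_conflict([1.0, 0.0], [-1.0, 1.0]))
    print(combine([[10.0, 0.0], [0.0, 1.0]]))
```

In this sketch the first gradient loses only the part that points against the second task, so the shared update no longer increases the other task's loss to first order. The norm equalization is the simplest possible answer to dominance; the paper's scheme is described as adaptive, which this stand-in is not.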
About the journal:
The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.