{"title":"视觉变压器稳定域自适应的低秩自适应","authors":"N. Filatov, M. Kindulov","doi":"10.3103/S1060992X2306005X","DOIUrl":null,"url":null,"abstract":"<p>Unsupervised domain adaptation plays a crucial role in semantic segmentation tasks due to the high cost of annotating data. Existing approaches often rely on large transformer models and momentum networks to stabilize and improve the self-training process. In this study, we investigate the applicability of low-rank adaptation (LoRA) to domain adaptation in computer vision. Our focus is on the unsupervised domain adaptation task of semantic segmentation, which requires adapting models from a synthetic dataset (GTA5) to a real-world dataset (City-scapes). We employ the Swin Transformer as the feature extractor and TransDA domain adaptation framework. Through experiments, we demonstrate that LoRA effectively stabilizes the self-training process, achieving similar training dynamics to the exponentially moving average (EMA) mechanism. Moreover, LoRA provides comparable metrics to EMA under the same limited computation budget. In GTA5 → Cityscapes experiments, the adaptation pipeline with LoRA achieves a mIoU of 0.515, slightly surpassing the EMA baseline’s mIoU of 0.513, while also offering an 11% speedup in training time and video memory saving. These re-sults highlight LoRA as a promising approach for domain adaptation in computer vision, offering a viable alternative to momentum networks which also saves computational resources.</p>","PeriodicalId":721,"journal":{"name":"Optical Memory and Neural Networks","volume":"32 2","pages":"S277 - S283"},"PeriodicalIF":1.0000,"publicationDate":"2023-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Low Rank Adaptation for Stable Domain Adaptation of Vision Transformers\",\"authors\":\"N. Filatov, M. Kindulov\",\"doi\":\"10.3103/S1060992X2306005X\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Unsupervised domain adaptation plays a crucial role in semantic segmentation tasks due to the high cost of annotating data. Existing approaches often rely on large transformer models and momentum networks to stabilize and improve the self-training process. In this study, we investigate the applicability of low-rank adaptation (LoRA) to domain adaptation in computer vision. Our focus is on the unsupervised domain adaptation task of semantic segmentation, which requires adapting models from a synthetic dataset (GTA5) to a real-world dataset (City-scapes). We employ the Swin Transformer as the feature extractor and TransDA domain adaptation framework. Through experiments, we demonstrate that LoRA effectively stabilizes the self-training process, achieving similar training dynamics to the exponentially moving average (EMA) mechanism. Moreover, LoRA provides comparable metrics to EMA under the same limited computation budget. In GTA5 → Cityscapes experiments, the adaptation pipeline with LoRA achieves a mIoU of 0.515, slightly surpassing the EMA baseline’s mIoU of 0.513, while also offering an 11% speedup in training time and video memory saving. These re-sults highlight LoRA as a promising approach for domain adaptation in computer vision, offering a viable alternative to momentum networks which also saves computational resources.</p>\",\"PeriodicalId\":721,\"journal\":{\"name\":\"Optical Memory and Neural Networks\",\"volume\":\"32 2\",\"pages\":\"S277 - S283\"},\"PeriodicalIF\":1.0000,\"publicationDate\":\"2023-11-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Optical Memory and Neural Networks\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://link.springer.com/article/10.3103/S1060992X2306005X\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q4\",\"JCRName\":\"OPTICS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Optical Memory and Neural Networks","FirstCategoryId":"1085","ListUrlMain":"https://link.springer.com/article/10.3103/S1060992X2306005X","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"OPTICS","Score":null,"Total":0}
Low Rank Adaptation for Stable Domain Adaptation of Vision Transformers
Unsupervised domain adaptation plays a crucial role in semantic segmentation tasks due to the high cost of annotating data. Existing approaches often rely on large transformer models and momentum networks to stabilize and improve the self-training process. In this study, we investigate the applicability of low-rank adaptation (LoRA) to domain adaptation in computer vision. Our focus is on the unsupervised domain adaptation task of semantic segmentation, which requires adapting models from a synthetic dataset (GTA5) to a real-world dataset (City-scapes). We employ the Swin Transformer as the feature extractor and TransDA domain adaptation framework. Through experiments, we demonstrate that LoRA effectively stabilizes the self-training process, achieving similar training dynamics to the exponentially moving average (EMA) mechanism. Moreover, LoRA provides comparable metrics to EMA under the same limited computation budget. In GTA5 → Cityscapes experiments, the adaptation pipeline with LoRA achieves a mIoU of 0.515, slightly surpassing the EMA baseline’s mIoU of 0.513, while also offering an 11% speedup in training time and video memory saving. These re-sults highlight LoRA as a promising approach for domain adaptation in computer vision, offering a viable alternative to momentum networks which also saves computational resources.
期刊介绍:
The journal covers a wide range of issues in information optics such as optical memory, mechanisms for optical data recording and processing, photosensitive materials, optical, optoelectronic and holographic nanostructures, and many other related topics. Papers on memory systems using holographic and biological structures and concepts of brain operation are also included. The journal pays particular attention to research in the field of neural net systems that may lead to a new generation of computional technologies by endowing them with intelligence.