Assessing the Portability of Parameter Matrices Trained by Parameter-Efficient Finetuning Methods

Findings. Pub Date: 2024-01-25. DOI: 10.48550/arXiv.2401.14228
Mohammed Sabry, Anya Belz
Citations: 0

Abstract

Assessing the Portability of Parameter Matrices Trained by Parameter-Efficient Finetuning Methods
As the cost of training ever larger language models has grown, so has the interest in reusing previously learnt knowledge. Transfer learning methods have shown how reusing non-task-specific knowledge can help in subsequent task-specific learning. In this paper, we investigate the inverse: porting whole functional modules that encode task-specific knowledge from one model to another. We designed a study comprising 1,440 training/testing runs to test the portability of modules trained by parameter-efficient finetuning (PEFT) techniques, using sentiment analysis as an example task. We test portability in a wide range of scenarios, involving different PEFT techniques and different pretrained host models, among other dimensions. We compare the performance of ported modules with that of equivalent modules trained (i) from scratch, and (ii) from parameters sampled from the same distribution as the ported module. We find that the ported modules far outperform the two alternatives tested, but that there are interesting differences between the four PEFT techniques tested. We conclude that task-specific knowledge in the form of structurally modular sets of parameters as produced by PEFT techniques is highly portable, but that degree of success depends on type of PEFT and on differences between originating and receiving pretrained models.
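The three starting points compared in the abstract (a ported module, a module trained from scratch, and a module whose parameters are sampled from the same distribution as the ported one) can be illustrated with a small sketch. This is not the authors' code. It assumes a module is a single parameter matrix, that "from scratch" means a small zero-mean Gaussian initialisation, and that "same distribution" means a Gaussian fitted to the mean and standard deviation of the ported matrix's entries; the abstract does not specify any of these details, and all function names and values below are hypothetical.

```python
import random
import statistics


def init_from_scratch(rows, cols, rng, scale=0.02):
    """Condition (i): fresh random initialisation carrying no task knowledge.
    The zero-mean Gaussian with std `scale` is an assumption."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def init_ported(trained):
    """Ported condition: copy the trained module's parameters unchanged."""
    return [row[:] for row in trained]


def init_sampled_like(trained, rng):
    """Condition (ii): new values drawn from a Gaussian fitted to the
    trained module's entries (an assumed reading of 'same distribution')."""
    flat = [v for row in trained for v in row]
    mu = statistics.fmean(flat)
    sigma = statistics.pstdev(flat)
    return [[rng.gauss(mu, sigma) for _ in row] for row in trained]


rng = random.Random(0)

# Hypothetical stand-in for a module trained on the originating model.
trained = [[rng.gauss(0.5, 0.1) for _ in range(8)] for _ in range(4)]

conditions = {
    "ported": init_ported(trained),
    "scratch": init_from_scratch(4, 8, rng),
    "sampled": init_sampled_like(trained, rng),
}

# Report the mean entry of each starting matrix for a quick comparison.
for name, matrix in conditions.items():
    flat = [v for row in matrix for v in row]
    print(name, round(statistics.fmean(flat), 3))
```

In a study of this kind, each starting matrix would then be placed in the receiving pretrained model and evaluated on the task; the sampled condition serves as a control that matches the ported module's value statistics without its learnt structure.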