A Controlled Study on Long Context Extension and Generalization in LLMs

Yi Lu, Jing Nathan Yan, Songlin Yang, Justin T. Chiu, Siyu Ren, Fei Yuan, Wenting Zhao, Zhiyong Wu, Alexander M. Rush
{"title":"关于语言学习者长语境扩展和泛化的对照研究","authors":"Yi Lu, Jing Nathan Yan, Songlin Yang, Justin T. Chiu, Siyu Ren, Fei Yuan, Wenting Zhao, Zhiyong Wu, Alexander M. Rush","doi":"arxiv-2409.12181","DOIUrl":null,"url":null,"abstract":"Broad textual understanding and in-context learning require language models\nthat utilize full document contexts. Due to the implementation challenges\nassociated with directly training long-context models, many methods have been\nproposed for extending models to handle long contexts. However, owing to\ndifferences in data and model classes, it has been challenging to compare these\napproaches, leading to uncertainty as to how to evaluate long-context\nperformance and whether it differs from standard evaluation. We implement a\ncontrolled protocol for extension methods with a standardized evaluation,\nutilizing consistent base models and extension data. Our study yields several\ninsights into long-context behavior. First, we reaffirm the critical role of\nperplexity as a general-purpose performance indicator even in longer-context\ntasks. Second, we find that current approximate attention methods\nsystematically underperform across long-context tasks. Finally, we confirm that\nexact fine-tuning based methods are generally effective within the range of\ntheir extension, whereas extrapolation remains challenging. All codebases,\nmodels, and checkpoints will be made available open-source, promoting\ntransparency and facilitating further research in this critical area of AI\ndevelopment.","PeriodicalId":501030,"journal":{"name":"arXiv - CS - Computation and Language","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A Controlled Study on Long Context Extension and Generalization in LLMs\",\"authors\":\"Yi Lu, Jing Nathan Yan, Songlin Yang, Justin T. Chiu, Siyu Ren, Fei Yuan, Wenting Zhao, Zhiyong Wu, Alexander M. Rush\",\"doi\":\"arxiv-2409.12181\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Broad textual understanding and in-context learning require language models\\nthat utilize full document contexts. Due to the implementation challenges\\nassociated with directly training long-context models, many methods have been\\nproposed for extending models to handle long contexts. However, owing to\\ndifferences in data and model classes, it has been challenging to compare these\\napproaches, leading to uncertainty as to how to evaluate long-context\\nperformance and whether it differs from standard evaluation. We implement a\\ncontrolled protocol for extension methods with a standardized evaluation,\\nutilizing consistent base models and extension data. Our study yields several\\ninsights into long-context behavior. First, we reaffirm the critical role of\\nperplexity as a general-purpose performance indicator even in longer-context\\ntasks. Second, we find that current approximate attention methods\\nsystematically underperform across long-context tasks. Finally, we confirm that\\nexact fine-tuning based methods are generally effective within the range of\\ntheir extension, whereas extrapolation remains challenging. 
All codebases,\\nmodels, and checkpoints will be made available open-source, promoting\\ntransparency and facilitating further research in this critical area of AI\\ndevelopment.\",\"PeriodicalId\":501030,\"journal\":{\"name\":\"arXiv - CS - Computation and Language\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-09-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - CS - Computation and Language\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2409.12181\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Computation and Language","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.12181","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Broad textual understanding and in-context learning require language models that utilize full document contexts. Due to the implementation challenges associated with directly training long-context models, many methods have been proposed for extending models to handle long contexts. However, owing to differences in data and model classes, it has been challenging to compare these approaches, leading to uncertainty as to how to evaluate long-context performance and whether it differs from standard evaluation. We implement a controlled protocol for extension methods with a standardized evaluation, utilizing consistent base models and extension data. Our study yields several insights into long-context behavior. First, we reaffirm the critical role of perplexity as a general-purpose performance indicator even in longer-context tasks. Second, we find that current approximate attention methods systematically underperform across long-context tasks. Finally, we confirm that exact fine-tuning based methods are generally effective within the range of their extension, whereas extrapolation remains challenging. All codebases, models, and checkpoints will be made available open-source, promoting transparency and facilitating further research in this critical area of AI development.
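The abstract's first findings concern scoring extension methods by perplexity over long inputs. The sketch below is illustrative only, not the paper's released evaluation code: it computes token-weighted sliding-window perplexity for a Hugging Face causal LM, and the model name, window length, and stride are placeholder assumptions.

```python
# Illustrative sketch only (not the paper's released codebase): token-weighted
# sliding-window perplexity over a long document with a Hugging Face causal LM.
# MODEL_NAME, max_len, and stride are placeholder assumptions.
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "your-extended-base-model"  # placeholder, e.g. a long-context fine-tune
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, torch_dtype=torch.bfloat16)
model.eval()

def sliding_window_ppl(text: str, max_len: int = 8192, stride: int = 4096) -> float:
    """Perplexity over one long document, scored in overlapping windows so each
    token is conditioned on up to `max_len` preceding tokens."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    seq_len = ids.size(1)
    nll_sum, n_scored, prev_end = 0.0, 0, 0
    for begin in range(0, seq_len, stride):
        end = min(begin + max_len, seq_len)
        trg_len = end - prev_end                 # tokens newly scored in this window
        input_ids = ids[:, begin:end]
        labels = input_ids.clone()
        labels[:, :-trg_len] = -100              # mask tokens already scored earlier
        with torch.no_grad():
            loss = model(input_ids, labels=labels).loss  # mean NLL over unmasked labels
        nll_sum += loss.item() * trg_len
        n_scored += trg_len
        prev_end = end
        if end == seq_len:
            break
    return math.exp(nll_sum / n_scored)
```

The final finding separates exact fine-tuning based extension, which works within the extended range, from extrapolation beyond it. One widely used exact-attention recipe is linear RoPE position interpolation; the sketch below shows only the core position rescaling, with the head dimension and lengths as placeholder assumptions rather than the configurations evaluated in the paper.

```python
# Illustrative sketch of linear RoPE position interpolation (a common exact
# extension recipe), not necessarily one of the paper's evaluated configurations.
# Positions are compressed by train_len / target_len so the fine-tuned model
# reuses the rotary range seen in pre-training; head_dim and lengths are assumed.
import torch

def rope_angles(positions: torch.Tensor, head_dim: int,
                base: float = 10000.0, scale: float = 1.0) -> torch.Tensor:
    """Rotary angles per (position, frequency); scale < 1 compresses positions."""
    inv_freq = base ** (-torch.arange(0, head_dim, 2).float() / head_dim)
    return torch.outer(positions.float() * scale, inv_freq)  # (seq_len, head_dim // 2)

train_len, target_len = 4096, 32768
angles = rope_angles(torch.arange(target_len), head_dim=128,
                     scale=train_len / target_len)
cos, sin = angles.cos(), angles.sin()  # applied to queries/keys as in standard RoPE
```

Within the target length the rescaled positions stay inside the range seen during fine-tuning; prompts longer than target_len push positions back outside that range, which is the extrapolation regime the abstract flags as still challenging.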