Autoregressive model path dependence near Ising criticality

Yi Hong Teoh, Roger G. Melko
arXiv - PHYS - Disordered Systems and Neural Networks
Published: 2024-08-28
DOI: arxiv-2408.15715 (https://doi.org/arxiv-2408.15715)
Citations: 0

Abstract

Autoregressive models are a class of generative models that probabilistically predict the next output of a sequence based on previous inputs. The autoregressive sequence is by definition one-dimensional (1D), which is natural for language tasks and hence an important component of modern architectures like recurrent neural networks (RNNs) and transformers. However, when language models are used to predict outputs on physical systems that are not intrinsically 1D, the question arises of which choice of autoregressive sequence, if any, is optimal. In this paper, we study the reconstruction of critical correlations in the two-dimensional (2D) Ising model, using RNNs and transformers trained on binary spin data obtained near the thermal phase transition. We compare the training performance for a number of different 1D autoregressive sequences imposed on finite-size 2D lattices. We find that paths with long 1D segments are more efficient at training the autoregressive models than space-filling curves that better preserve the 2D locality. Our results illustrate the potential importance of choosing the optimal autoregressive sequence ordering when training modern language models for tasks in physics.
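To make the idea of imposing a 1D sequence on a 2D lattice concrete, the following is a minimal sketch, not the authors' code. It builds two illustrative orderings of an L x L lattice: a row-by-row "snake" path, taken here as an example of a path with long 1D segments, and a Hilbert curve, taken as an example of a space-filling curve. The abstract does not name the specific paths studied, so both choices, and all function names (`snake_path`, `hilbert_path`, `longest_straight_run`, `flatten`), are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Site = Tuple[int, int]  # (row, column) on an L x L lattice


def snake_path(L: int) -> List[Site]:
    """Sweep each row in turn, reversing direction on odd rows."""
    path: List[Site] = []
    for r in range(L):
        cols = range(L) if r % 2 == 0 else range(L - 1, -1, -1)
        path.extend((r, c) for c in cols)
    return path


def hilbert_path(L: int) -> List[Site]:
    """Hilbert-curve ordering of the lattice; L must be a power of two."""
    if L < 1 or L & (L - 1):
        raise ValueError("L must be a power of two")
    path: List[Site] = []
    for d in range(L * L):
        # Standard index-to-coordinate conversion, built up one scale at a time.
        x = y = 0
        t, s = d, 1
        while s < L:
            rx = 1 & (t // 2)
            ry = 1 & (t ^ rx)
            if ry == 0:  # rotate/reflect the quadrant
                if rx == 1:
                    x, y = s - 1 - x, s - 1 - y
                x, y = y, x
            x += s * rx
            y += s * ry
            t //= 4
            s *= 2
        path.append((x, y))
    return path


def is_contiguous(path: Sequence[Site]) -> bool:
    """True if every consecutive pair of sites are lattice nearest neighbours."""
    return all(
        abs(r1 - r0) + abs(c1 - c0) == 1
        for (r0, c0), (r1, c1) in zip(path, path[1:])
    )


def longest_straight_run(path: Sequence[Site]) -> int:
    """Largest number of consecutive steps taken in the same direction."""
    best = run = 0
    prev = None
    for (r0, c0), (r1, c1) in zip(path, path[1:]):
        step = (r1 - r0, c1 - c0)
        run = run + 1 if step == prev else 1
        best = max(best, run)
        prev = step
    return best


def flatten(config: Sequence[Sequence[int]], path: Sequence[Site]) -> List[int]:
    """Read a 2D spin configuration out as a 1D sequence along the path."""
    return [config[r][c] for r, c in path]


if __name__ == "__main__":
    L = 8
    for name, path in [("snake", snake_path(L)), ("hilbert", hilbert_path(L))]:
        print(name, "contiguous:", is_contiguous(path),
              "longest straight run:", longest_straight_run(path))
```

A sequence produced by `flatten` is what an autoregressive model would consume one spin at a time. The straight-run statistic is only a rough proxy for "long 1D segments"; it is not a metric taken from the paper.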