Autoregressive model path dependence near Ising criticality

arXiv - PHYS - Disordered Systems and Neural Networks, Pub Date: 2024-08-28, DOI: arxiv-2408.15715

Yi Hong Teoh, Roger G. Melko


Citations: 0



Autoregressive model path dependence near Ising criticality

Autoregressive models are a class of generative models that probabilistically predict the next output of a sequence based on previous inputs. The autoregressive sequence is by definition one-dimensional (1D), which is natural for language tasks and hence an important component of modern architectures like recurrent neural networks (RNNs) and transformers. However, when language models are used to predict outputs on physical systems that are not intrinsically 1D, the question arises of which choice of autoregressive sequence, if any, is optimal. In this paper, we study the reconstruction of critical correlations in the two-dimensional (2D) Ising model, using RNNs and transformers trained on binary spin data obtained near the thermal phase transition. We compare the training performance for a number of different 1D autoregressive sequences imposed on finite-size 2D lattices. We find that paths with long 1D segments train the autoregressive models more efficiently than space-filling curves that better preserve the 2D locality. Our results illustrate the potential importance of choosing the optimal autoregressive sequence ordering when training modern language models for tasks in physics.
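To make the idea of imposing a 1D autoregressive sequence on a 2D lattice concrete, the following is a minimal Python sketch of three possible site orderings. The abstract does not name the specific paths that were compared, so the choices here (row-major raster, boustrophedon "snake", and a Hilbert curve) are illustrative assumptions, and all function names are hypothetical. The snake path stands in for a path with long 1D segments, and the Hilbert curve for a space-filling curve that preserves 2D locality.

```python
def raster_path(L):
    """Row-major order: each row is read left to right."""
    return [(r, c) for r in range(L) for c in range(L)]


def snake_path(L):
    """Boustrophedon order: odd rows are reversed, so consecutive
    sites are always nearest neighbors and each row is one long segment."""
    path = []
    for r in range(L):
        cols = range(L) if r % 2 == 0 else range(L - 1, -1, -1)
        path.extend((r, c) for c in cols)
    return path


def hilbert_path(L):
    """Hilbert curve on an L x L lattice (L must be a power of two),
    using the standard index-to-coordinate conversion."""
    if L < 1 or L & (L - 1):
        raise ValueError("L must be a power of two")
    path = []
    for d in range(L * L):
        x = y = 0
        t, s = d, 1
        while s < L:
            rx = 1 & (t // 2)
            ry = 1 & (t ^ rx)
            if ry == 0:  # rotate/reflect the current quadrant
                if rx == 1:
                    x, y = s - 1 - x, s - 1 - y
                x, y = y, x
            x += s * rx
            y += s * ry
            t //= 4
            s *= 2
        path.append((x, y))
    return path


def flatten(spins, path):
    """Read a 2D spin configuration (list of rows) as a 1D sequence."""
    return [spins[r][c] for r, c in path]


def longest_straight_run(path):
    """Largest number of consecutive steps with the same displacement."""
    best = run = 0
    prev = None
    for (r0, c0), (r1, c1) in zip(path, path[1:]):
        step = (r1 - r0, c1 - c0)
        run = run + 1 if step == prev else 1
        best = max(best, run)
        prev = step
    return best


if __name__ == "__main__":
    L = 8
    for name, fn in [("raster", raster_path), ("snake", snake_path),
                     ("hilbert", hilbert_path)]:
        print(name, longest_straight_run(fn(L)))
```

An autoregressive model would then be trained on `flatten(spins, path)` for each sampled configuration, so that the conditional for each spin depends on the spins that precede it along the chosen path. The `longest_straight_run` helper is only a rough proxy for "long 1D segments" and is not a metric taken from the paper.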

