Deep learning Calabi-Yau four folds with hybrid and recurrent neural network architectures

IF 2.8 | CAS Tier 3 (Physics & Astrophysics) | Q2 PHYSICS, PARTICLES & FIELDS | Nuclear Physics B | Pub Date: 2025-04-01 | Epub Date: 2025-02-14 | DOI: 10.1016/j.nuclphysb.2025.116832
Harriet L. Dao
{"title":"深度学习Calabi-Yau四倍混合和循环神经网络架构","authors":"Harriet L. Dao","doi":"10.1016/j.nuclphysb.2025.116832","DOIUrl":null,"url":null,"abstract":"<div><div>In this work, we report the results of applying deep learning based on hybrid convolutional-recurrent and purely recurrent neural network architectures to the dataset of almost one million complete intersection Calabi-Yau four-folds (CICY4) to machine-learn their four Hodge numbers <span><math><msup><mrow><mi>h</mi></mrow><mrow><mn>1</mn><mo>,</mo><mn>1</mn></mrow></msup></math></span>, <span><math><msup><mrow><mi>h</mi></mrow><mrow><mn>2</mn><mo>,</mo><mn>1</mn></mrow></msup></math></span>, <span><math><msup><mrow><mi>h</mi></mrow><mrow><mn>3</mn><mo>,</mo><mn>1</mn></mrow></msup></math></span>, <span><math><msup><mrow><mi>h</mi></mrow><mrow><mn>2</mn><mo>,</mo><mn>2</mn></mrow></msup></math></span>. In particular, we explored and experimented with twelve different neural network models, nine of which are convolutional-recurrent (CNN-RNN) hybrids with the RNN unit being either GRU (Gated Recurrent Unit) or Long Short Term Memory (LSTM). The remaining four models are purely recurrent neural networks based on LSTM. In terms of the <span><math><msup><mrow><mi>h</mi></mrow><mrow><mn>1</mn><mo>,</mo><mn>1</mn></mrow></msup></math></span>, <span><math><msup><mrow><mi>h</mi></mrow><mrow><mn>2</mn><mo>,</mo><mn>1</mn></mrow></msup></math></span>, <span><math><msup><mrow><mi>h</mi></mrow><mrow><mn>3</mn><mo>,</mo><mn>1</mn></mrow></msup></math></span> and <span><math><msup><mrow><mi>h</mi></mrow><mrow><mn>2</mn><mo>,</mo><mn>2</mn></mrow></msup></math></span> prediction accuracies, at 72% training ratio, our best performing individual model is CNN-LSTM-400, a hybrid CNN-LSTM with the LSTM hidden size of 400, which obtained 99.74%, 98.07%, 95.19%, 81.01%, our second best performing individual model is LSTM-448, an LSTM-based model with the hidden size of 448, which obtained 99.74%, 97.51%, 94.24%, and 78.63%. These results were improved by forming ensembles of the top two, three or even four models. Our best ensemble, consisting of the top four models, achieved the accuracies of 99.84%, 98.71%, 96.26%, 85.03%. At 80% training ratio, the top two performing models LSTM-448 and LSTM-424 are both LSTM-based with the hidden sizes of 448 and 424. Compared with the 72% training ratio, there is a significant improvement of accuracies, which reached 99.85%, 98.66%, 96.26%, 84.77% for the best individual model and 99.90%, 99.03%, 97.97%, 87.34% for the best ensemble. By nature a proof of concept, the results of this work conclusively established the utility of RNN-based architectures and demonstrated their effective performances compared to the well-explored purely CNN-based architectures in the problem of deep learning Calabi Yau manifolds.</div></div>","PeriodicalId":54712,"journal":{"name":"Nuclear Physics B","volume":"1013 ","pages":"Article 116832"},"PeriodicalIF":2.8000,"publicationDate":"2025-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Deep learning Calabi-Yau four folds with hybrid and recurrent neural network architectures\",\"authors\":\"Harriet L. 
Dao\",\"doi\":\"10.1016/j.nuclphysb.2025.116832\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>In this work, we report the results of applying deep learning based on hybrid convolutional-recurrent and purely recurrent neural network architectures to the dataset of almost one million complete intersection Calabi-Yau four-folds (CICY4) to machine-learn their four Hodge numbers <span><math><msup><mrow><mi>h</mi></mrow><mrow><mn>1</mn><mo>,</mo><mn>1</mn></mrow></msup></math></span>, <span><math><msup><mrow><mi>h</mi></mrow><mrow><mn>2</mn><mo>,</mo><mn>1</mn></mrow></msup></math></span>, <span><math><msup><mrow><mi>h</mi></mrow><mrow><mn>3</mn><mo>,</mo><mn>1</mn></mrow></msup></math></span>, <span><math><msup><mrow><mi>h</mi></mrow><mrow><mn>2</mn><mo>,</mo><mn>2</mn></mrow></msup></math></span>. In particular, we explored and experimented with twelve different neural network models, nine of which are convolutional-recurrent (CNN-RNN) hybrids with the RNN unit being either GRU (Gated Recurrent Unit) or Long Short Term Memory (LSTM). The remaining four models are purely recurrent neural networks based on LSTM. In terms of the <span><math><msup><mrow><mi>h</mi></mrow><mrow><mn>1</mn><mo>,</mo><mn>1</mn></mrow></msup></math></span>, <span><math><msup><mrow><mi>h</mi></mrow><mrow><mn>2</mn><mo>,</mo><mn>1</mn></mrow></msup></math></span>, <span><math><msup><mrow><mi>h</mi></mrow><mrow><mn>3</mn><mo>,</mo><mn>1</mn></mrow></msup></math></span> and <span><math><msup><mrow><mi>h</mi></mrow><mrow><mn>2</mn><mo>,</mo><mn>2</mn></mrow></msup></math></span> prediction accuracies, at 72% training ratio, our best performing individual model is CNN-LSTM-400, a hybrid CNN-LSTM with the LSTM hidden size of 400, which obtained 99.74%, 98.07%, 95.19%, 81.01%, our second best performing individual model is LSTM-448, an LSTM-based model with the hidden size of 448, which obtained 99.74%, 97.51%, 94.24%, and 78.63%. These results were improved by forming ensembles of the top two, three or even four models. Our best ensemble, consisting of the top four models, achieved the accuracies of 99.84%, 98.71%, 96.26%, 85.03%. At 80% training ratio, the top two performing models LSTM-448 and LSTM-424 are both LSTM-based with the hidden sizes of 448 and 424. Compared with the 72% training ratio, there is a significant improvement of accuracies, which reached 99.85%, 98.66%, 96.26%, 84.77% for the best individual model and 99.90%, 99.03%, 97.97%, 87.34% for the best ensemble. 
By nature a proof of concept, the results of this work conclusively established the utility of RNN-based architectures and demonstrated their effective performances compared to the well-explored purely CNN-based architectures in the problem of deep learning Calabi Yau manifolds.</div></div>\",\"PeriodicalId\":54712,\"journal\":{\"name\":\"Nuclear Physics B\",\"volume\":\"1013 \",\"pages\":\"Article 116832\"},\"PeriodicalIF\":2.8000,\"publicationDate\":\"2025-04-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Nuclear Physics B\",\"FirstCategoryId\":\"101\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0550321325000422\",\"RegionNum\":3,\"RegionCategory\":\"物理与天体物理\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2025/2/14 0:00:00\",\"PubModel\":\"Epub\",\"JCR\":\"Q2\",\"JCRName\":\"PHYSICS, PARTICLES & FIELDS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Nuclear Physics B","FirstCategoryId":"101","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0550321325000422","RegionNum":3,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/2/14 0:00:00","PubModel":"Epub","JCR":"Q2","JCRName":"PHYSICS, PARTICLES & FIELDS","Score":null,"Total":0}
Citations: 0

Abstract

In this work, we report the results of applying deep learning based on hybrid convolutional-recurrent and purely recurrent neural network architectures to the dataset of almost one million complete intersection Calabi-Yau four-folds (CICY4) to machine-learn their four Hodge numbers h^{1,1}, h^{2,1}, h^{3,1}, h^{2,2}. In particular, we explored and experimented with twelve different neural network models, nine of which are convolutional-recurrent (CNN-RNN) hybrids with the RNN unit being either a GRU (Gated Recurrent Unit) or Long Short-Term Memory (LSTM); the remaining three models are purely recurrent neural networks based on LSTM. In terms of the h^{1,1}, h^{2,1}, h^{3,1}, and h^{2,2} prediction accuracies at a 72% training ratio, our best-performing individual model is CNN-LSTM-400, a hybrid CNN-LSTM with an LSTM hidden size of 400, which obtained 99.74%, 98.07%, 95.19%, and 81.01%; our second-best individual model is LSTM-448, an LSTM-based model with a hidden size of 448, which obtained 99.74%, 97.51%, 94.24%, and 78.63%. These results were improved by forming ensembles of the top two, three, or even four models. Our best ensemble, consisting of the top four models, achieved accuracies of 99.84%, 98.71%, 96.26%, and 85.03%. At an 80% training ratio, the two top-performing models, LSTM-448 and LSTM-424, are both LSTM-based, with hidden sizes of 448 and 424. Compared with the 72% training ratio, there is a significant improvement in accuracies, which reached 99.85%, 98.66%, 96.26%, and 84.77% for the best individual model and 99.90%, 99.03%, 97.97%, and 87.34% for the best ensemble. By nature a proof of concept, this work conclusively establishes the utility of RNN-based architectures and demonstrates their effective performance, compared to the well-explored purely CNN-based architectures, on the problem of deep learning Calabi-Yau manifolds.
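The abstract names the model families but, being a summary, gives no implementation details. For orientation, below is a minimal PyTorch sketch of what a CNN-LSTM hybrid of this kind could look like. The input shape (a zero-padded CICY4 configuration matrix), the convolutional channel counts, and the regression-style output head are illustrative assumptions; only the LSTM hidden size of 400 is taken from the model name CNN-LSTM-400.

```python
# Hypothetical sketch of a CNN-LSTM hybrid predicting the four Hodge numbers
# (h^{1,1}, h^{2,1}, h^{3,1}, h^{2,2}) of a CICY4 from its configuration
# matrix. All shapes and layer sizes except the LSTM hidden size (400, as in
# CNN-LSTM-400) are assumptions for illustration.
import torch
import torch.nn as nn

class CNNLSTMHybrid(nn.Module):
    def __init__(self, max_rows=16, max_cols=20, hidden_size=400):
        super().__init__()
        # Convolutional front end: treat the zero-padded configuration
        # matrix as a one-channel image and extract local features.
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        # Recurrent back end: read the feature map row by row as a sequence.
        self.lstm = nn.LSTM(input_size=64 * max_cols,
                            hidden_size=hidden_size,
                            batch_first=True)
        # One real-valued output per Hodge number.
        self.head = nn.Linear(hidden_size, 4)

    def forward(self, x):
        # x: (batch, 1, max_rows, max_cols) padded configuration matrices
        feats = self.cnn(x)                       # (batch, 64, rows, cols)
        b, c, r, w = feats.shape
        seq = feats.permute(0, 2, 1, 3).reshape(b, r, c * w)
        _, (h_n, _) = self.lstm(seq)              # final hidden state
        return self.head(h_n[-1])                 # (batch, 4)

model = CNNLSTMHybrid()
dummy = torch.zeros(8, 1, 16, 20)  # batch of 8 padded matrices
print(model(dummy).shape)          # torch.Size([8, 4])
```

Predicting each Hodge number could equally be framed as classification over its observed range; the regression-plus-rounding setup sketched here is just one convention seen in the CICY machine-learning literature.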
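Likewise, the abstract does not state how the top-model ensembles combine their members. A simple and common choice, assumed here purely for illustration, is to average the members' real-valued predictions, round to the nearest integers, and score exact-match accuracy separately for each Hodge number:

```python
# Hypothetical ensembling by prediction averaging; the paper's actual
# combination rule is not given in the abstract.
import torch

@torch.no_grad()
def ensemble_predict(models, x):
    preds = torch.stack([m(x) for m in models])  # (n_models, batch, 4)
    return preds.mean(dim=0).round().long()      # integer Hodge numbers

def accuracy_per_hodge(pred, target):
    # Fraction of exact matches, one score each for h^{1,1}, h^{2,1},
    # h^{3,1}, h^{2,2}, mirroring the four accuracies quoted above.
    return (pred == target).float().mean(dim=0)
```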
Source journal
Nuclear Physics B (Physics: Particles & Fields)
CiteScore: 5.50
Self-citation rate: 7.10%
Annual articles: 302
Review time: 1 month
About the journal: Nuclear Physics B focuses on the domain of high energy physics, quantum field theory, statistical systems, and mathematical physics, and includes four main sections: high energy physics - phenomenology, high energy physics - theory, high energy physics - experiment, and quantum field theory, statistical systems, and mathematical physics. The emphasis is on original research papers (Frontiers Articles or Full Length Articles), but Review Articles are also welcome.
Latest articles in this journal
A comparative analysis of Bayesian and deep learning methods for constraining Kaniadakis holographic dark energy
Rephasing invariant structure of CP phase in Fritzsch–Xing parametrization for simplified mixing matrices
Phase space analysis of Barrow holographic dark energy
Analysis of the hidden-charm pentaquark candidates in the J/ψΞ mass spectrum via the QCD sum rules
Wave optical analysis of traversable wormholes with topological features