Konstantin Köster, Tobias Binninger, Payam Kaghazchi
Most novel energy materials contain multiple elements occupying a single site in their lattice. The exceedingly large configurational space of these materials makes it challenging to determine their ground-state structures. Coulomb energies of possible configurations generally correlate satisfactorily with energies computed at higher levels of theory and thus allow screening for minimum-energy structures. Employing a second-order cluster expansion, we obtain an efficient Coulomb energy optimizer using Monte Carlo and Genetic Algorithms. The presented optimization package, GOAC (Global Optimization of Atomistic Configurations by Coulomb), can achieve a speedup of several orders of magnitude compared to existing software. Our code is able to find low-energy configurations of complex systems involving up to $10^{920}$ structural configurations. The GOAC package thus provides an efficient method for constructing ground-state atomistic models for multi-element materials with gigantic configurational spaces.
"Optimization of Coulomb Energies in Gigantic Configurational Spaces of Multi-Element Ionic Crystals." Konstantin Köster, Tobias Binninger, Payam Kaghazchi. arXiv - PHYS - Computational Physics, 2024-09-13. https://doi.org/arxiv-2409.08808
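The idea of optimizing a pairwise (second-order) Coulomb energy by Monte Carlo site swaps can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not GOAC's actual interface: the function names `coulomb_matrix`, `energy`, `swap_delta` and `monte_carlo` are hypothetical, the interaction matrix uses bare $1/r$ on a finite cluster instead of a periodic Ewald sum, and units are arbitrary. The point it shows is that, once the pair matrix is precomputed, the energy change of a swap costs O(1) given the cached field `M @ q`, and updating that field costs O(N).

```python
import numpy as np

def coulomb_matrix(positions):
    """Toy pair matrix M[i, j] = 1/r_ij with zero diagonal (finite cluster, no Ewald sum)."""
    diff = positions[:, None, :] - positions[None, :, :]
    r = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(r, np.inf)  # 1/inf = 0 removes the self-interaction
    return 1.0 / r

def energy(M, q):
    """Pairwise energy E = 1/2 * q^T M q."""
    return 0.5 * q @ M @ q

def swap_delta(M, q, field, i, j):
    """Energy change when the charges on sites i and j are exchanged; field = M @ q."""
    d = q[j] - q[i]
    return d * (field[i] - field[j]) + 0.5 * d**2 * (M[i, i] + M[j, j] - 2.0 * M[i, j])

def monte_carlo(M, q0, n_steps, temperature, seed=0):
    """Metropolis swaps at fixed composition; returns the lowest-energy assignment seen."""
    rng = np.random.default_rng(seed)
    q = q0.copy()
    field = M @ q
    e = energy(M, q)
    best_q, best_e = q.copy(), e
    for _ in range(n_steps):
        i, j = rng.choice(len(q), size=2, replace=False)
        if q[i] == q[j]:
            continue  # swapping identical charges changes nothing
        dE = swap_delta(M, q, field, i, j)
        if dE <= 0 or rng.random() < np.exp(-dE / temperature):
            d = q[j] - q[i]
            field += d * (M[:, i] - M[:, j])  # keep the cached field consistent
            q[i], q[j] = q[j], q[i]
            e += dE
            if e < best_e:
                best_q, best_e = q.copy(), e
    return best_q, best_e

# Hypothetical usage: eight sites on a line, four +1 and four -1 charges.
positions = np.arange(8, dtype=float)[:, None]
M = coulomb_matrix(positions)
q_start = np.array([1.0] * 4 + [-1.0] * 4)
q_best, e_best = monte_carlo(M, q_start, n_steps=2000, temperature=0.1)
print(energy(M, q_start), e_best, q_best)
```

Caching the field is the design choice that matters: recomputing the full quadratic form after every trial swap would cost O(N^2) per step.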
Although impurities are unavoidable in real-world and experimental systems, most numerical studies on nucleation focus on pure (impurity-free) systems. As a result, the role of impurities in phase transitions remains poorly understood, especially for systems with complex free energy landscapes featuring one or more metastable intermediate phases. In this study, we employed Monte Carlo simulations to investigate the effects of static impurities (quenched disorder) of varying length scales and surface morphologies on the nucleation mechanism and kinetics in the Gaussian Core Model (GCM) system, a model for soft colloidal systems. We first explored how the nucleation free energy barrier and critical cluster size are influenced by the fraction of pinned particles ($f_{\rm p}$) and the pinned cluster size ($n_{\rm p}$). Both the nucleation free energy barrier and critical cluster size increase sharply with increasing $f_{\rm p}$ but decrease as $n_{\rm p}$ grows, eventually approaching the homogeneous nucleation limit. On examining the impact of surface morphology on nucleation kinetics, we observed that the nucleation barrier decreases significantly as the size of the spherical pinned cluster (referred to as "seed") increases for face-centred cubic (FCC), body-centred cubic (BCC), and simple cubic (SC) structures, with BCC showing the greatest facilitation. Interestingly, seeds with random surface roughness had little effect on nucleation kinetics. Additionally, the polymorphic identity of particles in the final crystalline phase is influenced by both seed surface morphology and system size. This study further provides crucial insights into the intricate relationship between substrate-induced local structural fluctuations and the selection of the polymorphic identity in the final crystalline phase, which is essential for understanding and controlling crystallization processes in experiments.
"Effects of quenched disorder on the kinetics and pathways of phase transition in a soft colloidal system." Gadha Ramesh, Mantu Santra, Rakesh S. Singh. arXiv - PHYS - Computational Physics, 2024-09-13. https://doi.org/arxiv-2409.08679
Maximilian Mörchen, Guang Hao Low, Thomas Weymuth, Hongbin Liu, Matthias Troyer, Markus Reiher
Quantum computation for chemical problems will require the construction of guiding states with sufficient overlap with a target state. Since easily available and initializable mean-field states are characterized by an overlap that is reduced for multi-configurational electronic structures and even vanishes with growing system size, we here investigate the severity of state preparation for reaction chemistry. We emphasize weaknesses in current traditional approaches (even for weakly correlated molecules) and highlight the advantage of quantum phase estimation algorithms. An important result is the introduction of a new classification scheme for electronic structures based on orbital entanglement information. We identify two categories of multi-configurational molecules. Whereas class-1 molecules are dominated by very few determinants and often found in reaction chemistry, class-2 molecules do not allow one to single out a reasonably sized number of important determinants. The latter are particularly hard for traditional approaches and an ultimate target for quantum computation. Some open-shell iron-sulfur clusters belong to class 2. We discuss the role of the molecular orbital basis set and show that true class-2 molecules remain in this class independent of the choice of the orbital basis, with the iron-molybdenum cofactor of nitrogenase being a prototypical example. We stress that class-2 molecules can be built in a systematic fashion from open-shell centers or unsaturated carbon atoms. Our key result is that it will always be possible to initialize a guiding state for reaction chemistry in the ground state based on initial low-cost approximate electronic structure information, which is facilitated by the finite size of the atomistic structures to be considered.
"Classification of electronic structures and state preparation for quantum computation of reaction chemistry." Maximilian Mörchen, Guang Hao Low, Thomas Weymuth, Hongbin Liu, Matthias Troyer, Markus Reiher. arXiv - PHYS - Computational Physics, 2024-09-13. https://doi.org/arxiv-2409.08910
Mathematically modelling diffusive and advective transport of particles in heterogeneous layered media is important to many applications in computational, biological and medical physics. While deterministic continuum models of such transport processes are well established, they fail to account for randomness inherent in many problems and are valid only for a large number of particles. To address this, this paper derives a suite of equivalent stochastic (discrete-time discrete-space random walk) models for several standard continuum (partial differential equation) models of diffusion and advection-diffusion across a fully- or semi-permeable interface. Our approach involves discretising the continuum model in space and time to yield a Markov chain, which governs the transition probabilities between spatial lattice sites during each time step. Discretisation in space is carried out using a standard finite volume method while two options are considered for discretisation in time. A simple forward Euler discretisation yields a stochastic model taking the form of a local (nearest-neighbour) random walk with simple analytical expressions for the transition probabilities while an exact exponential discretisation yields a non-local random walk with transition probabilities defined numerically via a matrix exponential. Constraints on the size of the spatial and/or temporal steps are provided for each option to ensure the transition probabilities are non-negative. MATLAB code comparing the stochastic and continuum models is available on GitHub (https://github.com/elliotcarr/Carr2024c) with simulation results demonstrating good agreement for several example problems.
"Stochastic models of advection-diffusion in layered media." Elliot J. Carr. arXiv - PHYS - Computational Physics, 2024-09-13. https://doi.org/arxiv-2409.08447
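The link between a forward Euler discretisation and a nearest-neighbour random walk can be sketched for the simplest case. The sketch below assumes pure diffusion with a constant diffusivity `D` on a uniform cell-centred lattice with zero-flux boundaries; it does not cover the layered media, advection or interface conditions treated in the paper, and the function names `transition_matrix` and `simulate` are hypothetical (this is Python, not the author's MATLAB code). Under these assumptions the finite volume and forward Euler update gives a jump probability $p = D\tau/h^2$ to each neighbour, so non-negativity of the probability of staying, $1 - 2p$, requires $\tau \le h^2/(2D)$.

```python
import numpy as np

def transition_matrix(n_sites, D, h, tau):
    """Row-stochastic matrix P[i, j] = probability of moving from site i to site j in one step."""
    p = D * tau / h**2
    if p > 0.5:
        raise ValueError("need tau <= h**2 / (2*D) so that all probabilities are non-negative")
    P = np.zeros((n_sites, n_sites))
    for i in range(n_sites):
        if i > 0:
            P[i, i - 1] = p          # jump left
        if i < n_sites - 1:
            P[i, i + 1] = p          # jump right
        P[i, i] = 1.0 - P[i].sum()   # stay put (absorbs the missing jump at a boundary)
    return P

def simulate(P, start, n_steps, n_particles, seed=0):
    """Move independent particles according to the local walk encoded in P."""
    rng = np.random.default_rng(seed)
    p_left = np.concatenate(([0.0], np.diag(P, -1)))   # p_left[i] = P[i, i-1]
    p_right = np.concatenate((np.diag(P, 1), [0.0]))   # p_right[i] = P[i, i+1]
    pos = np.full(n_particles, start)
    for _ in range(n_steps):
        u = rng.random(n_particles)
        step = np.where(u < p_left[pos], -1,
                        np.where(u < p_left[pos] + p_right[pos], 1, 0))
        pos = pos + step
    return pos

# Hypothetical usage: compare particle counts with the deterministic Markov-chain density.
P = transition_matrix(n_sites=21, D=1.0, h=1.0, tau=0.25)
final = simulate(P, start=10, n_steps=100, n_particles=5000)
density = np.linalg.matrix_power(P.T, 100)[:, 10]  # continuum-model analogue on the lattice
counts = np.bincount(final, minlength=21) / 5000
print(np.abs(counts - density).max())
```

With many particles the normalised counts should approach the deterministic density, which is the kind of agreement between stochastic and continuum models that the abstract describes.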
We investigate the aging properties of phase-separation kinetics following quenches from $T=\infty$ to a finite temperature below $T_c$ of the paradigmatic two-dimensional conserved Ising model with power-law decaying long-range interactions $\sim r^{-(2 + \sigma)}$. Physical aging with a power-law decay of the two-time autocorrelation function $C(t,t_w)\sim \left(t/t_w\right)^{-\lambda/z}$ is observed, displaying a complex dependence of the autocorrelation exponent $\lambda$ on $\sigma$. A value of $\lambda=3.500(26)$ for the corresponding nearest-neighbor model (which is recovered in the limit $\sigma \rightarrow \infty$) is determined. The values of $\lambda$ in the long-range regime ($\sigma < 1$) are all compatible with $\lambda \approx 4$. In between, a continuous crossover is visible for $1 \lesssim \sigma \lesssim 2$ with non-universal, $\sigma$-dependent values of $\lambda$. The performed Metropolis Monte Carlo simulations are primarily enabled by our novel algorithm for long-range interacting systems.
"Non-universality of aging during phase separation of the two-dimensional long-range Ising model." Fabio Müller, Henrik Christiansen, Wolfhard Janke. arXiv - PHYS - Computational Physics, 2024-09-12. https://doi.org/arxiv-2409.08050
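A power-law form $C(t,t_w)\sim (t/t_w)^{-\lambda/z}$ is commonly checked by fitting a straight line in log-log space. The following minimal sketch assumes that approach and uses synthetic data with an arbitrary exponent of 1.2; the helper name `fit_aging_exponent` is hypothetical, and neither the data nor the exponent come from the paper.

```python
import numpy as np

def fit_aging_exponent(ratio, corr):
    """Estimate lambda/z as minus the slope of log C versus log(t/t_w)."""
    slope, _intercept = np.polyfit(np.log(ratio), np.log(corr), 1)
    return -slope

# Synthetic, noise-free autocorrelation data with an arbitrary exponent (illustration only).
ratio = np.logspace(0.5, 2.0, 20)   # values of t / t_w
corr = 0.8 * ratio ** (-1.2)        # C(t, t_w) with lambda/z = 1.2 by construction
print(fit_aging_exponent(ratio, corr))
```

With real simulation data the fit would be restricted to the asymptotic regime $t \gg t_w$, and $\lambda$ follows only once the dynamical exponent $z$ is known.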
The discovery of new superconducting materials, particularly those exhibiting high critical temperature ($T_c$), has been a vibrant area of study within the field of condensed matter physics. Conventional approaches primarily rely on physical intuition to search for potential superconductors within the existing databases. However, the known materials only scratch the surface of the extensive array of possibilities within the realm of materials. Here, we develop an AI search engine that integrates deep model pre-training and fine-tuning techniques, diffusion models, and physics-based approaches (e.g., first-principles electronic structure calculation) for the discovery of high-$T_c$ superconductors. Utilizing this AI search engine, we have obtained 74 dynamically stable materials with critical temperatures predicted by the AI model to be $T_c \geq 15$ K based on a very small set of samples. Notably, these materials are not contained in any existing dataset. Furthermore, we analyze trends in our dataset and individual materials including B$_4$CN$_3$ and B$_5$CN$_2$, whose $T_c$ values are 24.08 K and 15.93 K, respectively. We demonstrate that AI techniques can discover a set of new high-$T_c$ superconductors and outline their potential for accelerating the discovery of materials with targeted properties.
"AI-accelerated discovery of high critical temperature superconductors." Xiao-Qi Han, Zhenfeng Ouyang, Peng-Jie Guo, Hao Sun, Ze-Feng Gao, Zhong-Yi Lu. arXiv - PHYS - Computational Physics, 2024-09-12. https://doi.org/arxiv-2409.08065
Bo Liang, Hong Guo, Tianyu Zhao, He wang, Herik Evangelinelis, Yuxiang Xu, Chang liu, Manjia Liang, Xiaotong Wei, Yong Yuan, Peng Xu, Minghui Du, Wei-Liang Qian, Ziren Luo
Extreme-mass-ratio inspiral (EMRI) signals pose significant challenges in gravitational wave (GW) astronomy owing to their low-frequency nature and highly complex waveforms, which occupy a high-dimensional parameter space with numerous variables. Given their extended inspiral timescales and low signal-to-noise ratios, EMRI signals warrant prolonged observation periods. Parameter estimation becomes particularly challenging due to non-local parameter degeneracies, arising from multiple local maxima, as well as flat regions and ridges inherent in the likelihood function. These factors lead to exceptionally high time complexity for parameter analysis while employing traditional matched filtering and random sampling methods. To address these challenges, the present study applies machine learning to Bayesian posterior estimation of EMRI signals, leveraging the recently developed flow matching technique based on ODE neural networks. Our approach demonstrates computational efficiency several orders of magnitude faster than the traditional Markov Chain Monte Carlo (MCMC) methods, while preserving the unbiasedness of parameter estimation. We show that machine learning technology has the potential to efficiently handle the vast parameter space, involving up to seventeen parameters, associated with EMRI signals. Furthermore, to our knowledge, this is the first instance of applying machine learning, specifically the Continuous Normalizing Flows (CNFs), to EMRI signal analysis. Our findings highlight the promising potential of machine learning in EMRI waveform analysis, offering new perspectives for the advancement of space-based GW detection and GW astronomy.
"Rapid Parameter Estimation for Extreme Mass Ratio Inspirals Using Machine Learning." Bo Liang, Hong Guo, Tianyu Zhao, He wang, Herik Evangelinelis, Yuxiang Xu, Chang liu, Manjia Liang, Xiaotong Wei, Yong Yuan, Peng Xu, Minghui Du, Wei-Liang Qian, Ziren Luo. arXiv - PHYS - Computational Physics, 2024-09-12. https://doi.org/arxiv-2409.07957
Max Zhu, Jian Yao, Marcus Mynatt, Hubert Pugzlys, Shuyi Li, Sergio Bacallado, Qingyuan Zhao, Chunjing Jia
We introduce a Bayesian active learning algorithm that efficiently elucidates phase diagrams. Using a novel acquisition function that assesses both the impact and likelihood of the next observation, the algorithm iteratively determines the most informative next experiment to conduct and rapidly discerns phase diagrams with multiple phases. Comparative studies against existing methods highlight the superior efficiency of our approach. We demonstrate the algorithm's practical application through the successful identification of the entire phase diagram of a spin Hamiltonian with antisymmetric interaction on a honeycomb lattice, using significantly fewer sample points than traditional grid search methods and a previous method based on support vector machines. Our algorithm identifies the phase diagram consisting of skyrmion, spiral and polarized phases with an error of less than 5% using only 8% of the total possible sample points, in both two-dimensional and three-dimensional phase spaces. Additionally, our method proves highly efficient in constructing three-dimensional phase diagrams, significantly reducing computational and experimental costs. Our methodological contributions extend to higher-dimensional phase diagrams with multiple phases, emphasizing the algorithm's effectiveness and versatility in handling complex, multi-phase systems in various dimensions.
"Active Learning for Discovering Complex Phase Diagrams with Gaussian Processes." Max Zhu, Jian Yao, Marcus Mynatt, Hubert Pugzlys, Shuyi Li, Sergio Bacallado, Qingyuan Zhao, Chunjing Jia. arXiv - PHYS - Computational Physics, 2024-09-11. https://doi.org/arxiv-2409.07042
Jason B. Gibson, Tesia D. Janicki, Ajinkya C. Hire, Chris Bishop, J. Matthew D. Lane, Richard G. Hennig
Machine-learned interatomic potentials (MLIPs) are becoming an essential tool in materials modeling. However, optimizing the generation of training data used to parameterize the MLIPs remains a significant challenge. This is because MLIPs can fail when encountering local environments too different from those present in the training data. The difficulty of determining a priori the environments that will be encountered during molecular dynamics (MD) simulation necessitates diverse, high-quality training data. This study investigates how training data diversity affects the performance of MLIPs using the Ultra-Fast Force Field (UF$^3$) to model amorphous silicon nitride. We employ expert and autonomously generated data to create the training data and fit four force-field variants to subsets of the data. Our findings reveal a critical balance in training data diversity: insufficient diversity hinders generalization, while excessive diversity can exceed the MLIP's learning capacity, reducing simulation accuracy. Specifically, we found that the UF$^3$ variant trained on a subset of the training data, in which nitrogen-rich structures were removed, offered vastly better prediction and simulation accuracy than any other variant. By comparing these UF$^3$ variants, we highlight the nuanced requirements for creating accurate MLIPs, emphasizing the importance of application-specific training data to achieve optimal performance in modeling complex material behaviors.
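The trade-off described in this abstract, where data that is too diverse can exceed a model's capacity, can be illustrated with a deliberately simple analogy. The sketch below is not UF$^3$ and uses no atomistic data: it assumes a two-parameter straight-line model (a stand-in for a limited-capacity potential) fitted to the hypothetical target function x², once on a focused range and once on a much wider range, and compares the error on the focused range only. All names in it are illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(params, xs, ys):
    """Root-mean-square error of the fitted line on the points (xs, ys)."""
    slope, intercept = params
    squared = [(slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)]
    return (sum(squared) / len(xs)) ** 0.5


def target(x):
    """Hypothetical reference function standing in for the true energy surface."""
    return x * x


# "Application" region: the only range that matters at simulation time.
focused_x = [i / 50 for i in range(51)]        # 0.0 .. 1.0
# "Diverse" training set: covers the application region and far beyond it.
diverse_x = [6 * i / 300 for i in range(301)]  # 0.0 .. 6.0

focused_fit = fit_line(focused_x, [target(x) for x in focused_x])
diverse_fit = fit_line(diverse_x, [target(x) for x in diverse_x])

# Both models are evaluated on the application region only.
test_y = [target(x) for x in focused_x]
rmse_focused = rmse(focused_fit, focused_x, test_y)
rmse_diverse = rmse(diverse_fit, focused_x, test_y)

print(f"trained on focused data: RMSE = {rmse_focused:.3f}")
print(f"trained on diverse data: RMSE = {rmse_diverse:.3f}")
```

Because the line has too few parameters to describe the curve over the wide range, the model trained on the broader data is less accurate where it is actually used. This is only an analogy for the capacity argument, and it says nothing about the magnitude of the effect reported for silicon nitride.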
{"title":"When More Data Hurts: Optimizing Data Coverage While Mitigating Diversity Induced Underfitting in an Ultra-Fast Machine-Learned Potential","authors":"Jason B. Gibson, Tesia D. Janicki, Ajinkya C. Hire, Chris Bishop, J. Matthew D. Lane, Richard G. Hennig","doi":"arxiv-2409.07610","DOIUrl":"https://doi.org/arxiv-2409.07610","url":null,"abstract":"Machine-learned interatomic potentials (MLIPs) are becoming an essential tool in materials modeling. However, optimizing the generation of training data used to parameterize the MLIPs remains a significant challenge. This is because MLIPs can fail when encountering local environments too different from those present in the training data. The difficulty of determining a priori the environments that will be encountered during molecular dynamics (MD) simulation necessitates diverse, high-quality training data. This study investigates how training data diversity affects the performance of MLIPs using the Ultra-Fast Force Field (UF$^3$) to model amorphous silicon nitride. We employ expert and autonomously generated data to create the training data and fit four force-field variants to subsets of the data. Our findings reveal a critical balance in training data diversity: insufficient diversity hinders generalization, while excessive diversity can exceed the MLIP's learning capacity, reducing simulation accuracy. Specifically, we found that the UF$^3$ variant trained on a subset of the training data, in which nitrogen-rich structures were removed, offered vastly better prediction and simulation accuracy than any other variant. By comparing these UF$^3$ variants, we highlight the nuanced requirements for creating accurate MLIPs, emphasizing the importance of application-specific training data to achieve optimal performance in modeling complex material behaviors.","PeriodicalId":501369,"journal":{"name":"arXiv - PHYS - Computational Physics","volume":"59 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142204037","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Enhanced sampling simulations make the computational study of rare events feasible. A large family of such methods crucially depends on the definition of some collective variables (CVs) that could provide a low-dimensional representation of the relevant physics of the process. Recently, many methods have been proposed to semi-automate the CV design by using machine learning tools to learn the variables directly from the simulation data. However, most methods are based on feed-forward neural networks and require as input some user-defined physical descriptors. Here, we propose to bypass this step using a graph neural network to directly use the atomic coordinates as input for the CV model. This way, we achieve a fully automatic approach to CV determination that provides variables invariant under the relevant symmetries, especially the permutational one. Furthermore, we provide different analysis tools to aid the physical interpretation of the final CV. We demonstrate the robustness of our approach using different methods from the literature for the optimization of the CV, and we demonstrate its efficacy on several systems, including a small peptide, an ion dissociation in explicit solvent, and a simple chemical reaction.
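The permutational invariance mentioned in this abstract can be made concrete with a small sketch. The code below is not the authors' model and contains no learned parameters: it assumes a hypothetical, hand-written function `toy_graph_cv` that mimics the structure of a single message-passing step (sum over neighbors within a cutoff) followed by a sum-pooled readout, for atoms of a single species. Because both aggregation steps are sums, relabeling the atoms leaves the value unchanged up to floating-point rounding.

```python
import math
import random


def toy_graph_cv(coords, cutoff=3.0):
    """Toy scalar CV computed directly from Cartesian coordinates.

    Edges connect atoms closer than `cutoff`; each node sums fixed
    (unlearned) messages exp(-d) from its neighbors, and the readout
    sums tanh of the node features over all nodes.
    """
    node_features = []
    for i, ri in enumerate(coords):
        h = 0.0
        for j, rj in enumerate(coords):
            if i == j:
                continue
            d = math.dist(ri, rj)
            if d < cutoff:
                h += math.exp(-d)  # message from neighbor j to node i
        node_features.append(h)
    # Sum pooling over nodes: independent of the order of the atoms.
    return sum(math.tanh(h) for h in node_features)


random.seed(0)
# Eight atoms of one species at random positions in a 5 x 5 x 5 box (illustrative data).
coords = [tuple(random.uniform(0.0, 5.0) for _ in range(3)) for _ in range(8)]

# Relabel the atoms by shuffling the coordinate list.
shuffled = coords[:]
random.shuffle(shuffled)

cv_original = toy_graph_cv(coords)
cv_shuffled = toy_graph_cv(shuffled)
print(math.isclose(cv_original, cv_shuffled, rel_tol=1e-9))
```

Since the function depends on the coordinates only through interatomic distances, it is also unchanged by rigid translations and rotations. A real graph neural network would replace the fixed `exp` and `tanh` functions with trainable layers and would treat different chemical species separately.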
{"title":"Descriptors-free Collective Variables From Geometric Graph Neural Networks","authors":"Jintu Zhang, Luigi Bonati, Enrico Trizio, Odin Zhang, Yu Kang, TingJun Hou, Michele Parrinello","doi":"arxiv-2409.07339","DOIUrl":"https://doi.org/arxiv-2409.07339","url":null,"abstract":"Enhanced sampling simulations make the computational study of rare events feasible. A large family of such methods crucially depends on the definition of some collective variables (CVs) that could provide a low-dimensional representation of the relevant physics of the process. Recently, many methods have been proposed to semi-automatize the CV design by using machine learning tools to learn the variables directly from the simulation data. However, most methods are based on feed-forward neural networks and require as input some user-defined physical descriptors. Here, we propose to bypass this step using a graph neural network to directly use the atomic coordinates as input for the CV model. This way, we achieve a fully automatic approach to CV determination that provides variables invariant under the relevant symmetries, especially the permutational one. Furthermore, we provide different analysis tools to favor the physical interpretation of the final CV. We prove the robustness of our approach using different methods from the literature for the optimization of the CV, and we prove its efficacy on several systems, including a small peptide, an ion dissociation in explicit solvent, and a simple chemical reaction.","PeriodicalId":501369,"journal":{"name":"arXiv - PHYS - Computational Physics","volume":"47 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142204038","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}