GANs and Closures: Micro-Macro Consistency in Multiscale Modeling

IF 1.9; Zone 4 (Mathematics); JCR Q2 (Mathematics, Interdisciplinary Applications); Multiscale Modeling & Simulation; Pub Date: 2023-08-14; DOI: 10.1137/22m1517834

Ellis R. Crabtree, Juan M. Bello-Rivas, Andrew L. Ferguson, Ioannis G. Kevrekidis



Abstract

Sampling the phase space of molecular systems—and, more generally, of complex systems effectively modeled by stochastic differential equations (SDEs)—is a crucial modeling step in many fields, from protein folding to materials discovery. These problems are often multiscale in nature: they can be described in terms of low-dimensional effective free energy surfaces parametrized by a small number of “slow” reaction coordinates; the remaining “fast” degrees of freedom populate an equilibrium measure conditioned on the reaction coordinate values. Sampling procedures for such problems are used to estimate effective free energy differences as well as ensemble averages with respect to the conditional equilibrium distributions; these latter averages lead to closures for effective reduced dynamic models. Over the years, enhanced sampling techniques coupled with molecular simulation have been developed; they often use knowledge of the system order parameters in order to sample the corresponding conditional equilibrium distributions, and estimate ensemble averages of observables. An intriguing analogy arises with the field of machine learning (ML), where generative adversarial networks (GANs) can produce high-dimensional samples from low-dimensional probability distributions. This sample generation is what in equation-free multiscale modeling is called a “lifting process”: it returns plausible (or realistic) high-dimensional space realizations of a model state, from information about its low-dimensional representation. In this work, we elaborate on this analogy, and we present an approach that couples physics-based simulations and biasing methods for sampling conditional distributions with ML-based conditional generative adversarial networks (cGANs) for the same task. The “coarse descriptors” on which we condition the fine scale realizations can either be known a priori or learned through nonlinear dimensionality reduction (here, using diffusion maps). 
We suggest that this may bring out the best features of both approaches: we demonstrate that a framework that couples cGANs with physics-based enhanced sampling techniques can improve multiscale SDE dynamical systems sampling, and even shows promise for systems of increasing complexity (here, simple molecules).
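
As a rough illustration of the "lifting" and closure ideas described in the abstract, the sketch below uses a toy two-scale system that is purely an assumption here and is not taken from the paper: a slow coordinate `x` is held fixed while a fast variable `y` relaxes under an Ornstein-Uhlenbeck SDE whose conditional equilibrium distribution is Gaussian with mean `x` and variance `1/beta`. The hypothetical function `lift` draws fast-scale samples consistent with a given coarse value by plain Euler-Maruyama integration (standing in for either a physics-based sampler or a trained cGAN), and the hypothetical function `closure` averages an observable over those samples. The names `lift`, `closure`, `beta`, and `eps` are illustrative choices.

```python
import numpy as np


def lift(x, n_samples, beta=1.0, eps=0.01, dt=1e-4, n_steps=500, rng=None):
    """Sample the fast variable y conditioned on a frozen slow coordinate x.

    Toy model (an assumption, not the paper's system):
        dy = -(y - x) / eps dt + sqrt(2 / (beta * eps)) dW
    whose equilibrium distribution given x is N(x, 1 / beta).
    """
    rng = np.random.default_rng() if rng is None else rng
    y = np.full(n_samples, float(x))  # start every replica at the conditional mean
    noise_scale = np.sqrt(2.0 * dt / (beta * eps))
    for _ in range(n_steps):
        # Euler-Maruyama step: drift toward x plus Gaussian noise
        y += -(y - x) / eps * dt + noise_scale * rng.standard_normal(n_samples)
    return y


def closure(x, observable, n_samples=20000, **lift_kwargs):
    """Estimate the conditional ensemble average E[observable(y) | x]."""
    samples = lift(x, n_samples, **lift_kwargs)
    return observable(samples).mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = 0.5
    estimate = closure(x, lambda y: y**2, rng=rng)
    exact = x**2 + 1.0  # analytic value x^2 + 1/beta for beta = 1
    print(f"estimated E[y^2 | x] = {estimate:.3f}, analytic = {exact:.3f}")
```

In this toy setting the estimate should land close to the analytic value `x**2 + 1/beta`, up to sampling noise and a small time-discretization bias. In the framework the abstract describes, a conditional generative model would play the role of `lift`, with the conditioning variable being either a known reaction coordinate or a learned one (for example, a diffusion-map coordinate).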


Source journal

Multiscale Modeling & Simulation (Mathematics: Mathematics, Interdisciplinary Applications)

CiteScore: 2.80
Self-citation rate: 6.20%
Articles published:
Review time: 6-12 weeks

Journal description: Centered around multiscale phenomena, Multiscale Modeling and Simulation (MMS) is an interdisciplinary journal focusing on the fundamental modeling and computational principles underlying various multiscale methods. By its nature, multiscale modeling is highly interdisciplinary, with developments occurring independently across fields. A broad range of scientific and engineering problems involve multiple scales. Traditional monoscale approaches have proven to be inadequate, even with the largest supercomputers, because of the range of scales and the prohibitively large number of variables involved. Thus, there is a growing need to develop systematic modeling and simulation approaches for multiscale problems. MMS will provide a single broad, authoritative source for results in this area.