MITgcm-AD v2: Open source tangent linear and adjoint modeling framework for the oceans and atmosphere enabled by the Automatic Differentiation tool Tapenade
Shreyas Sunil Gaikwad , Sri Hari Krishna Narayanan , Laurent Hascoët , Jean-Michel Campin , Helen Pillar , An Nguyen , Jan Hückelheim , Paul Hovland , Patrick Heimbach
{"title":"MITgcm-AD v2: Open source tangent linear and adjoint modeling framework for the oceans and atmosphere enabled by the Automatic Differentiation tool Tapenade","authors":"Shreyas Sunil Gaikwad , Sri Hari Krishna Narayanan , Laurent Hascoët , Jean-Michel Campin , Helen Pillar , An Nguyen , Jan Hückelheim , Paul Hovland , Patrick Heimbach","doi":"10.1016/j.future.2024.107512","DOIUrl":null,"url":null,"abstract":"<div><div>The Massachusetts Institute of Technology General Circulation Model (MITgcm) is widely used by the climate science community to simulate planetary atmosphere and ocean circulations. A defining feature of the MITgcm is that it has been developed to be compatible with an algorithmic differentiation (AD) tool, TAF, enabling the generation of tangent-linear and adjoint models. These provide gradient information which enables dynamics-based sensitivity and attribution studies, state and parameter estimation, and rigorous uncertainty quantification. Importantly, gradient information is essential for computing comprehensive sensitivities and performing efficient large-scale data assimilation, ensuring that observations collected from satellites and in-situ measuring instruments can be effectively used to optimize a large uncertain control space. As a result, the MITgcm forms the dynamical core of a key data assimilation product employed by the physical oceanography research community: Estimating the Circulation and Climate of the Ocean (ECCO) state estimate. Although MITgcm and ECCO are used extensively within the research community, the AD tool TAF is proprietary and hence inaccessible to a large proportion of these users.</div><div>The new version 2 (MITgcm-AD v2) framework introduced here is based on the source-to-source AD tool Tapenade, which has recently been open-sourced. Another feature of Tapenade is that it stores required variables by default (instead of recomputing them) which simplifies the implementation of efficient, AD-compatible code. The framework has been integrated with the MITgcm model’s main branch and is now freely available.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":null,"pages":null},"PeriodicalIF":6.2000,"publicationDate":"2024-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Future Generation Computer Systems-The International Journal of Escience","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0167739X2400476X","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}
引用次数: 0
Abstract
The Massachusetts Institute of Technology General Circulation Model (MITgcm) is widely used by the climate science community to simulate planetary atmosphere and ocean circulations. A defining feature of the MITgcm is that it has been developed to be compatible with an algorithmic differentiation (AD) tool, TAF, enabling the generation of tangent-linear and adjoint models. These provide gradient information which enables dynamics-based sensitivity and attribution studies, state and parameter estimation, and rigorous uncertainty quantification. Importantly, gradient information is essential for computing comprehensive sensitivities and performing efficient large-scale data assimilation, ensuring that observations collected from satellites and in-situ measuring instruments can be effectively used to optimize a large uncertain control space. As a result, the MITgcm forms the dynamical core of a key data assimilation product employed by the physical oceanography research community: Estimating the Circulation and Climate of the Ocean (ECCO) state estimate. Although MITgcm and ECCO are used extensively within the research community, the AD tool TAF is proprietary and hence inaccessible to a large proportion of these users.
The new version 2 (MITgcm-AD v2) framework introduced here is based on the source-to-source AD tool Tapenade, which has recently been open-sourced. Another feature of Tapenade is that it stores required variables by default (instead of recomputing them) which simplifies the implementation of efficient, AD-compatible code. The framework has been integrated with the MITgcm model’s main branch and is now freely available.
麻省理工学院大气环流模式(MITgcm)被气候科学界广泛用于模拟行星大气和海洋环流。麻省理工学院大气环流模型的一个显著特点是,它与算法微分(AD)工具 TAF 兼容,能够生成切线和邻接模型。这些模型提供梯度信息,可用于基于动力学的敏感性和归因研究、状态和参数估计以及严格的不确定性量化。重要的是,梯度信息对于计算综合敏感性和执行高效的大规模数据同化至关重要,可确保卫星和现场测量仪器收集的观测数据能有效用于优化大型不确定控制空间。因此,MITgcm 构成了物理海洋学研究界使用的关键数据同化产品的动态核心:Estimating the Circulation and Climate of the Ocean (ECCO) state estimate。尽管 MITgcm 和 ECCO 在研究界被广泛使用,但 AD 工具 TAF 是专有的,因此这些用户中的很大一部分无法使用。本文介绍的新版本 2(MITgcm-AD v2)框架基于源对源 AD 工具 Tapenade,该工具最近已开源。Tapenade 的另一个特点是默认存储所需的变量(而不是重新计算变量),从而简化了高效的 AD 兼容代码的实现。该框架已与 MITgcm 模型的主分支集成,现在可以免费使用。
期刊介绍:
Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications.
Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration.
Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.