Reproducible modelling: Why is it so hard?
D. Holzworth, N. Huth
MODSIM2023, 25th International Congress on Modelling and Simulation. Published 2023-08-01.
DOI: 10.36334/modsim.2023.holzworth (https://doi.org/10.36334/modsim.2023.holzworth)
Citation count: 0
Abstract
Modelling at scale involves creating workflows that connect data to tools, utilities, and models. Often this is a manual process (e.g. scripts with no automation) that evolves over time. Unless clear, detailed, and accessible documentation exists, it can be very difficult to reproduce simulation results at some point in the future. Journal paper descriptions of simulation results are often not reproducible! The software development industry created Docker images to define, very clearly, a reproducible execution environment. The Docker user creates a simple text-based recipe (a Dockerfile) that installs the software application (the model) and its dependencies into an image that can be executed repeatedly. If the image is pushed to a Docker repository (e.g. Docker Hub), it becomes accessible to others. This solves part of the reproducibility problem by encapsulating the execution environment in a shareable image. It doesn’t solve the problem of identifying the model input data.
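As a rough illustration of the recipe-to-image workflow described in the abstract, the sketch below renders a minimal Dockerfile as text and lists the Docker commands that would build, share, and run the resulting image. It is not the authors' setup: the helper names, the package `my-model==1.0.0`, the module `my_model`, and the image name `example-user/my-model:1.0` are all hypothetical placeholders, and the model is assumed to be installable with pip. The commands are only printed, not executed.

```python
import json
from typing import List


def render_dockerfile(base_image: str, pip_packages: List[str], entrypoint: List[str]) -> str:
    """Render a minimal Dockerfile as text (hypothetical helper, for illustration only)."""
    # A pinned base image fixes the operating system and language runtime.
    lines = [f"FROM {base_image}"]
    if pip_packages:
        # Pinned versions keep the dependency set the same on every rebuild.
        lines.append("RUN pip install --no-cache-dir " + " ".join(pip_packages))
    # Exec (JSON array) form of ENTRYPOINT: the model runs directly, without a shell.
    lines.append(f"ENTRYPOINT {json.dumps(entrypoint)}")
    return "\n".join(lines) + "\n"


def docker_commands(image: str) -> List[List[str]]:
    """Commands a user would run to build, share, and execute the image."""
    return [
        ["docker", "build", "-t", image, "."],  # build from the Dockerfile in the current directory
        ["docker", "push", image],              # publish to a registry such as Docker Hub
        ["docker", "run", "--rm", image],       # execute the model in a fresh container
    ]


if __name__ == "__main__":
    recipe = render_dockerfile(
        base_image="python:3.11-slim",
        pip_packages=["my-model==1.0.0"],          # hypothetical model package
        entrypoint=["python", "-m", "my_model"],   # hypothetical model module
    )
    print(recipe)
    for cmd in docker_commands("example-user/my-model:1.0"):  # hypothetical image name
        print(" ".join(cmd))
```

Note that nothing in this recipe records which input data a given run used. That would have to be tracked separately, which is the gap the abstract points out.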