Sima – an Open-source Simulation Framework for Realistic Large-scale Individual-level Data Generation
S. Tikka, Jussi Hakanen, Mirka Saarela, J. Karvanen
International Journal of Microsimulation, published 2021-12-31
Citation count: 0
Abstract
We propose a framework for realistic data generation and the simulation of complex systems and demonstrate its capabilities in a health domain example. The main use cases of the framework are predicting the development of variables of interest, evaluating the impact of interventions and policy decisions, and supporting statistical method development. We present the fundamentals of the framework by using rigorous mathematical definitions. The framework supports calibration to a real population as well as various manipulations and data collection processes. The freely available open-source implementation in R embraces efficient data structures, parallel computing, and fast random number generation, hence ensuring reproducibility and scalability. With the framework, it is possible to run daily-level simulations for populations of millions of individuals for decades of simulated time. An example using the occurrence of stroke, type 2 diabetes, and mortality illustrates the usage of the framework in the Finnish context. In the example, we demonstrate the data collection functionality by studying the impact of non-participation on the estimated risk models and interventions related to controlling excessive salt consumption. DOI: https://doi.org/10.34196/ijm.00240
About the journal
The IJM covers research in all aspects of microsimulation modelling. It publishes high-quality contributions that use microsimulation models to address specific research questions in all scientific areas, as well as contributions on methodological and technical issues. The IJM is concerned with: the description, validation, benchmarking and replication of microsimulation models; results coming from microsimulation models, in particular policy evaluation and counterfactual analysis; technical or methodological aspects of microsimulation modelling; and reviews of models and results, as well as of technical or methodological issues.