M. Jessell, Jiateng Guo, Yunqiang Li, M. Lindsay, R. Scalzo, J. Giraud, G. Pirot, E. Cripps, V. Ogarko
Earth System Science Data Discussions. Published 2021-09-28. DOI: 10.5194/essd-2021-304 (https://doi.org/10.5194/essd-2021-304)
Into the Noddyverse: A massive data store of 3D geological models for Machine Learning & inversion applications
Abstract. Unlike some other well-known challenges such as facial recognition, where Machine Learning and inversion algorithms are widely developed, the geosciences suffer from a lack of large, labelled datasets that can be used to validate or train robust Machine Learning and inversion schemes. Publicly available 3D geological models are far too restricted in both number and range of geological scenarios to serve these purposes. For the inversion of geophysical data the problem is further exacerbated because, in most cases, real geophysical observations result from unknown 3D geology, and synthetic test datasets are often neither particularly geological nor geologically diverse. To overcome these limitations, we have used the Noddy modelling platform to generate one million models, which represent the first publicly accessible massive training set for 3D geology and the resulting gravity and magnetic datasets. This model suite can be used to train Machine Learning systems and to provide comprehensive test suites for geophysical inversion. We describe the methodology for producing the model suite and discuss the opportunities such a model suite affords, as well as its limitations, and how we can grow and access this resource.