Into the Noddyverse: A massive data store of 3D geological models for Machine Learning & inversion applications

M. Jessell, Jiateng Guo, Yunqiang Li, M. Lindsay, R. Scalzo, J. Giraud, G. Pirot, E. Cripps, V. Ogarko
Earth System Science Data Discussions, published 2021-09-28
DOI: 10.5194/essd-2021-304 (https://doi.org/10.5194/essd-2021-304)
Citations: 8

Abstract

Unlike well-known challenges such as facial recognition, for which Machine Learning and inversion algorithms are widely developed, the geosciences lack large, labelled datasets that can be used to train or validate robust Machine Learning and inversion schemes. Publicly available 3D geological models are far too restricted, in both number and range of geological scenarios, to serve these purposes. For the inversion of geophysical data the problem is exacerbated further: in most cases real geophysical observations result from unknown 3D geology, and synthetic test datasets are often neither particularly geological nor geologically diverse. To overcome these limitations, we have used the Noddy modelling platform to generate one million models, which represent the first publicly accessible massive training set for 3D geology and the resulting gravity and magnetic datasets. This model suite can be used to train Machine Learning systems and to provide comprehensive test suites for geophysical inversion. We describe the methodology for producing the model suite and discuss the opportunities such a model suite affords, its limitations, and how we can grow and access this resource.