Application of Machine-Learning Algorithms to the Stratigraphic Correlation of Archean Shale Units Based on Lithogeochemistry

IF 1.3 | Zone 4, Earth Sciences | Q2 GEOLOGY | Journal of Geology | Pub Date: 2021-11-01 | DOI: 10.1086/717847

Steven E. Zhang, G. Nwaila, J. Bourdeau, H. Frimmel, Y. Ghorbani, Riham Elhabyan

Journal of Geology, volume 129, pages 647-672. Publication type: Journal Article. Open access: no. Metadata source: Semantic Scholar.

Citations: 3



Application of Machine-Learning Algorithms to the Stratigraphic Correlation of Archean Shale Units Based on Lithogeochemistry

Data-driven methods are increasingly applied to geoscientific problems. Combining data-driven methods with hypothesis testing can help resolve some long-standing debates and reduce interpretation uncertainty by leveraging larger volumes of data and more objective data analytics, which leads to increased reproducibility. In this study, lithogeochemical data from regionally persistent Archean shale units were aggregated from the literature, with special reference to the Kaapvaal Craton of South Africa (namely, shales from the Barberton, Witwatersrand, Pongola, and Transvaal Supergroups) and the Belingwe and Buhwa Greenstone Belts of the Zimbabwe Craton. We examine the feasibility of using machine-learning algorithms to produce a geochemical classification and demonstrate that machine learning can accurately correlate stratigraphy at the formation, group, and supergroup levels. We also show that a data-driven approach can yield useful scientific findings, such as geological implications for the uniqueness of the sediment compositions of the Central Rand and West Rand Groups. We further demonstrate that, when lithogeochemistry and machine-learning algorithms are used, only about 50 samples per geological unit are necessary to reach accuracy levels of around 80%–90% for our shale samples. Consequently, for many traditional tasks, such as rock identification and mapping, some expensive analyses and manual labor can be replaced by an abundance of cheaper data and machine learning. This approach could transform large-scale geological surveys by vastly increasing the coverage rate and total coverage, enabling more detailed mapping than is currently possible. In addition, the aggregation of historical data facilitates data reuse and open science. These results justify the need to bridge data- and hypothesis-driven techniques for the stratigraphic correlation and prediction of rock units, which can improve the accuracy of the inferred stratigraphic correlation and basin setting.
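
The abstract does not say which algorithms, features, or preprocessing the authors used, so the following is only an illustrative sketch of the general idea: classify samples into stratigraphic units from their geochemistry and watch how accuracy changes with the number of training samples per unit. Everything in it is an assumption made for illustration: the data are synthetic, the unit names and mean compositions in `UNIT_MEANS` are hypothetical, the centered log-ratio transform is a common choice for compositional data rather than a step confirmed by the source, and a nearest-centroid rule stands in for whatever classifiers the study actually applied.

```python
import math
import random


def clr(composition):
    """Centered log-ratio transform of a strictly positive composition."""
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def make_samples(rng, unit_means, n_per_unit, noise=0.25):
    """Draw synthetic samples with log-normal scatter around each unit mean."""
    samples = []
    for unit, means in unit_means.items():
        for _ in range(n_per_unit):
            raw = [m * math.exp(rng.gauss(0.0, noise)) for m in means]
            total = sum(raw)
            # Close the composition to sum to 1, then move to clr space.
            samples.append((clr([r / total for r in raw]), unit))
    return samples


def fit_centroids(train):
    """Average the clr feature vectors of each unit."""
    sums, counts = {}, {}
    for features, unit in train:
        acc = sums.setdefault(unit, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[unit] = counts.get(unit, 0) + 1
    return {u: [s / counts[u] for s in acc] for u, acc in sums.items()}


def predict(centroids, features):
    """Assign a sample to the unit with the closest centroid."""
    return min(centroids, key=lambda u: math.dist(centroids[u], features))


def accuracy(centroids, test):
    """Fraction of test samples assigned to their true unit."""
    hits = sum(predict(centroids, f) == unit for f, unit in test)
    return hits / len(test)


def learning_curve(unit_means, sizes, n_test=200, seed=0):
    """Accuracy on a fixed test set for each training size per unit."""
    rng = random.Random(seed)
    test = make_samples(rng, unit_means, n_test)
    curve = {}
    for n in sizes:
        train = make_samples(rng, unit_means, n)
        curve[n] = accuracy(fit_centroids(train), test)
    return curve


# Hypothetical mean compositions (arbitrary proportions of four components);
# these are NOT values from the study.
UNIT_MEANS = {
    "unit_A": [60.0, 18.0, 8.0, 3.0],
    "unit_B": [55.0, 22.0, 6.0, 5.0],
    "unit_C": [65.0, 14.0, 10.0, 2.0],
}

if __name__ == "__main__":
    for n, acc in learning_curve(UNIT_MEANS, [5, 10, 25, 50, 100]).items():
        print(f"{n:>4} samples per unit: accuracy {acc:.2f}")
```

Running the script prints one accuracy value per training size. The numbers depend entirely on the synthetic noise level and the hypothetical unit means, so they say nothing about the accuracies reported in the paper; the sketch only shows how such a sample-size experiment could be set up.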


Source journal

Journal of Geology (Earth Sciences: Geology)

CiteScore: 3.50

Self-citation rate: 5.60%

Review time: 3 months

About the journal: One of the oldest journals in geology, The Journal of Geology has since 1893 promoted the systematic philosophical and fundamental study of geology. The Journal publishes original research across a broad range of subfields in geology, including geophysics, geochemistry, sedimentology, geomorphology, petrology, plate tectonics, volcanology, structural geology, mineralogy, and planetary sciences. Many of its articles have wide appeal for geologists, present research of topical relevance, and offer new geological insights through the application of innovative approaches and methods.