Steven E. Zhang, G. Nwaila, J. Bourdeau, H. Frimmel, Y. Ghorbani, Riham Elhabyan
Journal of Geology, vol. 129, pp. 647-672. Published 2021-11-01. DOI: https://doi.org/10.1086/717847
Application of Machine-Learning Algorithms to the Stratigraphic Correlation of Archean Shale Units Based on Lithogeochemistry
Data-driven methods have increasingly been applied to solve geoscientific problems. Incorporation of data-driven methods with hypothesis testing can be effective to address some long-standing debates and reduce interpretation uncertainty by leveraging larger volumes of data and more objective data analytics, which leads to increased reproducibility. In this study, lithogeochemical data from regionally persistent Archean shale units were aggregated from literature, with special reference to the Kaapvaal Craton of South Africa—namely, shales from the Barberton, Witwatersrand, Pongola, and Transvaal Supergroups—and the Belingwe and Buhwa Greenstone Belts of the Zimbabwe Craton. We examine the feasibility of using machine-learning algorithms to produce a geochemical classification and demonstrate that machine learning is capable of accurately correlating stratigraphy at the formation, group, and supergroup levels. We demonstrate the ability to extract highly useful scientific findings through a data-driven approach, such as geological implications for the uniqueness of the sediment compositions of the Central Rand and West Rand Groups. We further demonstrate that when lithogeochemistry and machine-learning algorithms are used, only about 50 samples per geological unit are necessary to reach accuracy levels of around 80%–90% for our shale samples. Consequently, for many traditional tasks, such as rock identification and mapping, some expensive analyses and manual labor can be replaced by an abundance of cheaper data and machine learning. This approach could transform large-scale geological surveys by enabling more detailed mapping than currently possible, by vastly increasing the coverage rate and total coverage. In addition, the aggregation of historical data facilitates data reuse and open science. 
These results justify the need to bridge data- and hypothesis-driven techniques for the stratigraphic correlation and prediction of rock units, which can improve the accuracy of the inferred stratigraphic correlation and basin setting.
About the journal:
One of the oldest journals in geology, The Journal of Geology has since 1893 promoted the systematic philosophical and fundamental study of geology.
The Journal publishes original research across a broad range of subfields in geology, including geophysics, geochemistry, sedimentology, geomorphology, petrology, plate tectonics, volcanology, structural geology, mineralogy, and planetary sciences. Many of its articles have wide appeal for geologists, present research of topical relevance, and offer new geological insights through the application of innovative approaches and methods.