
Latest publications in Applied Computing and Geosciences

Do more with less: Exploring semi-supervised learning for geological image classification
IF 2.6 Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date : 2025-02-01 DOI: 10.1016/j.acags.2024.100216
Hisham I. Mamode, Gary J. Hampson, Cédric M. John
Labelled datasets within geoscience are often small, with data acquisition both costly and challenging, and their interpretation and downstream use in machine learning is difficult due to data scarcity. Deep learning algorithms require large datasets to learn a robust relationship between the data and their labels and to avoid overfitting. To overcome the paucity of data, transfer learning has been employed in classification tasks. But an alternative exists: there is often a large corpus of unlabeled data which may enhance the learning process. To evaluate this potential for subsurface data, we compare a high-performance semi-supervised learning (SSL) algorithm (SimCLRv2) with supervised transfer learning on a Convolutional Neural Network (CNN) in geological image classification.
We tested the two approaches on a classification task of sediment disturbance from cores of International Ocean Discovery Program (IODP) Expeditions 383 and 385. Our results show that semi-supervised transfer learning can be an effective strategy to adopt, with SimCLRv2 capable of producing representations comparable to those of supervised transfer learning. However, attempts to enhance the performance of semi-supervised transfer learning with task-specific unlabeled images during self-supervision degraded representations. Significantly, we demonstrate that SimCLRv2 trained on a dataset of core disturbance images can outperform supervised transfer learning of a CNN once a critical number of task-specific unlabeled images is available for self-supervision. The gain in performance compared to supervised transfer learning is 1% and 3% for binary and multi-class classification, respectively.
Supervised transfer learning can be deployed with comparative ease, whereas current SSL algorithms such as SimCLRv2 require more effort. We recommend that SSL be explored in cases where large amounts of unlabeled task-specific images exist and improvements of a few percent in metrics matter. When examining small, highly specialized datasets without large amounts of unlabeled images, supervised transfer learning might be the best strategy to adopt. Overall, SSL is a promising approach, and future work should explore it with different dataset types, quantities, and qualities.
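As context for SimCLRv2, the contrastive objective at the core of its self-supervised pretraining stage can be sketched in a few lines of PyTorch. This is a generic NT-Xent implementation for illustration only, not the authors' code:

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.1):
    # z1, z2: (N, D) projection-head outputs for two augmented views
    # of the same N unlabeled images (e.g., core photographs).
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)              # (2N, D)
    sim = z @ z.t() / temperature               # scaled cosine similarities
    sim.fill_diagonal_(float("-inf"))           # exclude self-similarity
    n = z1.size(0)
    # the positive for view i is the other view of the same image
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)
```

After pretraining with this objective on unlabeled images, the encoder is fine-tuned on the small labelled set — the semi-supervised transfer setup compared against supervised transfer learning above.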
{"title":"Do more with less: Exploring semi-supervised learning for geological image classification","authors":"Hisham I. Mamode,&nbsp;Gary J. Hampson,&nbsp;Cédric M. John","doi":"10.1016/j.acags.2024.100216","DOIUrl":"10.1016/j.acags.2024.100216","url":null,"abstract":"<div><div>Labelled datasets within geoscience can often be small, with data acquisition both costly and challenging, and their interpretation and downstream use in machine learning difficult due to data scarcity. Deep learning algorithms require large datasets to learn a robust relationship between the data and its label and avoid overfitting. To overcome the paucity of data, transfer learning has been employed in classification tasks. But an alternative exists: there often is a large corpus of unlabeled data which may enhance the learning process. To evaluate this potential for subsurface data, we compare a high-performance semi-supervised learning (SSL) algorithm (SimCLRv2) with supervised transfer learning on a Convolutional Neural Network (CNN) in geological image classification.</div><div>We tested the two approaches on a classification task of sediment disturbance from cores of International Ocean Drilling Program (IODP) Expeditions 383 and 385. Our results show that semi-supervised transfer learning can be an effective strategy to adopt, with SimCLRv2 capable of producing representations comparable to those of supervised transfer learning. However attempts to enhance the performance of semi-supervised transfer learning with task-specific unlabeled images during self-supervision degraded representations. Significantly, we demonstrate that SimCLRv2 trained on a dataset of core disturbance images can out-perform supervised transfer learning of a CNN once a critical number of task-specific unlabeled images are available for self-supervision. The gain in performance compared to supervised transfer learning is 1% and 3% for binary and multi-class classification, respectively.</div><div>Supervised transfer learning can be deployed with comparative ease, whereas the current SSL algorithms such as SimCLRv2 require more effort. We recommend that SSL be explored in cases when large amounts of unlabeled task-specific images exist and improvement of a few percent in metrics matter. When examining small, highly specialized datasets, without large amounts of unlabeled images, supervised transfer learning might be the best strategy to adopt. Overall, SSL is a promising approach and future work should explore this approach utilizing different dataset types, quantity, and quality.</div></div>","PeriodicalId":33804,"journal":{"name":"Applied Computing and Geosciences","volume":"25 ","pages":"Article 100216"},"PeriodicalIF":2.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143166137","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
lasertram: A Python library for time resolved analysis of laser ablation inductively coupled plasma mass spectrometry data
IF 2.6 Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date : 2025-02-01 DOI: 10.1016/j.acags.2025.100225
Jordan Lubbers, Adam J.R. Kent, Chris Russo
Laser ablation inductively coupled plasma mass spectrometry (LA-ICP-MS) data has a wide variety of uses in the geosciences for in-situ chemical analysis of complex natural materials. Improvements to instrument capabilities and operating software have drastically reduced the time required to generate large volumes of data relative to previous methodologies. Raw data from LA-ICP-MS, however, is in counts per unit time (typically counts per second), not elemental concentrations, and converting these count rates to concentrations requires additional processing. For complex materials where the ablated volume may contain a range of material compositions, a moderate amount of user input is also required if appropriate concentrations are to be accurately calculated. In geologic materials such as glasses and minerals that potentially contain numerous heterogeneities (e.g., microlites or other inclusions), this typically means determining whether the total ablation signal should be filtered to remove these heterogeneities. This necessitates an LA-ICP-MS data processing pipeline that is not fully automated, yet is designed to enable rapid and efficient processing of large volumes of data.
Here we introduce lasertram, a Python library for the time resolved analysis of LA-ICP-MS data. We outline its mathematical theory and code structure, and provide an example of how it can be used to provide the time resolved analysis necessitated by LA-ICP-MS data of complex geologic materials. Throughout the lasertram pipeline we show how metadata and data are incrementally added to the objects created, such that virtually any aspect of an experiment may be interrogated and its quality assessed. We also show that, when combined with other Python libraries for building graphical user interfaces, it can be utilized outside of a pure scripting environment. lasertram can be found at https://doi.org/10.5066/P1DZUR3Z.
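The conversion the abstract refers to — from background-corrected count rates to concentrations — follows the internal-standard normalization common to LA-ICP-MS data reduction. A minimal generic sketch, with illustrative names and numbers rather than lasertram's actual API:

```python
def concentration(cps_analyte, cps_istd,
                  cal_cps_analyte, cal_cps_istd,
                  cal_conc_analyte, cal_conc_istd,
                  sample_conc_istd):
    # Sensitivity is calibrated on a reference material: the measured
    # count-rate ratio divided by the known concentration ratio.
    sensitivity = (cal_cps_analyte / cal_cps_istd) / (cal_conc_analyte / cal_conc_istd)
    # The sample's count-rate ratio is converted to a concentration using
    # the independently measured internal-standard concentration.
    return cps_analyte / cps_istd / sensitivity * sample_conc_istd

# Illustrative numbers only (cps pairs for sample and calibration,
# calibration concentrations, sample internal-standard concentration):
ppm = concentration(5.2e4, 1.3e6, 4.8e4, 1.1e6, 78.0, 8.5e4, 7.1e4)
```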
{"title":"lasertram: A Python library for time resolved analysis of laser ablation inductively coupled plasma mass spectrometry data","authors":"Jordan Lubbers ,&nbsp;Adam J.R. Kent ,&nbsp;Chris Russo","doi":"10.1016/j.acags.2025.100225","DOIUrl":"10.1016/j.acags.2025.100225","url":null,"abstract":"<div><div>Laser ablation inductively coupled plasma mass spectrometry (LA-ICP-MS) data has a wide variety of uses in the geosciences for in-situ chemical analysis of complex natural materials. Improvements to instrument capabilities and operating software have drastically reduced the time required to generate large volumes of data relative to previous methodologies. Raw data from LA-ICP-MS, however, is in counts per unit time (typically counts per second), not elemental concentrations and converting these count ratesto concentrations requires additional processing. For complex materials where the ablated volume may contain a range of material compositions, a moderate amount of user input is also required if appropriate concentrations are to be accurately calculated. In geologic materials such as glasses and minerals that potentially have numerous heterogeneities (e.g., microlites or other inclusions) within them, this is typically determiningwhether the total ablation signal should be filtered to remove these heterogeneities. This necessitates that the LA-ICP-MS data processing pipeline is one that is not automated, but is also designed to enable rapid and efficient processing of large volumes of data.</div><div>Here we introduce <figure><img></figure> , a Python library for the time resolved analysis of LA-ICP-MS data. We outline its mathematical theory, code structure, and provide an example of how it can be used to provide the time resolved analysis necessitated by LA-ICP-MS data of complex geologic materials. Throughout the <figure><img></figure> pipeline we show how metadata and data are incrementally added to the objects created such that virtually any aspect of an experiment may be interrogated and its quality assessed. We also show, that when combined with other Python libraries for building graphical user interfaces, it can be utilized outside of a pure scripting environment. <figure><img></figure> can be found at <span><span>https://doi.org/10.5066/P1DZUR3Z</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":33804,"journal":{"name":"Applied Computing and Geosciences","volume":"25 ","pages":"Article 100225"},"PeriodicalIF":2.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143549376","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A new inversion algorithm (PyMDS) based on the Pyro library to use chlorine 36 data as a paleoseismological tool on normal fault scarps
IF 2.6 Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date : 2025-02-01 DOI: 10.1016/j.acags.2025.100234
Maureen Llinares, Ghislain Gassier, Sophie Viseur, Lucilla Benedetti
Paleoseismology (the study of earthquakes that occurred before records were kept and before instruments could record them) provides useful information, such as recurrence periods and slip rates, to assess seismic hazard and better understand fault mechanisms. Chlorine 36 is one of the paleoseismological tools that can be used to date scarp exhumation associated with earthquake events.
We propose an algorithm, PyMDS, that uses chlorine 36 data sampled on a fault scarp to retrieve seismic sequences (the age and slip associated with each earthquake) and the long-term slip rate on a normal fault.
We show that the algorithm, based on Hamiltonian kernels, can successfully retrieve earthquakes and the long-term slip rate on a synthetic dataset. The precision on the ages varies from a few thousand years for old earthquakes (>5000 yr BP) down to a few hundred years for the most recent ones (<2000 yr BP). The resolution on the slip is ∼30–50 cm, and on the slip rate ∼1 mm/yr. Diagnostic tools (Rhat and divergences on chains) are used to check the convergence of the results.
Our new code is applied to a site in Central Italy; the results are in agreement with those obtained previously with another inversion procedure. We found four events, at 7800±400 yr, 4700±400 yr, 3000±200 yr, and 400±20 yr BP, at the MA3 site. The associated slips were 130±10 cm, 140±20 cm, 580±20 cm, and 205±20 cm. The results are comparable with a previous study by Schlagenhauf et al. (2010). The yielded slip rate of 2.7±0.4 mm/yr is also consistent with the one determined by Tesson et al. (2020).
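For readers unfamiliar with Pyro, the machinery PyMDS builds on — a probabilistic model sampled with a Hamiltonian (NUTS) kernel and checked with Rhat — looks roughly like the sketch below. The forward model here is a toy placeholder, not the actual chlorine 36 production calculation:

```python
import torch
import pyro
import pyro.distributions as dist
from pyro.infer import MCMC, NUTS

def forward_36cl(age, slip):
    # Placeholder: PyMDS actually models the 36Cl concentration profile
    # along the scarp produced by a given exhumation history.
    return slip * torch.log1p(age / 1000.0)

def model(observed):
    age = pyro.sample("age", dist.Uniform(0.0, 10000.0))   # yr BP
    slip = pyro.sample("slip", dist.Uniform(0.0, 600.0))   # cm
    pyro.sample("obs", dist.Normal(forward_36cl(age, slip), 10.0),
                obs=observed)

observed = forward_36cl(torch.tensor(5000.0), torch.tensor(300.0))
mcmc = MCMC(NUTS(model), num_samples=500, warmup_steps=200)  # Hamiltonian kernel
mcmc.run(observed)
mcmc.summary()  # reports r_hat per parameter, the convergence check used above
```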
{"title":"A new inversion algorithm (PyMDS) based on the Pyro library to use chlorine 36 data as a paleoseismological tool on normal fault scarps","authors":"Maureen Llinares,&nbsp;Ghislain Gassier,&nbsp;Sophie Viseur,&nbsp;Lucilla Benedetti","doi":"10.1016/j.acags.2025.100234","DOIUrl":"10.1016/j.acags.2025.100234","url":null,"abstract":"<div><div>Paleoseismology (study of earthquakes that occurred before records were kept and before instruments can record them) provides useful information such as recurrence periods and slip rate to assess seismic hazard and better understand fault mechanisms. Chlorine 36 is one of the paleoseismological tools that can be used to date scarp exhumation associated with earthquakes events.</div><div>We propose an algorithm, PyMDS, that uses chlorine 36 data sampled on a fault scarp to retrieve seismic sequences (age and slip associated to each earthquake) and long term slip rate on a normal fault.</div><div>We show that the algorithm, based on Hamiltonian kernels, can successfully retrieve earthquakes and long term slip rate on a synthetic dataset. The precision on the ages can vary between few thousand years for old earthquakes (&gt;5000 yr BP) and down to few hundreds of years for the most recent ones (&lt;2000 yr BP). The resolution on the slip is ∼30–50 cm and on the slip rate is ∼ 1 mm/yr. Diagnostic tools (R<sub>hat</sub> and divergences on chains) are used to check the convergence of the results.</div><div>Our new code is applied to a site in Central Italy, the results yielded are in agreement with the ones obtained previously with another inversion procedure. We found 4 events 7800±400 yr, 4700±400 yr, 3000±200 and 400 ±20 yr BP on the MA3 site. The associated slips were of 130±10 cm, 140±20 cm, 580 ± 20 cm and 205±20 cm. The results are comparable with a previous study made by (Schlagenhauf et al., 2010). The yielded slip rate of 2.7 mm/yr ± 0.4 mm/yr is also coherent with the one determined by Tesson et al. (2020).</div></div>","PeriodicalId":33804,"journal":{"name":"Applied Computing and Geosciences","volume":"25 ","pages":"Article 100234"},"PeriodicalIF":2.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143631906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Semantic segmentation framework for atoll satellite imagery: An in-depth exploration using UNet variants and Segmentation Gym
IF 2.6 Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date : 2025-02-01 DOI: 10.1016/j.acags.2024.100217
Ray Wang, Tahiya Chowdhury, Alejandra C. Ortiz
This paper presents a framework for semantic segmentation of satellite imagery aimed at studying atoll morphometrics. Recent advances in deep neural networks for automated segmentation have been valuable across a variety of satellite and aerial imagery applications, such as land cover classification, mineral characterization, and disaster impact assessment. However, identifying an appropriate segmentation approach for geoscience research remains challenging, often relying on trial-and-error experimentation for data preparation, model selection, and validation. Building on prior efforts to create reproducible research pipelines for aerial image segmentation, we propose a systematic framework for custom segmentation model development using Segmentation Gym, a software tool designed for efficient model experimentation. Additionally, we evaluate state-of-the-art U-Net model variants to identify the most accurate and precise model for specific segmentation tasks. Using a dataset of 288 Landsat images of atolls as a case study, we conduct a detailed analysis of various annotation techniques, image types, and training methods, offering a structured framework for practitioners to design and explore segmentation models. Furthermore, we address dataset imbalance, a common challenge in geographical data, and discuss strategies to mitigate its impact on segmentation outcomes. Based on our findings, we provide recommendations for applying this framework to other geoscience research areas to address similar challenges.
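One standard mitigation for such imbalance is to weight the segmentation loss inversely to class frequency, so that rare classes (e.g., narrow reef features) are not drowned out by dominant ones (e.g., open water). A minimal generic sketch, not Segmentation Gym's implementation:

```python
import numpy as np
import torch
import torch.nn as nn

def inverse_frequency_weights(label_masks, num_classes):
    # Count pixels per class across all integer-labelled masks.
    counts = np.bincount(np.concatenate([m.ravel() for m in label_masks]),
                         minlength=num_classes).astype(float)
    # 'Balanced' weighting: total / (num_classes * count), guarding empties.
    weights = counts.sum() / (num_classes * np.maximum(counts, 1.0))
    return torch.tensor(weights, dtype=torch.float32)

# e.g. criterion = nn.CrossEntropyLoss(weight=inverse_frequency_weights(masks, 4))
```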
{"title":"Semantic segmentation framework for atoll satellite imagery: An in-depth exploration using UNet variants and Segmentation Gym","authors":"Ray Wang ,&nbsp;Tahiya Chowdhury ,&nbsp;Alejandra C. Ortiz","doi":"10.1016/j.acags.2024.100217","DOIUrl":"10.1016/j.acags.2024.100217","url":null,"abstract":"<div><div>This paper presents a framework for semantic segmentation of satellite imagery aimed at studying atoll morphometrics. Recent advances in deep neural networks for automated segmentation have been valuable across a variety of satellite and aerial imagery applications, such as land cover classification, mineral characterization, and disaster impact assessment. However, identifying an appropriate segmentation approach for geoscience research remains challenging, often relying on trial-and-error experimentation for data preparation, model selection, and validation. Building on prior efforts to create reproducible research pipelines for aerial image segmentation, we propose a systematic framework for custom segmentation model development using Segmentation Gym, a software tool designed for efficient model experimentation. Additionally, we evaluate state-of-the-art U-Net model variants to identify the most accurate and precise model for specific segmentation tasks. Using a dataset of 288 Landsat images of atolls as a case study, we conduct a detailed analysis of various annotation techniques, image types, and training methods, offering a structured framework for practitioners to design and explore segmentation models. Furthermore, we address dataset imbalance, a common challenge in geographical data, and discuss strategies to mitigate its impact on segmentation outcomes. Based on our findings, we provide recommendations for applying this framework to other geoscience research areas to address similar challenges.</div></div>","PeriodicalId":33804,"journal":{"name":"Applied Computing and Geosciences","volume":"25 ","pages":"Article 100217"},"PeriodicalIF":2.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143166135","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
SeisAug: A data augmentation python toolkit
IF 2.6 Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date : 2025-02-01 DOI: 10.1016/j.acags.2025.100232
D. Pragnath, G. Srijayanthi, Santosh Kumar, Sumer Chopra
A common limitation in applying deep learning and machine learning techniques is the limited labelled dataset, which can be addressed through data augmentation (DA). SeisAug is a DA Python toolkit that addresses this challenge in seismological studies. DA helps to balance the imbalanced classes of a dataset by creating more examples of under-represented classes. It significantly mitigates overfitting by increasing the volume of training data and introducing variability, thereby improving the model's performance on unseen data. Given the rapid advancements in deep learning for seismology, ‘SeisAug’ assists extensibility by generating a substantial amount of data (2–6 times more), which can aid in developing an indigenous robust model. Further, this study demonstrates the role of DA in developing a robust model. For this we utilized a basic two-class classification model distinguishing earthquake signals from noise (non-earthquake). The model is trained with the original, 1× and 5× augmented datasets, and their performance metrics are evaluated. The model trained with the 5× augmented dataset significantly outperforms, with accuracy of 0.991, AUC of 0.999 and AUC-PR of 0.999, compared to the model trained with the original dataset (accuracy 0.50, AUC 0.75 and AUC-PR 0.80). Furthermore, by making all codes available on GitHub, the toolkit facilitates the easy application of DA techniques, empowering end-users to enhance their seismological waveform datasets effectively and overcome the initial drawbacks posed by the scarcity of labelled data.
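Typical waveform augmentations of the kind such a toolkit provides can be sketched in a few lines of numpy; the function names here are illustrative, not SeisAug's actual API:

```python
import numpy as np

rng = np.random.default_rng(42)

def add_noise(wave, snr_db=20.0):
    # Add Gaussian noise at a target signal-to-noise ratio in dB.
    noise_power = np.mean(wave ** 2) / 10 ** (snr_db / 10)
    return wave + rng.normal(0.0, np.sqrt(noise_power), wave.shape)

def time_shift(wave, max_shift=100):
    # Circularly shift the trace by a random number of samples.
    return np.roll(wave, rng.integers(-max_shift, max_shift))

def amplitude_scale(wave, low=0.5, high=1.5):
    # Randomly rescale amplitudes to vary apparent source strength.
    return wave * rng.uniform(low, high)
```

Applying a few such transforms to each labelled trace is how a 2–6× expansion of the training set, as described above, is typically obtained.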
{"title":"SeisAug: A data augmentation python toolkit","authors":"D. Pragnath ,&nbsp;G. Srijayanthi ,&nbsp;Santosh Kumar ,&nbsp;Sumer Chopra","doi":"10.1016/j.acags.2025.100232","DOIUrl":"10.1016/j.acags.2025.100232","url":null,"abstract":"<div><div>A common limitation in applying any deep learning and machine learning techniques is the limited labelled dataset which can be addressed through Data augmentation (DA). SeisAug is a DA python toolkit to address this challenge in seismological studies. DA. DA helps to balance the imbalanced classes of a dataset by creating more examples of under-represented classes. It significantly mitigates overfitting by increasing the volume of training data and introducing variability, thereby improving the model's performance on unseen data. Given the rapid advancements in deep learning for seismology, ‘SeisAug’ assists in extensibility by generating a substantial amount of data (2–6 times more data) which can aid in developing an indigenous robust model. Further, this study demonstrates the role of DA in developing a robust model. For this we utilized a basic two class identification models between earthquake/signal and noise/(non-earthquake). The model is trained with original, 1 and 5 times augmented datasets and their performance metrics are evaluated. The model trained with 5X times augmented dataset significantly outperforms with accuracy of 0.991, AUC 0.999 and AUC-PR 0.999 compared to the model trained with original dataset with accuracy of 0.50, AUC 0.75 and AUC-PR 0.80. Furthermore, by making all codes available on GitHub, the toolkit facilitates the easy application of DA techniques, empowering end-users to enhance their seismological waveform datasets effectively and overcome the initial drawbacks posed by the scarcity of labelled data.</div></div>","PeriodicalId":33804,"journal":{"name":"Applied Computing and Geosciences","volume":"25 ","pages":"Article 100232"},"PeriodicalIF":2.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143591747","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Predictive regressive models of recent marsh sediment thickness improve the quantification of coastal marsh sediment budgets
IF 2.6 Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date : 2025-02-01 DOI: 10.1016/j.acags.2024.100215
Christopher G. Smith, Julie Bernier, Alisha M. Ellis, Kathryn E.L. Smith
Coastal marsh wetlands experience variations in vertical gains and losses through time, which have allowed them to infill relict topography and record variations in drivers. The stratigraphic unit associated with the development of the marsh also reflects the long-term importance of key ecosystem services supplied by the marsh environment, including carbon storage and storm mitigation. Mapping these coastal wetland sediments and the marsh unit thickness is challenging as traditional coastal geophysical tools are not easily deployable (acoustic methods) or are unreliable in saline-soil environments (e.g., ground-penetrating radar), leaving core-based methods the most viable mapping method. In the present study, we utilized prior information on the geologic architecture of the region to select spatial and physical metrics that likely persisted throughout evolution of the marsh during the late Holocene. We then assessed the individual and collective power of these metrics to predict marsh thickness observed from cores. Employing regressive predictive models powered by these data, we improve the quantification of marsh thickness for a coastal fringing marsh within the Grand Bay estuary in Mississippi and Alabama (USA). The information gained from this approach yields improved estimates of the carbon stocks in this environment. Additionally, the stored sediment masses reflect the past, and potential future, persistence of the Grand Bay marsh under historical and present marsh-estuarine sediment exchange fluxes. Such improvements to both the sediment budget of recent marsh stratigraphic units and the spatial extent provide new resources for comparison with large-scale landscape models, the latter of which may be used, when validated, to predict future change and ecosystem transformations.
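The modelling approach — regressing core-observed thickness on spatial and physical metrics, then predicting across the marsh — can be sketched as follows. Feature names and the file layout are hypothetical, not the study's actual predictors:

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

# Hypothetical predictors assumed to persist through late-Holocene marsh
# evolution: distances to shoreline and tidal channel, antecedent
# (relict) surface elevation.
cores = pd.read_csv("core_observations.csv")
X = cores[["dist_shoreline_m", "dist_channel_m", "antecedent_elev_m"]]
y = cores["marsh_thickness_m"]

model = RandomForestRegressor(n_estimators=300, random_state=0)
print(cross_val_score(model, X, y, cv=5, scoring="r2"))  # predictive skill
model.fit(X, y)
# thickness_map = model.predict(grid_metrics)  # thickness over the marsh grid
```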
{"title":"Predictive regressive models of recent marsh sediment thickness improve the quantification of coastal marsh sediment budgets","authors":"Christopher G. Smith ,&nbsp;Julie Bernier ,&nbsp;Alisha M. Ellis ,&nbsp;Kathryn E.L. Smith","doi":"10.1016/j.acags.2024.100215","DOIUrl":"10.1016/j.acags.2024.100215","url":null,"abstract":"<div><div>Coastal marsh wetlands experience variations in vertical gains and losses through time, which have allowed them to infill relict topography and record variations in drivers. The stratigraphic unit associated with the development of the marsh also reflects the long-term importance of key ecosystem services supplied by the marsh environment, including carbon storage and storm mitigation. Mapping these coastal wetland sediments and the marsh unit thickness is challenging as traditional coastal geophysical tools are not easily deployable (acoustic methods) or are unreliable in saline-soil environments (e.g., ground-penetrating radar), leaving core-based methods the most viable mapping method. In the present study, we utilized prior information on the geologic architecture of the region to select spatial and physical metrics that likely persisted throughout evolution of the marsh during the late Holocene. We then assessed the individual and collective power of these metrics to predict marsh thickness observed from cores. Employing regressive predictive models powered by these data, we improve the quantification of marsh thickness for a coastal fringing marsh within the Grand Bay estuary in Mississippi and Alabama (USA). The information gained from this approach yields improved estimates of the carbon stocks in this environment. Additionally, the stored sediment masses reflect the past, and potential future, persistence of the Grand Bay marsh under historical and present marsh-estuarine sediment exchange fluxes. Such improvements to both the sediment budget of recent marsh stratigraphic units and the spatial extent provide new resources for comparison with large-scale landscape models, the latter of which may be used, when validated, to predict future change and ecosystem transformations.</div></div>","PeriodicalId":33804,"journal":{"name":"Applied Computing and Geosciences","volume":"25 ","pages":"Article 100215"},"PeriodicalIF":2.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143166136","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Relationships between fault friction, slip time, and physical parameters explored by experiment-based friction model: A machine learning approach using recurrent neural networks (RNNs)
IF 2.6 Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date : 2025-02-01 DOI: 10.1016/j.acags.2025.100231
Tae-Hoon Uhmb, Yohei Hamada, Takehiro Hirose
Understanding the relationship between fault friction and physical parameters is crucial for comprehending earthquake physics. Despite the various friction models developed to explain this relationship, representing it in a friction model with greater detail remains a challenge due to intricate correlations, including the nonlinear interplay between physical parameters and friction. Here we develop new models to define the relationship between various physical parameters (slip velocity, axial displacement, temperature, rate of temperature change, and rate of axial displacement), friction coefficient, and slip time. The models are established by utilizing Recurrent Neural Networks (RNNs) to analyze continuous data from high-velocity rotary shear (HVR) experiments, as reported by previous work. The experiments were conducted on diorite specimens at a slip velocity of 0.004 m/s under various normal stresses (0.3–5.8 MPa). Under these conditions, frictional heating inevitably occurs at the sliding surface, reaching temperatures up to 68 °C. We first identified the optimal model by assessing its accuracy in relation to the time interval used to define friction. Following this, we explored the relationship between friction and physical parameters over varying slip times and conditions by analyzing the gradient importance of physical parameters within the identified model. Our results demonstrate that the importance of physical parameters shifts continuously over slip time and conditions, and that temperature stands out as the most influential parameter affecting fault friction under the slip conditions of this study, which are accompanied by frictional heating. Our study demonstrates the potential of deep learning analysis in enhancing our understanding of complex frictional processes, contributing to the development of more refined friction models and improving predictive models for earthquake physics.
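A minimal sketch of the two ingredients described — an RNN mapping the five physical-parameter series to friction, and a gradient-based importance measure — assuming an LSTM variant and synthetic data rather than the authors' exact architecture:

```python
import torch
import torch.nn as nn

class FrictionRNN(nn.Module):
    # Maps sequences of 5 physical parameters (slip velocity, axial
    # displacement, temperature, and their rates) to a friction coefficient.
    def __init__(self, n_features=5, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                # x: (batch, time, 5)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])     # friction at the final step

model = FrictionRNN()
x = torch.randn(8, 200, 5, requires_grad=True)   # synthetic sequences
model(x).sum().backward()
# Gradient importance: mean |d friction / d input| per physical parameter,
# the kind of saliency used above to rank parameter influence.
print(x.grad.abs().mean(dim=(0, 1)))
```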
{"title":"Relationships between fault friction, slip time, and physical parameters explored by experiment-based friction model: A machine learning approach using recurrent neural networks (RNNs)","authors":"Tae-Hoon Uhmb ,&nbsp;Yohei Hamada ,&nbsp;Takehiro Hirose","doi":"10.1016/j.acags.2025.100231","DOIUrl":"10.1016/j.acags.2025.100231","url":null,"abstract":"<div><div>Understanding the relationship between fault friction and physical parameters is crucial for comprehending earthquake physics. Despite various friction models developed to explain this relationship, representing the relationships in a friction model with greater detail remains a challenge due to intricate correlations, including the nonlinear interplay between physical parameters and friction. Here we develop new models to define the relationship between various physical parameters (slip velocity, axial displacement, temperature, rate of temperature, and rate of axial displacement), friction coefficient, and slip time. The models are established by utilizing Recurrent Neural Networks (RNNs) to analyze continuous data in high-velocity rotary shear experiments (HVR), as reported by previous work. The experiment has been conducted on diorite specimens at a slip velocity (0.004 m/s) in various normal stress (0.3–5.8 MPa). At this conditions, frictional heating occurs inevitably at the sliding surface, reaching temperature up to 68 °C. We first identified the optimal model by assessing its accuracy in relation to the time interval for defining friction. Following this, we explored the relationship between friction and physical parameters with varying slip time and conditions by analyzing the gradient importance of physical parameters within the identified model. Our results demonstrate that the importance of physical parameters continuously shifts over slip time and conditions, and temperature stands out as the most influential parameter affecting fault friction under slip conditions of this study accompanied by frictional heating. Our study demonstrates the potential of deep learning analysis in enhancing our understanding of complex frictional processes, contributing to the development of more refined friction models and improving predictive models for earthquake physics.</div></div>","PeriodicalId":33804,"journal":{"name":"Applied Computing and Geosciences","volume":"25 ","pages":"Article 100231"},"PeriodicalIF":2.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143562771","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
X-ray Micro-CT based characterization of rock cuttings with deep learning
IF 2.6 Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date : 2025-02-01 DOI: 10.1016/j.acags.2025.100220
Nils Olsen, Yifeng Chen, Pascal Turberg, Alexandre Moreau, Alexandre Alahi
Rock cuttings from destructive boreholes are a common and cheaper source of drilling materials that can be used to determine underground geology, compared to rock core samples. Manually classifying series of cuttings can be a long and tedious process and is also prone to subjectivity, leading to errors. In this paper, a framework for the classification of multiple types of rock structures is introduced, based on rock cutting images from X-ray micro-CT technology. The classification is performed using a simple yet effective deep learning model (a ResNet-18 architecture) to categorize five different lithologies: micritic limestone, bioclastic limestone, oolithic limestone, molassic sandstone and gneiss. The proposed network is trained on two datasets (laboratory and borehole), both containing the five lithologies and together comprising over 10,000 images. The laboratory dataset consists of well-controlled experiments with homogeneous samples, while the borehole dataset contains heterogeneous samples corresponding to a real-case application. Among all the considered models, including ResNet-34, SPP-CNN, and manual classification by human experts, ResNet-18 demonstrates superior performance across multiple evaluation metrics, including precision, recall, and F1-score. To the best of our knowledge, this is the first time a test comparing deep neural network and human performance has been performed for this task. To optimize the performance of the proposed model, the transfer learning method is implemented. Furthermore, the experiments demonstrate that when employing transfer learning, the size of the dataset significantly impacts the performance of the model. In the studied design, the experimental results confirm that the proposed approach is a cost-effective and efficient method for automated rock cutting classification using the micro-CT technique, and it can be easily adapted to classify rock cuttings of various types and from various sources.
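The transfer-learning setup described maps onto a few lines of torchvision: load an ImageNet-pretrained ResNet-18 and replace the classifier head for the five lithologies. A minimal sketch with the training loop omitted:

```python
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 5  # micritic, bioclastic, oolithic limestone; sandstone; gneiss

model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
for p in model.parameters():      # optionally freeze the pretrained backbone
    p.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)  # new trainable head
```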
{"title":"X-ray Micro-CT based characterization of rock cuttings with deep learning","authors":"Nils Olsen ,&nbsp;Yifeng Chen ,&nbsp;Pascal Turberg ,&nbsp;Alexandre Moreau ,&nbsp;Alexandre Alahi","doi":"10.1016/j.acags.2025.100220","DOIUrl":"10.1016/j.acags.2025.100220","url":null,"abstract":"<div><div>Rock cuttings from destructive boreholes are a common and cheaper source of drilling materials that can be used to determine underground geology compared to rock core samples. Classifying manually the series of cuttings can be a long and tedious process and can also be prone to subjectivity leading to errors. In this paper, a framework for the classification of multiple types of rock structures is introduced based on rock cutting images from X-ray micro-CT technology. The classification is performed using a simple yet effective deep learning model (a ResNet-18 architecture) to categorize five different lithologies: micritic limestone, bioclastic limestone, oolithic limestone, molassic sandstone and gneiss. The proposed network is trained on 2 datasets (laboratory and borehole) both containing the five lithologies and comprise over 10 000 images. The laboratory dataset consists of a well-controlled experiments with homogeneous samples and the borehole dataset with heterogeneous samples corresponding to a real case application. Among all the considered models, including ResNet-34, and SPP-CNN and human experts manual classification, ResNet-18 demonstrates superior performance across multiple evaluation metrics, including precision, recall, and F1-score. It is to our best knowledge, the first time a test comparing deep neural network and human performance is performed for this task. To optimize the performance of the proposed model, the transfer learning method is implemented. Furthermore, the experiments demonstrate that when employing transfer learning, the size of the dataset significantly impacts the performance of the model. In the studied design, the experimental results confirm that the proposed approach is a cost-effective and efficient method for automated rock cutting classification using the micro-CT technique, and it can be easily modified to adapt the rock cutting classification from various types and sources.</div></div>","PeriodicalId":33804,"journal":{"name":"Applied Computing and Geosciences","volume":"25 ","pages":"Article 100220"},"PeriodicalIF":2.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143165478","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Geological object recognition in legacy maps through data augmentation and transfer learning techniques
IF 2.6 Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date : 2025-02-01 DOI: 10.1016/j.acags.2025.100233
Wenjia Li, Weilin Chen, Jiyin Zhang, Chenhao Li, Xiaogang Ma
Maps are crucial tools in geosciences, providing detailed representations of the spatial distribution and relationships among geological features. Accurate recognition and classification of geological objects within these maps are essential for applications in resource exploration, environmental management, and geological hazard assessment. Over the years, many legacy geological maps have accumulated, and many of them are not in data formats ready for machines to read and analyze. The inherent diversity and complexity of geological features, combined with the labor-intensive process of manual annotation, pose significant challenges to the usage of those maps. This study addresses these challenges by proposing an innovative approach that leverages legend data for data augmentation and employs transfer learning techniques to improve the quality of object recognition. Legend data from geological maps offer standardized symbols and annotations. Using them to augment existing datasets increases the diversity and volume of training data, thereby enhancing the model's ability to generalize across various geological contexts. A deep learning model called EfficientNet is then fine-tuned using the augmented dataset to recognize and classify geological features more accurately. The model's performance is evaluated based on accuracy, recall, and F1-score, with results showing significant improvements, particularly for datasets with texture-rich information. The proposed method demonstrates that the combination of data augmentation and transfer learning significantly enhances the accuracy and efficiency of geological object recognition. This approach not only reduces the manual effort needed for geological object recognition but also contributes to the advancement of geological mapping and analysis.
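A sketch of the two steps described — augmenting legend symbol patches and fine-tuning EfficientNet — using torchvision; the transform choices and the class count are illustrative assumptions:

```python
import torch.nn as nn
from torchvision import models, transforms

# Perturb legend patches so they resemble symbols as they appear in map
# bodies: rotated, recolored by print/scan variation, partially cropped.
augment = transforms.Compose([
    transforms.RandomRotation(15),
    transforms.ColorJitter(brightness=0.3, contrast=0.3),
    transforms.RandomResizedCrop(224, scale=(0.7, 1.0)),
    transforms.ToTensor(),
])

model = models.efficientnet_b0(weights=models.EfficientNet_B0_Weights.IMAGENET1K_V1)
# 12 is a hypothetical symbol-class count, not the paper's.
model.classifier[1] = nn.Linear(model.classifier[1].in_features, 12)
```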
{"title":"Geological object recognition in legacy maps through data augmentation and transfer learning techniques","authors":"Wenjia Li,&nbsp;Weilin Chen,&nbsp;Jiyin Zhang,&nbsp;Chenhao Li,&nbsp;Xiaogang Ma","doi":"10.1016/j.acags.2025.100233","DOIUrl":"10.1016/j.acags.2025.100233","url":null,"abstract":"<div><div>Maps are crucial tools in geosciences, providing detailed representations of the spatial distribution and relationships among geological features. Accurate recognition and classification of geological objects within these maps are essential for applications in resource exploration, environmental management, and geological hazard assessment. Along the years, many legacy geological maps have been accumulated, and many of them are not in data formats ready for machines to read and analyze. The inherent diversity and complexity of geological features, combined with the labor-intensive process of manual annotation, pose significant challenges in the usage of those maps. This study addresses these challenges by proposing an innovative approach that leverages legend data for data augmentation and employs transfer learning techniques to improve the quality of object recognition. Legend data from geological maps offer standardized symbols and annotations. Using them to augment existing datasets increases the diversity and volume of training data, thereby enhances the model's ability to generalize across various geological contexts. A deep learning model called EfficientNet is then fine-tuned using the augmented dataset to recognize and classify geological features more accurately. The model's performance is evaluated based on accuracy, recall, and F1-score, with results showing significant improvements, particularly for datasets with texture-rich information. The proposed method demonstrates that the combination of data augmentation and transfer learning significantly enhances the accuracy and efficiency of geological object recognition. This approach not only reduces the manual effort needed for geological object recognition but also contributes to the advancement of geological mapping and analysis.</div></div>","PeriodicalId":33804,"journal":{"name":"Applied Computing and Geosciences","volume":"25 ","pages":"Article 100233"},"PeriodicalIF":2.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143578333","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Streamlining geoscience data analysis with an LLM-driven workflow
IF 2.6 Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date : 2025-02-01 DOI: 10.1016/j.acags.2024.100218
Jiyin Zhang, Cory Clairmont, Xiang Que, Wenjia Li, Weilin Chen, Chenhao Li, Xiaogang Ma
Large Language Models (LLMs) have made significant advancements in natural language processing and human-like response generation. However, training and fine-tuning an LLM to fit the strict requirements of academic research in a field such as geoscience still requires significant computational resources and human expert alignment to ensure the quality and reliability of the generated content. These challenges highlight the need for a more flexible and reliable LLM workflow to meet domain-specific analysis needs. This study proposes an LLM-driven workflow that addresses the challenges of utilizing LLMs in geoscience data analysis. The work was built upon the open data API (application programming interface) of Mindat, one of the largest databases in mineralogy. We designed and developed an open-source LLM-driven workflow that processes natural language requests and automatically utilizes the Mindat API, mineral co-occurrence network analysis, and locality distribution heat map visualization to conduct geoscience data analysis tasks. Using prompt engineering techniques, we developed a supervisor-based agentic framework that enables LLM agents not only to interpret context information but also to autonomously address complex geoscience analysis tasks, bridging the gap between automated workflows and human expertise. This agentic design emphasizes autonomy, allowing the workflow to adapt seamlessly to future advancements in LLM capabilities without requiring additional fine-tuning or domain-specific embedding. By providing comprehensive task context within the workflow and access to the professional tool, we ensure the quality of LLM-generated content without the need to embed geoscience knowledge into LLMs through fine-tuning or human alignment. Our approach integrates LLMs into geoscience data analysis, addressing the need for specialized tools while reducing the learning curve through LLM-driven interactions between users and APIs. This streamlined workflow enhances the efficiency of exploratory data analysis, as demonstrated by the several use cases presented. In future work we will explore the scalability of this workflow through the integration of additional agents and diverse geoscience data sources.
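The supervisor-based dispatch can be sketched as a thin layer between the LLM's structured tool choice and the Mindat open data API; the endpoint, parameters, and helper names below are illustrative assumptions, not the authors' implementation:

```python
import requests

MINDAT_API = "https://api.mindat.org"   # open data API; access token assumed

def search_localities(mineral, token):
    # One "tool" the supervisor can route to: query Mindat for localities
    # mentioning a mineral. Endpoint and parameters are illustrative.
    r = requests.get(f"{MINDAT_API}/localities/",
                     params={"txt": mineral},
                     headers={"Authorization": f"Token {token}"},
                     timeout=30)
    r.raise_for_status()
    return r.json().get("results", [])

TOOLS = {"search_localities": search_localities}

def supervisor(llm_decision, token):
    # llm_decision: the LLM's parsed choice, e.g.
    # {"tool": "search_localities", "arguments": {"mineral": "quartz"}}
    fn = TOOLS[llm_decision["tool"]]
    results = fn(llm_decision["arguments"]["mineral"], token)
    return results  # handed back for heat-map or co-occurrence analysis
```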
{"title":"Streamlining geoscience data analysis with an LLM-driven workflow","authors":"Jiyin Zhang,&nbsp;Cory Clairmont,&nbsp;Xiang Que,&nbsp;Wenjia Li,&nbsp;Weilin Chen,&nbsp;Chenhao Li,&nbsp;Xiaogang Ma","doi":"10.1016/j.acags.2024.100218","DOIUrl":"10.1016/j.acags.2024.100218","url":null,"abstract":"<div><div>Large Language Models (LLMs) have made significant advancements in natural language processing and human-like response generation. However, training and fine-tuning an LLM to fit the strict requirements in the scope of academic research, such as geoscience, still requires significant computational resources and human expert alignment to ensure the quality and reliability of the generated content. The challenges highlight the need for a more flexible and reliable LLM workflow to meet domain-specific analysis needs. This study proposes an LLM-driven workflow that addresses the challenges of utilizing LLMs in geoscience data analysis. The work was built upon the open data API (application programming interface) of Mindat, one of the largest databases in mineralogy. We designed and developed an open-source LLM-driven workflow that processes natural language requests and automatically utilizes the Mindat API, mineral co-occurrence network analysis, and locality distribution heat map visualization to conduct geoscience data analysis tasks. Using prompt engineering techniques, we developed a supervisor-based agentic framework that enables LLM agents to not only interpret context information but also autonomously addressing complex geoscience analysis tasks, bridging the gap between automated workflows and human expertise. This agentic design emphasizes autonomy, allowing the workflow to adapt seamlessly to future advancements in LLM capabilities without requiring additional fine-tuning or domain-specific embedding. By providing the comprehensive context of the task in the workflow and the professional tool, we ensure the quality of LLM-generated content without the need to embed geoscience knowledge into LLMs through fine-tuning or human alignment. Our approach integrates LLMs into geoscience data analysis, addressing the need for specialized tools while reducing the learning curve through LLM-driven interactions between users and APIs. This streamlined workflow enhances the efficiency of exploratory data analysis, as demonstrated by the several use cases presented. In our future work we will explore the scalability of this workflow through the integration of additional agents and diverse geoscience data sources.</div></div>","PeriodicalId":33804,"journal":{"name":"Applied Computing and Geosciences","volume":"25 ","pages":"Article 100218"},"PeriodicalIF":2.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143166138","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0