Soft causal constraints in groundwater machine learning: a new way to balance accuracy and physical consistency
Authors: Adoubi Vincent De Paul Adombi, Romain Chesnaux
Journal: Environmental Earth Sciences, vol. 84, no. 2 (published 2025-01-15)
DOI: 10.1007/s12665-024-12063-6
URL: https://link.springer.com/article/10.1007/s12665-024-12063-6
Impact factor: 2.8 (JCR Q3, Environmental Sciences)
Citations: 0
Abstract
Physics-informed machine learning (PIML) integrates scientific knowledge into conventional machine learning models to mitigate their black-box nature and prevent them from producing physically inconsistent results. Recently, Adombi et al. (2024) [A causal physics-informed deep learning formulation for groundwater flow modeling and climate change effect analysis] showed that incorporating scientific knowledge into machine learning models is not enough to make them obey certain fundamental principles of physics, such as causality. They derived a set of constraints, called causal relationship constraints (CRC), to force PIML models to obey the principle of causality. However, in some situations, the CRC prioritize causality at the expense of performance. In this study, we propose new CRC conditions and a new PIML architecture to test the hypothesis that these conditions improve the performance of PIML models without violating the principle of causality. The models were tasked with simulating groundwater levels at six piezometers located in Quebec, Canada. A conventional machine learning model (convolutional neural network, 1D-CNN), a PIML model based on Adombi et al. (2024) (H-Lin), and a PIML model based on the architecture proposed in this work (H-LinC) were trained and compared. The results show that 1D-CNN outperforms H-LinC, which in turn outperforms H-Lin in terms of accuracy, with median Nash-Sutcliffe efficiency (NSE) and Kling-Gupta efficiency (KGE) of 0.76 and 0.87 for 1D-CNN, 0.68 and 0.76 for H-LinC, and 0.53 and 0.59 for H-Lin. However, only H-LinC and H-Lin satisfy the principle of causality.
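For readers unfamiliar with the two accuracy scores reported above, the following is a minimal sketch of how NSE and KGE are commonly computed from observed and simulated series. It assumes the original Kling-Gupta formulation (correlation, variability ratio, bias ratio); the abstract does not say which KGE variant the authors used, and the function names and sample values here are hypothetical, chosen only for illustration.

```python
import math
from statistics import fmean


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit; 0 is no better than the mean of obs."""
    mean_obs = fmean(obs)
    sq_err = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sq_dev = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sq_err / sq_dev


def kge(obs, sim):
    """Kling-Gupta efficiency (assumed original form): 1 is a perfect fit."""
    mean_obs, mean_sim = fmean(obs), fmean(sim)
    dev_obs = [o - mean_obs for o in obs]
    dev_sim = [s - mean_sim for s in sim]
    ss_obs = sum(d * d for d in dev_obs)
    ss_sim = sum(d * d for d in dev_sim)
    # Pearson correlation between observed and simulated series
    r = sum(a * b for a, b in zip(dev_obs, dev_sim)) / math.sqrt(ss_obs * ss_sim)
    alpha = math.sqrt(ss_sim / ss_obs)  # variability ratio (std of sim / std of obs)
    beta = mean_sim / mean_obs          # bias ratio (mean of sim / mean of obs)
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Hypothetical groundwater levels in metres, not data from the study.
observed = [10.2, 10.5, 10.9, 10.4, 10.1, 10.6]
simulated = [10.3, 10.4, 10.7, 10.5, 10.2, 10.5]
print(f"NSE = {nse(observed, simulated):.2f}, KGE = {kge(observed, simulated):.2f}")
```

Both scores equal 1 for a perfect simulation, so the reported medians can be read as distance from that ideal. Neither score says anything about whether a model respects causality, which is why the abstract reports that property separately.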
About the journal:
Environmental Earth Sciences is an international multidisciplinary journal concerned with all aspects of interaction between humans, natural resources, ecosystems, special climates or unique geographic zones, and the earth:
Water and soil contamination caused by waste management and disposal practices
Environmental problems associated with transportation by land, air, or water
Geological processes that may impact biosystems or humans
Man-made or naturally occurring geological or hydrological hazards
Environmental problems associated with the recovery of materials from the earth
Environmental problems caused by extraction of minerals, coal, and ores, as well as oil and gas, water and alternative energy sources
Environmental impacts of exploration and recultivation
Environmental impacts of hazardous materials
Management of environmental data and information in data banks and information systems
Dissemination of knowledge on techniques, methods, approaches and experiences to improve and remediate the environment
In pursuit of these topics, the geoscientific disciplines are invited to contribute their knowledge and experience. Major disciplines include: hydrogeology, hydrochemistry, geochemistry, geophysics, engineering geology, remediation science, natural resources management, environmental climatology and biota, environmental geography, soil science and geomicrobiology.