Deep learning in standard least-squares theory of linear models: Perspective, development and vision
Engineering Applications of Artificial Intelligence (JCR Q1, Automation & Control Systems; IF 7.5)
Published: 2024-09-30 · DOI: 10.1016/j.engappai.2024.109376
https://www.sciencedirect.com/science/article/pii/S0952197624015343
Citations: 0
Abstract
Inspired by the attractive features of least-squares theory in many practical applications, this contribution introduces least-squares-based deep learning (LSBDL). Least-squares theory connects explanatory variables to predicted variables, called observations, through a linear(ized) model whose unknown parameters are estimated using the least-squares principle. In contrast, deep learning (DL) methods establish nonlinear relationships for applications where the predicted variables are unknown (nonlinear) functions of the explanatory variables. This contribution presents a DL formulation grounded in least-squares theory for linear models. As a data-driven method, a network is trained to construct an appropriate design matrix, whose entries are estimated using two descent optimization methods: steepest descent and Gauss–Newton. In conjunction with interpretable and explainable artificial intelligence, LSBDL leverages well-established least-squares theory for DL applications through three objectives: (i) quality-control measures, such as the covariance matrix of the predicted outcomes, can be determined directly; (ii) established least-squares reliability theory and hypothesis testing can be used to identify mis-specification and outlying observations; (iii) the observations' covariance matrix can be exploited to train a network with inconsistent, heterogeneous, and statistically correlated data. Three examples demonstrate the theory: the first uses LSBDL to train coordinate basis functions for a surface-fitting problem, the second applies LSBDL to time-series forecasting, and the third showcases a real-world application of LSBDL to downscaling groundwater storage anomaly data. LSBDL offers opportunities in many fields, including geoscience, aviation, time-series analysis, data assimilation, and multi-sensor data fusion.
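The least-squares machinery the abstract builds on can be sketched in a few lines. This is a minimal, hypothetical illustration (not the authors' implementation): given a design matrix A (which LSBDL would learn with a network) and an observations covariance matrix Q_y, standard weighted least-squares yields both the parameter estimate and, by error propagation, the covariance of the predicted observations — the quality-control measure named in objective (i). All variable names and the synthetic data are assumptions for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

m, n = 20, 3                         # number of observations, unknown parameters
A = rng.standard_normal((m, n))      # design matrix (in LSBDL, its entries are learned)
x_true = np.array([1.0, -2.0, 0.5])  # synthetic "true" parameters
Q_y = 0.1 * np.eye(m)                # observations covariance (here: uncorrelated, equal variance)
y = A @ x_true + rng.multivariate_normal(np.zeros(m), Q_y)

# Weighted least-squares estimate: x_hat = (A^T W A)^(-1) A^T W y, with W = Q_y^(-1)
W = np.linalg.inv(Q_y)
N = A.T @ W @ A                      # normal matrix
x_hat = np.linalg.solve(N, A.T @ W @ y)

# Error propagation: covariance of the estimate, then of the predictions y_hat = A x_hat
Q_xhat = np.linalg.inv(N)            # Q_xhat = (A^T W A)^(-1)
Q_yhat = A @ Q_xhat @ A.T            # covariance matrix of the predicted observations

print("estimate:", x_hat)
print("prediction std. devs. (first 3):", np.sqrt(np.diag(Q_yhat))[:3])
```

A weighted formulation is used deliberately: because W = Q_y^(-1) enters the normal equations, heterogeneous or statistically correlated observations (a full, non-diagonal Q_y) are handled by the same formulas, which is the mechanism behind objective (iii).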
Journal introduction:
Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.