Ridge estimation for uncertain regression model with imprecise observations
Shuang Zhang, Xin Gao
Soft Computing (JCR Q2, Computer Science, Artificial Intelligence; impact factor 3.1), published 2024-08-20
DOI: 10.1007/s00500-024-09656-5 (https://doi.org/10.1007/s00500-024-09656-5)
Citations: 0
Abstract
Traditional regression analysis assumes that all observed data are precise, but data collected in practice are often imprecise. Uncertain regression analysis, built on uncertain variables within the framework of uncertainty theory, was developed to address this. With imprecise observations, the data often contain outliers caused by human input errors or faulty measurements. Outliers distort parameter estimates, producing misleading results and a poorly fitted model. Least squares estimation, the most commonly used method for parameter estimation, is extremely sensitive to outliers. To solve this problem, this paper proposes an uncertain regression model based on ridge estimation, which adds a squared penalty term to the least squares estimation of the unknown parameters. The advantage of ridge estimation is that it tolerates pathological (ill-conditioned) data much better than other parameter estimation methods, which can reduce the influence of outliers. The optimal shrinkage parameter is determined by K-fold cross-validation and used to estimate the parameters of the regression model; residual analysis and a hypothesis test are then conducted on the fitted model, and the predicted value and the prediction confidence interval are obtained. Finally, the validity of the model is demonstrated by two numerical examples.
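The two ingredients named in the abstract, a squared penalty added to least squares and a shrinkage parameter chosen by K-fold cross-validation, can be sketched for ordinary crisp data as follows. This is only an illustration under stated assumptions: it does not implement the paper's uncertain-variable formulation for imprecise observations, and the function names (`ridge_fit`, `kfold_cv_error`, `select_lambda`), the synthetic data, the candidate grid, and the choice to leave the intercept unpenalized are all hypothetical and not taken from the paper.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Closed-form ridge estimate: solve (X'X + lam * D) beta = X'y.

    D is the identity with a zero in the intercept position, so the
    intercept (assumed to be the first column of X) is not shrunk.
    """
    penalty = lam * np.eye(X.shape[1])
    penalty[0, 0] = 0.0  # assumption: do not penalize the intercept
    return np.linalg.solve(X.T @ X + penalty, X.T @ y)


def kfold_cv_error(X, y, lam, k=5, seed=0):
    """Mean squared prediction error of ridge_fit averaged over k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    errors = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        beta = ridge_fit(X[train], y[train], lam)
        resid = y[test] - X[test] @ beta
        errors.append(np.mean(resid ** 2))
    return float(np.mean(errors))


def select_lambda(X, y, grid, k=5, seed=0):
    """Return the candidate shrinkage parameter with the lowest CV error."""
    scores = [kfold_cv_error(X, y, lam, k, seed) for lam in grid]
    return grid[int(np.argmin(scores))]


# Synthetic crisp data, for illustration only (not from the paper).
rng = np.random.default_rng(42)
n = 60
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
true_beta = np.array([1.0, 2.0, -1.0, 0.5])
y = X @ true_beta + rng.normal(scale=0.3, size=n)

grid = [0.0, 0.01, 0.1, 1.0, 10.0]  # hypothetical candidate values
best_lam = select_lambda(X, y, grid)
beta_hat = ridge_fit(X, y, best_lam)
print("selected shrinkage parameter:", best_lam)
print("ridge estimate:", beta_hat)
```

Setting `lam` to zero recovers the ordinary least squares estimate, and larger values shrink the slope coefficients toward zero, which is the trade-off the cross-validation step is meant to tune.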
About the journal:
Soft Computing is dedicated to system solutions based on soft computing techniques. It provides rapid dissemination of important results in soft computing technologies, a fusion of research in evolutionary algorithms and genetic programming, neural science and neural net systems, fuzzy set theory and fuzzy systems, and chaos theory and chaotic systems.
Soft Computing encourages the integration of soft computing techniques and tools into both everyday and advanced applications. By linking the ideas and techniques of soft computing with other disciplines, the journal serves as a unifying platform that fosters comparisons, extensions, and new applications. As a result, the journal is an international forum for all scientists and engineers engaged in research and development in this fast-growing field.