Penerapan Regresi Rebust Menggunakan Estimasi-S dengan Pembobotan Tukey Bisquare dan Welsch dalam Mengatasi Outlier
[Application of Robust Regression Using S-Estimation with Tukey Bisquare and Welsch Weighting to Handle Outliers]
Authors: Mutia Salsabila, N. Rifai
Journal: Bandung Conference Series: Statistics
Published: 2023-08-02
DOI: 10.29313/bcss.v3i2.8475 (https://doi.org/10.29313/bcss.v3i2.8475)
Abstract
Multiple linear regression analysis predicts the value of a dependent variable from more than one independent variable. When the classical assumptions of multiple linear regression are violated, the ordinary least squares method (OLS; Metode Kuadrat Terkecil, MKT, in the Indonesian original) is not appropriate. In this study, the homoscedasticity assumption was not met because outliers affected the regression model. Robust regression addresses this without removing the outlying observations. This paper therefore applies robust regression with S-estimation, using Tukey bisquare and Welsch weighting, to human development index data for Central Java Province in 2021. The dependent variable (Y) is the human development index; the independent variables are the net enrollment rate (X1), the number of health facilities (X2), and the open unemployment rate (X3). S-estimation with Tukey bisquare weighting gives the better robust regression model of the two: its adjusted R-squared is higher than that of Welsch weighting (89.83% > 89.05%), and its residual standard error (RSE) is lower (2.783 < 2.860).
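To make the two weighting schemes concrete, the following is a minimal sketch of the Tukey bisquare and Welsch weight functions as they are commonly defined for iteratively reweighted fitting. It is not the authors' code. The function names, the convention `w(u) = exp(-(u/c)^2)` for Welsch, and the default tuning constants (4.685 and 2.985, values often quoted for 95% efficiency in M-estimation) are assumptions for illustration; the paper's S-estimation may use different constants, and the abstract does not state them.

```python
import numpy as np


def tukey_bisquare_weight(u, c=4.685):
    """Tukey bisquare weight for scaled residuals u (residual / robust scale).

    The tuning constant c is an illustrative default, not taken from the paper.
    """
    u = np.asarray(u, dtype=float)
    w = (1.0 - (u / c) ** 2) ** 2
    # Residuals beyond the cutoff c receive zero weight.
    return np.where(np.abs(u) <= c, w, 0.0)


def welsch_weight(u, c=2.985):
    """Welsch weight for scaled residuals u, using w(u) = exp(-(u/c)^2).

    Both this convention and the constant c are assumptions for illustration.
    """
    u = np.asarray(u, dtype=float)
    # The weight decays smoothly toward zero but never reaches it exactly.
    return np.exp(-((u / c) ** 2))


if __name__ == "__main__":
    # Hypothetical scaled residuals, from a perfect fit (0) to a gross outlier (6).
    scaled_residuals = np.array([0.0, 1.0, 3.0, 6.0])
    print("Tukey bisquare:", tukey_bisquare_weight(scaled_residuals))
    print("Welsch:        ", welsch_weight(scaled_residuals))
```

Under these definitions, the practical difference is that Tukey bisquare discards observations whose scaled residual exceeds the cutoff, while Welsch only shrinks their influence toward zero.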