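Application of Imputation Method for Compositional Data with Missing Values based on Adaptive LASSO Model: the Composition of Employment Industry in Taiyuan, China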

Malaysian Journal of Fundamental and Applied Sciences (IF 0.8, JCR Q3, Multidisciplinary Sciences) | Pub Date: 2023-12-04 | DOI: 10.11113/mjfas.v20n1.3034
Ying Tian, Majid Khan Majahar Ali, Fam Pei Shan, Lili Wu, Siti Zulaikha Mohd Jamaludin
Citations: 0

Abstract

The tripartite industry classification, which divides all economic activities into three parts, reflects the dynamic process of economic development and the historical trend of changes in the structure of resource allocation. The proportion of each industry has become an important indicator of the level of national economic development. These proportions are compositional data, a kind of complex multidimensional data used in many fields. All components of compositional data are non-negative and carry only relative information. In practice, compositional data may contain missing values, and general statistical analysis methods cannot be applied directly to such data. The complexity of missing values in compositional data also makes traditional imputation methods unsuitable. Consequently, how to carry out effective statistical inference for compositional data with missing values has recently attracted the attention of many scholars. This paper focuses on the imputation problem for compositional data containing missing values and proposes an Adaptive Least Absolute Shrinkage and Selection Operator (ALASSO) imputation method that obtains a complete dataset through variable selection and parameter estimation. The new method is evaluated through simulation and empirical analysis and compared with mean imputation, k-nearest neighbor imputation, and iterative regression imputation. The results show that the ALASSO imputation method has the highest accuracy across different missing rates, dimensions, and correlation coefficients.
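The abstract does not spell out the transformation or the solver behind the ALASSO imputation, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes an additive log-ratio transform with the last part as an always-observed reference, a plain coordinate-descent solver, and adaptive weights taken from an unpenalized initial fit. The function names (`soft_threshold`, `weighted_lasso`, `adaptive_lasso`, `impute_part`) and the toy compositions are hypothetical and are not taken from the paper or its Taiyuan data.

```python
import math

def soft_threshold(z, t):
    """Soft-thresholding operator used by LASSO coordinate descent."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def weighted_lasso(X, y, lam, weights, n_iter=500):
    """Minimise (1/2n)*||y - X b||^2 + lam * sum_j weights[j]*|b_j| (no intercept)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)
    for _ in range(n_iter):
        for j in range(p):
            col = [X[i][j] for i in range(n)]
            norm = sum(c * c for c in col) / n
            if norm == 0.0:
                continue
            # Correlation of column j with the partial residual (beta_j removed).
            rho = sum(col[i] * (resid[i] + col[i] * beta[j]) for i in range(n)) / n
            new = soft_threshold(rho, lam * weights[j]) / norm
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= col[i] * delta
                beta[j] = new
    return beta

def adaptive_lasso(X, y, lam, gamma=1.0, eps=1e-8):
    """Adaptive LASSO: penalty weights 1/|b_init|^gamma from an unpenalised fit."""
    p = len(X[0])
    init = weighted_lasso(X, y, 0.0, [0.0] * p)  # lam = 0 gives a least-squares fit
    weights = [1.0 / (abs(b) ** gamma + eps) for b in init]
    return weighted_lasso(X, y, lam, weights)

def impute_part(rows, k, lam=0.01):
    """Impute part k (None where missing); k must not be the last (reference) part."""
    D = len(rows[0])
    preds = [j for j in range(D - 1) if j != k]

    def features(r):
        # Additive log-ratios of the observed parts against the reference part.
        return [math.log(r[j] / r[D - 1]) for j in preds]

    complete = [r for r in rows if r[k] is not None]
    X = [features(r) for r in complete]
    y = [math.log(r[k] / r[D - 1]) for r in complete]
    n = len(complete)
    x_mean = [sum(row[j] for row in X) / n for j in range(len(preds))]
    y_mean = sum(y) / n
    # Centring the data stands in for an intercept term.
    Xc = [[row[j] - x_mean[j] for j in range(len(preds))] for row in X]
    yc = [v - y_mean for v in y]
    beta = adaptive_lasso(Xc, yc, lam)

    out = []
    for r in rows:
        r = list(r)
        if r[k] is None:
            z = y_mean + sum(b * (f - m) for b, f, m in zip(beta, features(r), x_mean))
            r[k] = r[D - 1] * math.exp(z)  # back-transform the predicted log-ratio
        total = sum(r)
        out.append([v / total for v in r])  # closure: parts sum to 1 again
    return out

# Hypothetical 4-part compositions; part 0 is missing in the last row.
rows = [
    [0.10, 0.30, 0.20, 0.40],
    [0.15, 0.25, 0.25, 0.35],
    [0.20, 0.30, 0.10, 0.40],
    [0.05, 0.35, 0.30, 0.30],
    [0.25, 0.20, 0.15, 0.40],
    [0.12, 0.28, 0.22, 0.38],
    [None, 0.30, 0.20, 0.35],
]
completed = impute_part(rows, k=0)
print(completed[-1])
```

Working on log-ratios rather than raw proportions keeps the imputed value positive and leaves the ratios among the observed parts unchanged after closure, which matches the point that compositional data carry only relative information. The tuning parameters `lam` and `gamma` are placeholders; in practice they would be chosen by cross-validation or a similar criterion.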
Source journal
CiteScore: 1.40
Self-citation rate: 0.00%
Articles published: 45