Mohammad Hossein Zarinkolah, Hadi Jabbari, Mohammad Mehdi Saber
Mathematical Population Studies. Journal article, published 2023-10-05. DOI: 10.1080/08898480.2023.2251852 (https://doi.org/10.1080/08898480.2023.2251852)
Optimal estimators of the population mean of a skewed distribution using auxiliary variables in median ranked-set sampling
ABSTRACT: In an asymmetric population, individuals are concentrated toward one tail of the distribution. An estimator of the population mean in this asymmetric case is constructed on the basis of median ranked-set sampling, that is, the population is divided into subsets of equal size and the intersections of these sets depend on the chosen order of ranking according to a known auxiliary variable. Ranking individuals according to this auxiliary variable should approximate their ranking with respect to the unknown variable of interest. This procedure is a cost-effective way of selecting the sample when the variable of interest is unknown. For it to work, the auxiliary variable must be at least weakly correlated with the variable of interest. The proposed estimator extends the one constructed with extreme ranked-set sampling, whose principle is to divide the population into subsets whose intersections depend on the extreme values of the auxiliary variable. The mean square error of the estimator is expressed analytically.
A simulation compares the proposed estimator with estimators based on simple random sampling and on extreme ranked-set sampling. It shows that when the response variable is correlated with both auxiliary variables, even if these correlations are weak (around 0.5 in absolute value), the mean square error of the proposed estimator is at least 175% lower than the mean square error of estimators based on either simple random or extreme ranked-set sampling.
A first application focuses on household incomes in the Iranian provinces of Fars and Khuzestan in 2022. It first uses a single auxiliary variable, gross income (the total income that an individual or household earns before tax), and then two auxiliary variables: total gross household income and wages paid year-round to heads of households through the banking network. In this application, the mean square error of the proposed estimator with median ranked-set sampling is at least 60% lower than that obtained with simple random and extreme ranked-set sampling. A second application concerns the physical preparation score of 160 Iranian athletes in 2022, with runners' track records as the auxiliary variable and sample sizes of 6, 8, 10, 25, and 30; here the mean square error of the proposed estimator with median ranked-set sampling is at least 50% lower than that obtained with simple random and extreme ranked-set sampling. In the third application, on the COVID-19 mean mortality rate in 2022 in the USA, Iran, Turkey, and Germany, with sample sizes of 6, 8, 10, 25, and 30, estimates of the mean mortality rate are based on new cases. In each of the four countries, the mean square error of the proposed estimator under median ranked-set sampling is at least 60% lower than that obtained with simple random and extreme ranked-set sampling.
KEYWORDS: Median ranked-set sampling; population mean; ranked-set sampling; ratio estimation; sampling surveys
JEL CLASSIFICATION: 62D05; 62D99
Acknowledgements: We thank two reviewers for their constructive comments.
Disclosure statement: No potential conflict of interest was reported by the author(s).
Notes: 1. Iran's national portal of statistics: https://www.amar.org.ir.
Funding: We received no fund or grant for this article.
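The sampling scheme named in the abstract can be illustrated with a minimal sketch. It assumes the standard textbook form of median ranked-set sampling (for an odd set size, the median-ranked unit of each set is measured; for an even set size, the lower middle unit from the first half of the sets and the upper middle unit from the second half), with ranking done on the auxiliary variable only. The estimator shown is the classical ratio estimator, used here as a stand-in; it is not the optimal estimator derived in the article. The function names, the synthetic population, and the choice to draw each set independently from the full population are all assumptions made for illustration.

```python
import random
import statistics


def median_rss_sample(population, set_size, cycles, rng):
    """Draw a median ranked-set sample from (x, y) pairs, ranking on x only."""
    sample = []
    for _ in range(cycles):
        for i in range(set_size):
            # Draw one set of `set_size` units and rank it by the auxiliary x.
            # Simplification: sets are drawn independently, so they may overlap.
            ranked = sorted(rng.sample(population, set_size), key=lambda u: u[0])
            if set_size % 2 == 1:
                pos = (set_size - 1) // 2   # the median unit (0-based index)
            elif i < set_size // 2:
                pos = set_size // 2 - 1     # lower middle unit
            else:
                pos = set_size // 2         # upper middle unit
            sample.append(ranked[pos])
    return sample


def ratio_estimate(sample, x_population_mean):
    """Classical ratio estimator of the mean of y: y_bar * (mu_x / x_bar)."""
    x_bar = statistics.fmean(x for x, _ in sample)
    y_bar = statistics.fmean(y for _, y in sample)
    return y_bar * x_population_mean / x_bar


rng = random.Random(0)
# Hypothetical right-skewed population: x is lognormal, y is x times positive noise.
population = []
for _ in range(5000):
    x = rng.lognormvariate(0.0, 0.5)
    population.append((x, 2.0 * x * rng.lognormvariate(0.0, 0.3)))

mu_x = statistics.fmean(x for x, _ in population)
sample = median_rss_sample(population, set_size=5, cycles=4, rng=rng)
estimate = ratio_estimate(sample, mu_x)
```

Only the auxiliary value `x` is used to choose which unit of each set is measured, which is what makes the scheme inexpensive when `y` is costly to observe. The population mean of `x` is assumed known, as the ratio estimator requires.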
About the journal:
Mathematical Population Studies publishes carefully selected research papers in the mathematical and statistical study of populations. The journal is strongly interdisciplinary and invites contributions by mathematicians, demographers, (bio)statisticians, sociologists, economists, biologists, epidemiologists, actuaries, geographers, and others who are interested in the mathematical formulation of population-related questions.
The scope covers both theoretical and empirical work. Manuscripts should be submitted through Manuscript Central for review. The editor-in-chief has the final say on suitability for publication.