Pub Date: 2022-12-01 | DOI: 10.2478/stattrans-2022-0041
A. Hassan, Rasha S. Elshaarawy, H. Nagy
Abstract Partial ranked set sampling (PRSS) is a cost-effective sampling method. It is a combination of simple random sampling (SRS) and ranked set sampling (RSS) designs. The PRSS method gives the experimenter flexibility in selecting the sample when it is either difficult to rank the units within each set with full confidence or when experimental units are not available. In this article, we introduce and define the likelihood function of any probability distribution under the PRSS scheme. The performance of the maximum likelihood estimators is examined when the available data are assumed to have an exponentiated exponential (EE) distribution via some selective RSS schemes as well as SRS. The suggested ranked schemes include the PRSS, RSS, neoteric RSS (NRSS), and extreme RSS (ERSS). An intensive simulation study was conducted to compare and explore the behaviour of the proposed estimators. The study demonstrated that the maximum likelihood estimators via PRSS, NRSS, ERSS, and RSS schemes are more efficient than the corresponding estimators under SRS. A real data set is presented for illustrative purposes.
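As a rough illustration of why ranked-set designs pay off, the sketch below simulates the EE distribution (CDF F(x) = (1 - exp(-lam*x))**alpha), draws balanced RSS samples under an assumed perfect ranking, and compares the variance of the sample mean against SRS of the same size. All parameter values are illustrative, and the comparison uses the sample mean rather than the paper's maximum likelihood estimators.

```python
import numpy as np

rng = np.random.default_rng(42)

def rvs_ee(alpha, lam, size):
    # Inverse-CDF draws from the EE distribution: F(x) = (1 - exp(-lam*x))**alpha
    u = rng.uniform(size=size)
    return -np.log(1.0 - u ** (1.0 / alpha)) / lam

def rss_sample(alpha, lam, m, r):
    # Balanced RSS: in each of r cycles, draw m sets of m units, rank each set
    # (perfect ranking assumed) and keep the i-th order statistic of the i-th set.
    out = []
    for _ in range(r):
        for i in range(m):
            out.append(np.sort(rvs_ee(alpha, lam, m))[i])
    return np.array(out)

alpha, lam, m, r, reps = 2.0, 1.0, 4, 5, 2000
n = m * r
srs_means = [rvs_ee(alpha, lam, n).mean() for _ in range(reps)]
rss_means = [rss_sample(alpha, lam, m, r).mean() for _ in range(reps)]
re = np.var(srs_means) / np.var(rss_means)
print(f"relative efficiency of the RSS mean over the SRS mean: {re:.2f}")
```

With set size 4 the RSS mean is typically well over twice as efficient as the SRS mean, which is the qualitative pattern the abstract reports for the likelihood-based estimators.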
Title: "Parameter estimation of exponentiated exponential distribution under selective ranked set sampling" (Statistics in Transition)
Pub Date: 2022-12-01 | DOI: 10.2478/stattrans-2022-0043
A. Oladugba, Ajali John Obasi, O. C. Asogwa
Abstract Randomisation tests (R-tests) are regularly proposed as an alternative method of hypothesis testing when the assumptions of classical statistical methods are violated in data analysis. In this paper, the robustness, in terms of the type-I error and power, of the R-test was evaluated and compared with that of the F-test in the analysis of a single-factor repeated measures design. The study took into account normal and non-normal data (skewed: exponential, lognormal, Chi-squared, and Weibull distributions), the presence or absence of outliers, and situations in which the sphericity assumption was either met or violated, under varied sample sizes and numbers of treatments. The Monte Carlo approach was used in the simulation study. The results showed that when the data were normal, the R-test was approximately as sensitive and robust as the F-test, while being more sensitive than the F-test when data had skewed distributions. The R-test was more sensitive and robust than the F-test in the presence of an outlier. When the sphericity assumption was met, both the R-test and the F-test were approximately equally sensitive, whereas the R-test was more sensitive and robust than the F-test when the sphericity assumption was not met.
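A minimal sketch of an R-test of this kind, assuming the standard within-subject permutation scheme: treatment labels are shuffled independently within each subject and the repeated-measures F statistic is recomputed each time. The data below are simulated (exponential errors with an additive treatment effect), not the study's.

```python
import numpy as np

rng = np.random.default_rng(0)

def rm_f_stat(y):
    # F statistic for a single-factor repeated measures ANOVA;
    # y has shape (n_subjects, k_treatments).
    n, k = y.shape
    grand = y.mean()
    ss_treat = n * ((y.mean(axis=0) - grand) ** 2).sum()
    ss_subj = k * ((y.mean(axis=1) - grand) ** 2).sum()
    ss_err = ((y - grand) ** 2).sum() - ss_treat - ss_subj
    return (ss_treat / (k - 1)) / (ss_err / ((n - 1) * (k - 1)))

def randomisation_test(y, n_perm=2000):
    # Shuffle treatment labels within each subject; the p-value is the share
    # of permuted F statistics at least as large as the observed one.
    f_obs = rm_f_stat(y)
    hits = sum(rm_f_stat(np.array([rng.permutation(row) for row in y])) >= f_obs
               for _ in range(n_perm))
    return f_obs, (hits + 1) / (n_perm + 1)

n, k = 12, 3
y = rng.exponential(1.0, (n, k)) + np.array([0.0, 0.5, 1.0])  # skewed errors, real effect
f_obs, p = randomisation_test(y)
print(f"F = {f_obs:.2f}, randomisation p = {p:.3f}")
```

Because the null distribution is built from the data themselves, the permutation p-value needs no normality or sphericity assumption, which is exactly where the abstract finds the R-test more robust.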
Title: "Robustness of randomisation tests as alternative analysis methods for repeated measures design"
Pub Date: 2022-12-01 | DOI: 10.2478/stattrans-2022-0046
W. Molefe
Abstract This paper develops optimal designs for situations in which it is not feasible for every cluster to be represented in a sample, as it would be in a stratified design, assuming equal-probability two-stage sampling in which the clusters are small areas. The paper develops allocation methods for two-stage sample surveys where small-area estimates are a priority. We seek efficient allocations that minimize a linear combination of the mean squared errors of composite small-area estimators and of an estimator of the overall mean, and we suggest some alternative allocations with a view to minimizing the same objective. Several alternatives, including the area-only stratified design, are found to perform nearly as well as the optimal allocation but with better practical properties. Designs are evaluated numerically using Swiss canton data as well as Botswana administrative district data.
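The flavour of the allocation problem can be sketched with a simplified one-stage analogue: minimizing a weighted sum of area-mean variances under a fixed total sample size yields a square-root allocation by a Lagrange argument. This is an illustrative reduction, not the paper's composite-estimator objective, and all numbers are hypothetical.

```python
import numpy as np

# hypothetical within-area variances and priority weights
s2 = np.array([4.0, 9.0, 1.0, 16.0])
w = np.array([1.0, 1.0, 2.0, 1.0])
n_total = 100.0

def objective(n):
    # weighted sum of area-mean variances, sum_d w_d * s2_d / n_d
    return float((w * s2 / n).sum())

# Lagrange solution of  min sum_d w_d*s2_d/n_d  s.t.  sum_d n_d = n_total:
# n_d proportional to sqrt(w_d * s2_d)
n_opt = n_total * np.sqrt(w * s2) / np.sqrt(w * s2).sum()
n_equal = np.full_like(s2, n_total / s2.size)

print("optimal allocation:", np.round(n_opt, 1))
print(f"objective: optimal {objective(n_opt):.3f} vs equal {objective(n_equal):.3f}")
```

Comparing the optimal objective against the equal allocation mirrors, in miniature, how the paper benchmarks alternative allocations against the optimum.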
Title: "Optimal allocation for equal probability two-stage design"
Pub Date: 2022-12-01 | DOI: 10.2478/stattrans-2022-0049
M. Nazifi, Hamid Fadishei
Abstract Two-predictor suppression situations continue to produce uninterpretable conditions in linear regression. In an attempt to address the theoretical complexities related to suppression situations, the current study introduces two versions of a software tool called the suppression simulator (Supsim): a) a command-line Python package, and b) a web-based JavaScript tool, both of which are able to simulate numerous random two-predictor models (RTMs). RTMs are randomly generated, normally distributed data vectors x1, x2, and y simulated in such a way that regressing y on both x1 and x2 results in the occurrence of numerous suppression and non-suppression situations. The web-based Supsim requires no coding skills and, additionally, provides users with 3D scatterplots of the simulated RTMs. This study shows that comparing 3D scatterplots of different suppression and non-suppression situations provides important new insights into the underlying mechanisms of two-predictor suppression situations. An important focus is on the comparison of 3D scatterplots of certain enhancement situations called Hamilton's extreme example with those of redundancy situations. Such a comparison suggests that the basic mathematical concepts of two-predictor suppression situations need to be reconsidered with regard to the important issue of the statistical control function.
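A rough sketch of what an RTM-style simulation can look like, working directly with random correlation matrices rather than Supsim's simulated data vectors. The enhancement flag uses one common criterion (R-squared exceeding the sum of squared validities); both the construction and the criterion are assumptions for illustration, not Supsim's exact logic.

```python
import numpy as np

rng = np.random.default_rng(7)

def random_two_predictor_model():
    # Random valid correlation matrix for (x1, x2, y) via a random Gram matrix.
    a = rng.normal(size=(3, 3))
    c = a @ a.T
    d = np.sqrt(np.diag(c))
    corr = c / np.outer(d, d)
    r12, ry1, ry2 = corr[0, 1], corr[0, 2], corr[1, 2]
    # standardised regression weights of y on x1 and x2
    denom = 1.0 - r12 ** 2
    b1 = (ry1 - r12 * ry2) / denom
    b2 = (ry2 - r12 * ry1) / denom
    r2 = b1 * ry1 + b2 * ry2
    # flag enhancement: R^2 exceeds the sum of the squared validities
    return r2 > ry1 ** 2 + ry2 ** 2

flags = [random_two_predictor_model() for _ in range(2000)]
print(f"share of RTMs flagged as enhancement: {np.mean(flags):.1%}")
```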
Title: "Supsim: a Python package and a web-based JavaScript tool to address the theoretical complexities in two-predictor suppression situations"
Pub Date: 2022-12-01 | DOI: 10.2478/stattrans-2022-0048
Arvind Pandey, David D. Hanagal, Shikha Tyagi
Abstract Frailty models are a possible choice for addressing the problem of unobserved heterogeneity in individual risks of disease and death. Based on earlier studies, shared frailty models can be utilised in the analysis of bivariate data related to survival times (e.g. matched-pairs experiments, twin or family data). In this article, we assume that frailty acts additively on the hazard rate. A new class of shared frailty models based on the generalised Lindley distribution is established. By assuming generalised Weibull and generalised log-logistic baseline distributions, we propose a new class of shared frailty models based on the additive hazard rate. We estimate the parameters of these frailty models using the Bayesian Markov chain Monte Carlo (MCMC) technique. Model selection criteria are applied for the comparison of models. We analyse kidney infection data and suggest the best model.
Title: "Generalised Lindley shared additive frailty regression model for bivariate survival data"
Pub Date: 2022-09-01 | DOI: 10.2478/stattrans-2022-0033
N. Varathan
Abstract In this paper, an improved ridge type estimator is introduced to overcome the effect of multi-collinearity in logistic regression. The proposed estimator is called a modified almost unbiased ridge logistic estimator. It is obtained by combining the ridge estimator and the almost unbiased ridge estimator. In order to assess the superiority of the proposed estimator over the existing estimators, theoretical comparisons based on the mean square error and the scalar mean square error criterion are presented. A Monte Carlo simulation study is carried out to compare the performance of the proposed estimator with the existing ones. Finally, a real data example is provided to support the findings.
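One common form of the two building-block estimators can be sketched as follows: fit the logistic ML estimator by Newton-Raphson, then form the ridge estimator and an almost-unbiased ridge correction from the information matrix. The shrinkage formulas and the ridge constant k below are assumptions drawn from the general ridge-logistic literature, not necessarily the modified estimator proposed here, and the collinear data are simulated.

```python
import numpy as np

rng = np.random.default_rng(1)

def logistic_mle(X, y, iters=25):
    # Newton-Raphson ML fit for logistic regression (no intercept, for brevity).
    beta = np.zeros(X.shape[1])
    H = np.eye(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        W = p * (1.0 - p)
        H = X.T @ (W[:, None] * X)          # information matrix X' W X
        beta = beta + np.linalg.solve(H, X.T @ (y - p))
    return beta, H

# simulated collinear design
n = 200
z = rng.normal(size=n)
X = np.column_stack([z + 0.3 * rng.normal(size=n),
                     z + 0.3 * rng.normal(size=n)])
y = (rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-(X @ np.array([1.0, 1.0]))))).astype(float)

b_ml, H = logistic_mle(X, y)
k = 0.5                                        # assumed ridge constant
M = np.linalg.inv(H + k * np.eye(2))
b_ridge = M @ H @ b_ml                         # ridge logistic estimator
b_aure = (np.eye(2) - k ** 2 * M @ M) @ b_ml   # almost unbiased ridge form

print("ML:", b_ml)
print("ridge:", b_ridge)
print("AURE:", b_aure)
```

The ridge step shrinks the unstable ML coefficients toward zero, while the almost-unbiased form trades back some of that shrinkage for reduced bias, which is the trade-off the proposed combined estimator aims to balance.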
Title: "An improved ridge type estimator for logistic regression"
Pub Date: 2022-09-01 | DOI: 10.2478/stattrans-2022-0038
M. C. Ugwu, M. Madukaife
Abstract In this research work we introduce a new sampling design, namely two-stage cluster sampling in which probability-proportional-to-size sampling with replacement is used in the first stage and ranked set sampling in the second, in order to address the marked variability in the sizes of the first-stage population units. We obtained an unbiased estimator of the population mean and total, as well as the variance of the mean estimator. We calculated the relative efficiency of the new sampling design against two-stage cluster sampling with simple random sampling in the first stage and ranked set sampling in the second stage. The results demonstrated that the new sampling design is more efficient than the competing design when significant variation is observed among the first-stage units.
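A toy version of the design, under an assumed perfect ranking and a Hansen-Hurwitz estimator for the PPS-with-replacement first stage; the population, cluster sizes and RSS set/cycle sizes are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

# invented population: 10 clusters of markedly unequal sizes
sizes = np.array([20, 35, 50, 80, 120, 150, 200, 260, 300, 400])
clusters = [rng.normal(10.0 + 0.01 * s, 2.0, s) for s in sizes]
p = sizes / sizes.sum()                 # PPS-with-replacement probabilities
true_total = sum(c.sum() for c in clusters)

def rss_mean(values, m=3, r=2):
    # RSS estimate of a cluster mean, assuming perfect ranking.
    picks = [np.sort(rng.choice(values, size=m, replace=False))[i]
             for _ in range(r) for i in range(m)]
    return np.mean(picks)

def estimate_total(n_first=4):
    # Hansen-Hurwitz: average of (estimated cluster total) / (selection prob.)
    idx = rng.choice(sizes.size, size=n_first, replace=True, p=p)
    return np.mean([sizes[i] * rss_mean(clusters[i]) / p[i] for i in idx])

ests = [estimate_total() for _ in range(500)]
print(f"true total {true_total:.0f}, mean of 500 estimates {np.mean(ests):.0f}")
```

Averaging the 500 replicate estimates illustrates the (approximate) unbiasedness claimed for the design; PPS selection keeps the very large clusters from inflating the variance of the total estimator.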
Title: "Two-stage cluster sampling with unequal probability sampling in the first stage and ranked set sampling in the second stage"
Pub Date: 2022-09-01 | DOI: 10.2478/stattrans-2022-0035
W. Permpoonsinsup, R. Sunthornwat
Abstract The coronavirus (COVID-19) pandemic affected every country worldwide. In particular, outbreaks in Belgium, the Czech Republic, Poland and Switzerland entered a second wave and were increasing exponentially between July and November 2020. The aims of the study are: to estimate the compound growth rate, to develop a modified exponential time-series model compared with the hyperbolic time-series model, and to estimate the optimal parameters for the models based on the exponential least-squares, three-selected-points and partial-sums methods, and the hyperbolic least-squares, for the daily COVID-19 cases in Belgium, the Czech Republic, Poland and Switzerland. The speed and spreading power of COVID-19 infections were obtained by using derivative and root-mean-squared methods, respectively. The results show that the exponential least-squares method was the most suitable for the parameter estimation. The compound growth rate of COVID-19 infection was the highest in Switzerland, and the speed and spreading power of COVID-19 infection were the highest in Poland between July and November 2020.
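The exponential least-squares step can be sketched by fitting log-cases linearly in time; the compound growth rate then follows from the fitted slope, and the speed is the derivative of the fitted curve. The data and parameters below are simulated, not the four countries' series.

```python
import numpy as np

rng = np.random.default_rng(5)

# simulated cumulative case counts over 60 days (not the real series)
t = np.arange(60.0)
cases = 120.0 * np.exp(0.05 * t) * np.exp(rng.normal(0.0, 0.02, t.size))

# exponential least squares: fit log(cases) = log(a) + b * t
b, log_a = np.polyfit(t, np.log(cases), 1)
cgr = np.exp(b) - 1.0                       # compound daily growth rate
speed = b * np.exp(log_a + b * t[-1])       # dN/dt of the fitted curve, last day

print(f"compound daily growth rate {cgr:.3%}, final-day speed {speed:.1f} cases/day")
```

Taking least squares on the log scale is what makes the fit "exponential least squares": multiplicative noise becomes additive, so an ordinary linear fit recovers the growth rate.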
Title: "Modified exponential time series model with prediction of total COVID-19 cases in Belgium, Czech Republic, Poland and Switzerland"
Pub Date: 2022-09-01 | DOI: 10.2478/stattrans-2022-0032
Abdelfateh Beghriche, Halim zeghdoudi, Vinoth Raman, Sarra Chouia
Abstract The study describes the general concept of the XLindley distribution. Forms of the density and hazard rate functions are investigated. Moreover, precise formulations for several numerical properties of the distribution are derived. Stochastic ordering, the method of moments, maximum likelihood estimation, entropies and the limiting distribution of extreme order statistics are established. We demonstrate the new family’s adaptability by applying it to a variety of real-world datasets.
Title: "New polynomial exponential distribution: properties and applications"
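For reference, a sketch of the XLindley density and hazard rate in a commonly quoted form, f(x) = theta^2 (2 + theta + x) exp(-theta x) / (1 + theta)^2; the survival function below is derived from that density (it differentiates back to it), and theta = 1.5 is an arbitrary illustrative value. Verify the form against the paper before relying on it.

```python
import numpy as np

def xlindley_pdf(x, theta):
    # f(x) = theta^2 * (2 + theta + x) * exp(-theta*x) / (1 + theta)^2,  x > 0
    return theta ** 2 * (2.0 + theta + x) * np.exp(-theta * x) / (1.0 + theta) ** 2

def xlindley_sf(x, theta):
    # survival function consistent with the pdf above:
    # S(x) = (1 + theta*x / (1 + theta)^2) * exp(-theta*x)
    return (1.0 + theta * x / (1.0 + theta) ** 2) * np.exp(-theta * x)

def xlindley_hazard(x, theta):
    return xlindley_pdf(x, theta) / xlindley_sf(x, theta)

theta = 1.5
x = np.linspace(0.0, 30.0, 30001)
f = xlindley_pdf(x, theta)
mass = float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(x)))  # trapezoid rule
print(f"numerical total mass: {mass:.6f}")
print(f"hazard at x=0: {xlindley_hazard(0.0, theta):.3f}")
```

The numerical integration to total mass 1 is a quick self-check that the density and its normalising constant are mutually consistent.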
Pub Date: 2022-09-01 | DOI: 10.2478/stattrans-2022-0031
A. Szulc
Abstract In the present study income inequality in Poland is evaluated using corrected income data to provide more reliable estimates. According to most empirical studies based on household surveys, and by European standards, recent income inequality in Poland is moderate and decreased significantly after reaching its peak during the first decade of the 21st century. These findings were challenged by Brzeziński et al. (2022), who placed Polish income inequality among the highest in Europe. Such a conclusion was possible when combining the household survey data with information on personal income tax. In the present study the above-mentioned findings are further explored using 2014 and 2015 data and employing additional corrections to the household survey incomes. Incomes of the poorest people are replaced by their predictions made on a large set of well-being correlates, using hierarchical correlation reconstruction. Applying this method together with the corrections based on Brzeziński et al.'s results reduces the 2014 and 2015 revised Gini indices, while still keeping them above the values obtained with the use of the survey data only. It seems that hierarchical correlation reconstruction offers more accurate proxies for the actual low incomes, while matching tax data provides better proxies for the top incomes.
Title: "Polish inequality statistics reconsidered: are the poor really that poor?"
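Since the discussion turns on revised Gini indices, a small sketch of how a Gini index responds to top-income corrections may help. The formula is the standard rank-based one; the lognormal "survey" and the crude doubling of the top 1% are purely illustrative stand-ins for the tax-data adjustment.

```python
import numpy as np

def gini(incomes):
    # standard rank-based formula: G = 2*sum(i*y_i)/(n*sum(y)) - (n+1)/n
    y = np.sort(np.asarray(incomes, dtype=float))
    n = y.size
    ranks = np.arange(1, n + 1)
    return 2.0 * (ranks * y).sum() / (n * y.sum()) - (n + 1.0) / n

rng = np.random.default_rng(9)
survey = rng.lognormal(mean=10.0, sigma=0.5, size=10_000)  # stylised survey incomes
corrected = survey.copy()
top = corrected >= np.quantile(corrected, 0.99)
corrected[top] *= 2.0   # crude stand-in for a tax-based top-income correction

print(f"Gini (survey): {gini(survey):.3f}")
print(f"Gini (top-corrected): {gini(corrected):.3f}")
```

Raising top incomes always pushes the Gini index up, which is the qualitative mechanism behind the higher tax-adjusted inequality estimates discussed in the abstract.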