The article presents the results of a statistical analysis of data on the highest wages of employees in the Slovak Republic in 2020. Descriptive analysis of the sample data is supplemented by generalizing the results to the population of all employees whose salary exceeds the 99th percentile of the sample, using selected methods of statistical inference, namely probability models of the highest wages and analysis of variance. The analysis focuses on assessing the significance of the impact of selected demographic and social factors on the highest salaries of employees in the SR in 2020 and on the differences in those salaries. The investigated factors are gender, level of education, region of residence, occupation label, and age category. The article also examines inequalities in the number of employees at different levels of the monitored factors. The results are compared with those of a similar analysis from 2010.
{"title":"Factors of Differences in the Highest Wages of Employees in the Slovak Republic (2020 vs. 2010)","authors":"V. Pacáková, Ľ. Sipková, Petr Šild","doi":"10.54694/stat.2022.6","DOIUrl":"https://doi.org/10.54694/stat.2022.6","url":null,"abstract":"The article offers the results of statistical analysis of data on the highest wages of employees in the Slovak Republic in 2020. Descriptive analysis of sample data is supplemented by generalizing the results to the population of all employees whose salary exceeds the 99th percentile of the sample, by selected methods of statistical inference, which are probability models of the highest wages and analysis of variance. The analysis focuses on assessing the significance of the impact of selected demographic and social factors on the highest salaries of employees in SR in 2020 and their differences. The investigated factors there are gender, level of education, region of residence, the label of occupation, and age category. The article also focuses on inequalities in the number of employees at different levels of the monitored factors. The obtained results of the analysis are compared with the results of similar analysis from 2010.","PeriodicalId":43106,"journal":{"name":"Statistika-Statistics and Economy Journal","volume":" ","pages":""},"PeriodicalIF":0.2,"publicationDate":"2022-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43837369","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The paper focuses on visualization methods suitable for the outcomes of cluster analysis of categorical data (specifically, nominal data). Since nominal data have no inherent order, their graphical representation is often challenging or very limited. This paper aims to provide a list of common visualization methods for cluster analysis of objects characterized by nominal variables. First, various plot types for cluster analysis (such as the clustering scatter plot, dendrogram, and icicle plot) are presented, and their suitability for presenting clusters of nominal data is discussed. Then, we study approaches to sorting nominal values on chart axes in a way that improves visualization of the data. Lastly, we introduce a simple alternative to the cluster scatter plot for nominal data, which makes the final visualization of the clustering solution more efficient because the patterns and groups in the data become more apparent. The suggested method is demonstrated in illustrative examples.
{"title":"Review of Visualization Methods for Categorical Data in Cluster Analysis","authors":"J. Cibulková, Barbora Kupková","doi":"10.54694/stat.2022.4","DOIUrl":"https://doi.org/10.54694/stat.2022.4","url":null,"abstract":"The paper focuses on visualization methods suitable for outcomes of cluster analysis of categorical data (nominal data, specifically). Since nominal data have no inherent order, their graphical representation is often challenging or very limited. This paper aims to provide a list of common visualization methods in the domain of cluster analysis of objects characterized by nominal variables. Firstly, the various plot types (such as clustering scatter plot, dendrogram, icicle plot) for cluster analysis are presented, and their suitability for presenting clusters of nominal data is discussed. Then, we study approaches of sorting nominal values on chart axes in such a way that would improve visualization of the data. Lastly, we introduce a simple alternative to cluster scatter plot for nominal data, that makes the final visualization of clustering solution more efficient since the pattern and groups in data are now more apparent. The suggested method is demonstrated in illustrative examples.","PeriodicalId":43106,"journal":{"name":"Statistika-Statistics and Economy Journal","volume":" ","pages":""},"PeriodicalIF":0.2,"publicationDate":"2022-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41602310","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dealing with missing data is a crucial part of everyday data analysis. The IMIC algorithm is a missing data imputation method that can handle mixed numerical and categorical datasets; this work, however, concentrates on categorical data. The paper proposes a new improvement of the IMIC algorithm. The two proposed modifications consider the number of categories in each categorical variable. Based on this information, a factor that modifies the original measure is computed. The factor equation is inspired by the Eskin similarity measure, which is known from the hierarchical clustering of categorical data. The results show that as the missing value ratio in the dataset grows, better results are achieved using the second modification. The paper also briefly analyzes the advantages and disadvantages of using the IMIC algorithm.
{"title":"Missing Data Imputation for Categorical Variables","authors":"Jaroslav Horníček, H. Řezanková","doi":"10.54694/stat.2022.3","DOIUrl":"https://doi.org/10.54694/stat.2022.3","url":null,"abstract":"Dealing with missing data is a crucial part of everyday data analysis. The IMIC algorithm is a missing data imputation method that can handle mixed numerical and categorical datasets. However, the categorical data are crucial for this work. This paper proposes the new improvement of the IMIC algorithm. The two proposed modifications consider the number of categories in each categorical variable. Based on this information, the factor, which modifies the original measure, is computed. The factor equation is inspired by the Eskin similarity measure that is known in the hierarchical clustering of categorical data. The results show that as the missing value ratio in the dataset grows, better results are achieved using the second modification. The paper also shortly analyzes the advantages and disadvantages of using the IMIC algorithm.","PeriodicalId":43106,"journal":{"name":"Statistika-Statistics and Economy Journal","volume":" ","pages":""},"PeriodicalIF":0.2,"publicationDate":"2022-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46369653","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
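As a rough illustration of the Eskin similarity measure mentioned in the abstract above, the sketch below uses the commonly cited form of the measure: two equal categories get similarity 1, and two different categories of a variable with n categories get n²/(n² + 2). This is only the Eskin measure itself, not the factor proposed in the paper or the IMIC algorithm; the function names and the sample rows are hypothetical.

```python
def eskin_similarity(a, b, n_categories):
    """Eskin similarity of two values of one nominal variable.

    Equal values score 1; a mismatch scores n^2 / (n^2 + 2), so mismatches
    on variables with many categories are penalized less.
    """
    if a == b:
        return 1.0
    n2 = n_categories ** 2
    return n2 / (n2 + 2)


def eskin_object_similarity(x, y, category_counts):
    """Average the per-variable Eskin similarities of two objects."""
    sims = [eskin_similarity(a, b, n) for a, b, n in zip(x, y, category_counts)]
    return sum(sims) / len(sims)


# Hypothetical objects described by three nominal variables.
row_a = ["red", "small", "yes"]
row_b = ["red", "large", "no"]
category_counts = [3, 4, 2]  # assumed number of categories per variable

print(eskin_object_similarity(row_a, row_b, category_counts))
```

The point of the sketch is that the number of categories in each variable enters the measure directly, which is the kind of information the abstract says the proposed modifications take into account.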
The article deals with the problem of properly selecting a theoretical distribution to describe the empirical distribution of scanner prices. In the empirical study we use scanner data from one retail chain in Poland, i.e. monthly data on natural yoghurt, drinking yoghurt, long grain rice and coffee powder sold in 212 outlets in January and February 2022. Prices and price relatives were modeled using ten selected probability distributions with non-negative support, including two-, three- and four-parameter families of distributions. In addition to visual assessment in the form of empirical PDF and CDF figures, numerical criteria were used. These include the values of information criteria such as AIC, BIC and HQIC, and p-values calculated for the K-S, AD and CVM goodness-of-fit tests. Our research showed that at least two models could be distinguished as very accurate, which provides a good background for simulation research on price indices or for the construction of so-called population price indices.
{"title":"Probability Distribution Modeling of Scanner Prices and Relative Prices","authors":"P. Sulewski, Jacek Białek","doi":"10.54694/stat.2022.14","DOIUrl":"https://doi.org/10.54694/stat.2022.14","url":null,"abstract":"The article deals with the problem of the proper selection of the theoretical distribution to describe the empirical distribution of scanner prices. In the empirical study we use scanner data from one retail chain in Poland, i.e. monthly data on natural yoghurt, drinking yoghurt, long grain rice and coffee powder sold in 212 outlets in January and February 2022. Prices and price relatives were modeled using selected ten probability distributions with non-negative support, including two, three and four-parameter family of distributions In addition to the visual assessment in the form of empirical PDF and CDF figures, numerical criteria were used. These include information criteria values such as AIC, BIC, HQIC and p values calculated for the K-S, AD and CVM goodness-of-fit tests. Our research showed that at least two models could be distinguished as very accurate, which provides a good background for simulation research on price indices or for the construction of so-called population price indices.","PeriodicalId":43106,"journal":{"name":"Statistika-Statistics and Economy Journal","volume":" ","pages":""},"PeriodicalIF":0.2,"publicationDate":"2022-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42782674","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
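The information criteria named in the abstract above can be illustrated with a minimal sketch. It fits two distributions with non-negative support by closed-form maximum likelihood and compares AIC, BIC and HQIC. The choice of the lognormal and exponential distributions, the function names, and the price values are assumptions made for illustration; they are not the ten distributions or the scanner data used in the paper.

```python
import math


def lognormal_loglik(xs):
    """Maximized log-likelihood of a lognormal fit and its parameter count."""
    logs = [math.log(x) for x in xs]
    n = len(logs)
    mu = sum(logs) / n
    var = sum((v - mu) ** 2 for v in logs) / n  # ML estimate of sigma^2
    ll = sum(
        -lx - 0.5 * math.log(2 * math.pi * var) - (lx - mu) ** 2 / (2 * var)
        for lx in logs
    )
    return ll, 2


def exponential_loglik(xs):
    """Maximized log-likelihood of an exponential fit and its parameter count."""
    n = len(xs)
    rate = n / sum(xs)  # ML estimate of the rate
    return n * math.log(rate) - rate * sum(xs), 1


def information_criteria(loglik, k, n):
    """AIC, BIC and HQIC for a model with k parameters fitted to n points."""
    return {
        "AIC": 2 * k - 2 * loglik,
        "BIC": k * math.log(n) - 2 * loglik,
        "HQIC": 2 * k * math.log(math.log(n)) - 2 * loglik,
    }


# Hypothetical unit prices, not taken from the paper.
prices = [2.49, 2.59, 2.39, 2.69, 2.55, 2.45, 2.60, 2.50]

results = {}
for name, fit in [("lognormal", lognormal_loglik), ("exponential", exponential_loglik)]:
    ll, k = fit(prices)
    results[name] = information_criteria(ll, k, len(prices))
    print(name, results[name])
```

Lower values indicate a better trade-off between fit and the number of parameters. The goodness-of-fit tests (K-S, AD, CVM) mentioned in the abstract are not covered by this sketch.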
In this paper, the author aims to describe the dissemination of population census microdata by National Statistical Offices. This type of data is highly confidential, and approaches to its protection vary across the world. National Statistical Offices mostly strive to publish as much of their data as possible, but they are bound by national and international laws to protect the personal data of respondents. The primary goal is to map the differences between countries and to categorize them. Different approaches to microdata availability are described, and various data access approaches are depicted. The information was obtained from publicly available documentation and from a survey in which selected statistical offices were contacted. It was found that, of the 223 countries (including dependent territories), 100 have made microdata available to the scientific community, with 30 also providing microdata access to the public. This paper presents a mapped overview and aggregated information on the publication of population census microdata from around the world.
{"title":"Population Census Microdata Availability","authors":"J. Novak","doi":"10.54694/stat.2021.44","DOIUrl":"https://doi.org/10.54694/stat.2021.44","url":null,"abstract":"In this paper, the author aims to describe the dissemination of microdata from the population census by National Statistical Offices. This type of data is highly confidential, and approaches to protection vary across the world. National Statistical Offices mostly strive to publish their data as much as possible, but they are bounded by national and international laws to protect the personal data of respondents. The primary goal is mapping the differences between countries and their categorization. Different approaches to microdata availability are described, and various data access approaches are depicted. The information was obtained from publicly available documentation and a survey in which selected statistical offices were contacted. Discovered were that of the 223 countries (including dependent territories), 100 countries have made microdata available for the scientific community, with 30 countries also providing microdata access to the public. This paper presents a mapped overview and aggregated information on the publication of microdata of the population census from around the world.","PeriodicalId":43106,"journal":{"name":"Statistika-Statistics and Economy Journal","volume":" ","pages":""},"PeriodicalIF":0.2,"publicationDate":"2022-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46919986","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Population ageing is a natural demographic process that is intensifying in Poland, Czechia and other countries. It is a demographically important issue, as it is related to many aspects of life, such as the social care system, the healthcare system and the pension system. The article presents selected demographic coefficients in the traditional approach, in which measures are constructed from the share of elderly people in the total population or from the relationship between different age groups. The article also presents coefficients in potential (static) terms, in which not only the size of the age groups matters, but also how many years a person or age group can still be expected to live. The values of population ageing coefficients in terms of potential and traditional demography were calculated for Poland and Czechia.
{"title":"Selected Coefficients of Demographic Old Age in Traditional and Potential Terms on the Example of Poland and Czechia","authors":"Joanna Adrianowska","doi":"10.54694/stat.2022.22","DOIUrl":"https://doi.org/10.54694/stat.2022.22","url":null,"abstract":"The aging process of the population is a natural demographic process, which is gaining more and more intensity both in Poland, Czechia and other countries. This is a demographically important issue, as it is related to many aspects of life, such as the social care system, the healthcare system or the pension system. The article presents selected demographic coefficients in the traditional approach, in which the construction of measures is based on determining the participation of elderly people in the total population or reflecting the relationship between different age groups. The article also presents coefficients in potential (static) terms, in which not only the number of age groups is important, but also how many years a person or age group can still survive. The values of population ageing coefficients in terms of potential and traditional demography were calculated on the example of Poland and Czechia.","PeriodicalId":43106,"journal":{"name":"Statistika-Statistics and Economy Journal","volume":" ","pages":""},"PeriodicalIF":0.2,"publicationDate":"2022-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46985301","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The fundamental problem with the measurement of preferences is that it attempts to measure something that is not only, by its nature, “unmeasurable”, but also hidden from direct observation. In addition, a person’s current emotional, material and social situation influences the measurement of preferences resulting from that person’s system of values. The paper is a study on the methodology of preference measurement, comparing and evaluating two methods of scale construction: the Thurstone procedure for finding scale separations and the simplest rank method of scaling. The study examines the relative merits of the Thurstone and rank techniques of scale construction.
{"title":"Methodological Aspects of Measuring Preferences Using the Rank and Thurstone Scale","authors":"Joanna Dȩbicka, E. Mazurek, K. Ostasiewicz","doi":"10.54694/stat.2022.5","DOIUrl":"https://doi.org/10.54694/stat.2022.5","url":null,"abstract":"The fundamental problem with the measurement of preferences is that it not only attempts to measure something that is, by its nature, “unmeasurable”, but also hidden from a direct observation. In addition, a person’s current emotional, material and social situation influences the measurement of preferences resulting from the person’s system of values. The paper is a study on the methodology of preference measurement, a comparison and evaluation of two methods of scale construction. Among various techniques we investigate the two methods: Thurstone procedure for finding scale separations developed by Thurstone and the simplest rank method of scaling. This study examines the relative merits of Thurstone and rank techniques of scale construction.","PeriodicalId":43106,"journal":{"name":"Statistika-Statistics and Economy Journal","volume":" ","pages":""},"PeriodicalIF":0.2,"publicationDate":"2022-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46048105","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
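To make the two scaling approaches in the abstract above more concrete, here is a minimal sketch. It assumes Thurstone's Case V model (equal, uncorrelated discriminal dispersions), which the abstract does not specify, and a plain mean-rank score for the rank method. The function names and the input data are hypothetical.

```python
from statistics import NormalDist


def thurstone_case_v(p):
    """Scale values from a matrix of pairwise preference proportions.

    p[i][j] is the proportion of judges who prefer item i to item j.
    Each proportion is turned into a standard normal deviate, and the
    row means give scale values centered at zero. Proportions must lie
    strictly between 0 and 1.
    """
    inv = NormalDist().inv_cdf
    k = len(p)
    z = [[0.0 if i == j else inv(p[i][j]) for j in range(k)] for i in range(k)]
    return [sum(row) / k for row in z]


def mean_rank_scores(rankings):
    """Mean rank of each item over all judges (rank 1 = most preferred)."""
    n = len(rankings)
    k = len(rankings[0])
    return [sum(r[i] for r in rankings) / n for i in range(k)]


# Hypothetical proportions for three items.
proportions = [
    [0.5, 0.7, 0.9],
    [0.3, 0.5, 0.8],
    [0.1, 0.2, 0.5],
]
scale = thurstone_case_v(proportions)
print(scale)

# Hypothetical rankings by three judges; position i holds the rank of item i.
ranks = mean_rank_scores([[1, 2, 3], [1, 3, 2], [2, 1, 3]])
print(ranks)
```

In the Thurstone sketch a higher scale value means a more preferred item, and the distances between values carry information. The mean-rank scores only order the items, with a lower mean rank meaning a more preferred item.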
In the paper, we present a methodology for calculating the benefit of a marriage reverse annuity using the multiple state model for marriage life insurance. We model the probabilistic structure and cash flows arising from marriage reverse annuity contracts for the joint-life status and the last-survivor status, assuming that the spouses' future lifetimes are independent. The interest rate is usually assumed to be constant over the years, which is not a realistic assumption. Therefore, the purpose of this article is to calculate benefits under the assumption that the interest rate is a stochastic process or a fuzzy number model of the constant interest rate. We conduct a comparative analysis of the benefit amounts (taking into account different payment frequencies) for the different interest rate models.
{"title":"Modelling Marital Reverse Annuity Contract in a Stochastic Economic Environment","authors":"Joanna Dȩbicka, S. Heilpern, A. Marciniuk","doi":"10.54694/stat.2022.2","DOIUrl":"https://doi.org/10.54694/stat.2022.2","url":null,"abstract":"In the paper, we present the methodology of calculating the benefit of a marriage reverse annuity using the multiple state model for marriage life insurance. We model the probabilistic structure and cash flows arising from marriage reverse annuity contracts in the case of the joint-life status and the last surviving status assuming that the spouses' future lifetimes are independent. Usually, it is assumed that the interest rate is constant and the same through the years. It is not a realistic assumption. Therefore, this article's purpose is to calculate benefits under the assumption that the interest rate is a stochastic process or a fuzzy number model of the constant interest rate. We conduct a comparative analysis of the amount of benefit (taking into account the different frequency of their payment) for the different models of interest rates.","PeriodicalId":43106,"journal":{"name":"Statistika-Statistics and Economy Journal","volume":" ","pages":""},"PeriodicalIF":0.2,"publicationDate":"2022-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46136640","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
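The independence assumption in the abstract above has a simple consequence for the two statuses, shown here in standard actuarial notation. The notation is not taken from the paper: ${}_tp_x$ and ${}_tp_y$ are assumed to denote the probabilities that spouses aged $x$ and $y$ survive $t$ more years.

```latex
% Joint-life status: both spouses are alive after t years.
{}_tp_{xy} = {}_tp_x \cdot {}_tp_y

% Last-survivor status: at least one spouse is alive after t years.
{}_tp_{\overline{xy}} = {}_tp_x + {}_tp_y - {}_tp_x \cdot {}_tp_y
```

Under a constant interest rate these probabilities would be combined with a fixed discount factor. The abstract's point is that this factor is instead modelled as a stochastic process or as a fuzzy number.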
Financial intermediation services indirectly measured, or simply FISIM, is an adjustment made in national accounts that constitutes a significant element of the output of financial institutions. The methodological aspects of this adjustment are therefore still a widely discussed issue. In the Czech Republic, the institution responsible for the estimation is the Czech Statistical Office. The paper analyses the approach of this institution in depth and compares it with the opinions of many authors. Based on this literature review, the aim of the paper is to propose improvements to the current estimation and to identify other options for estimating the value of FISIM as accurately as possible.
{"title":"Fisim Methodology and Options of Its Estimation: the Case of the Czech Republic","authors":"J. Vincenc","doi":"10.54694/stat.2021.26","DOIUrl":"https://doi.org/10.54694/stat.2021.26","url":null,"abstract":"Financial intermediation services indirectly measured, or simply FISIM, is an adjustment made in national accounts which constitutes significant element in output of the financial institutions. Therefore, the methodological aspects of this adjustment are still broadly discussed issue. In case of the Czech Republic, the institution responsible for the estimation is the Czech Statistical Office. The paper deeply analyses the approach of this institution and compare it with opinions of many authors. Based on this literature research, the aim of this paper is to propose improvements in the current estimation and find out other options how to estimate the most accurate value of FISIM.","PeriodicalId":43106,"journal":{"name":"Statistika-Statistics and Economy Journal","volume":" ","pages":""},"PeriodicalIF":0.2,"publicationDate":"2022-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46836857","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In every corner of the world, the issue of the pension system is being addressed. One of the most important documents offering an evaluation of pension systems is the study by the Mercer consulting firm and the CFA Institute, prepared in cooperation with the Monash Center for Financial Studies. Since Slovakia is not among the countries that these organizations evaluate in their study, this paper offers a calculation of the Global Pension Index for Slovakia for the year 2020. Based on the data obtained and a grading scale from A to E, Slovakia is rated C+, with a total score of 65 points out of 100, as a country with a pension system “that has some good features, but also includes major risks and/or shortcomings that should be addressed. Without these improvements, its efficacy and/or long-term sustainability can be questioned.” The problems that affect the pension index of Slovakia are very low pensions for low-income groups, pension assets amounting to 14.35% of GDP, a labour force participation rate of 4.5% for those aged 65 and over, and low real economic growth.
{"title":"The Global Pension Index of Slovakia","authors":"J. Gubalová, Petra Medveďová, Jana Špirková","doi":"10.54694/stat.2021.38","DOIUrl":"https://doi.org/10.54694/stat.2021.38","url":null,"abstract":"In every corner of quality of the world, the issue of pension system is being addressed. One of the most important documents that has offered its evaluation is the Mercer consulting firm and the CFA Institute, in cooperation with the Monash Center for Financial Studies. Since Slovakia is not included among the countries that evaluate these companies in their study, this paper offers the calculation of the Global Pension Index for Slovakia in the year 2020. Based on the data obtained and the grade from A to E, Slovakia is one of the countries that are rated by C+ with a total score of 65 points out of 100 as a country with a pension system “that has some good features, but also includes major risks and/or shortcomings that should be addressed. Without these improvements, its efficacy and/or long-term sustainability can be questioned.” The problems that affect the pension index of Slovakia are very low pensions for low-income groups, the level of pension assets as a percentage of GDP at the level of 14.35%, the participation in the labour rate at the level of 4.5% for the age 65 and over, and low real economic growth.","PeriodicalId":43106,"journal":{"name":"Statistika-Statistics and Economy Journal","volume":" ","pages":""},"PeriodicalIF":0.2,"publicationDate":"2022-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46703692","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}