Pub Date: 2025-08-01
DOI: 10.1007/s10182-025-00536-3
Christian Mücher, Giorgio Calzolari, Roxana Halbleib
We provide the first "frequentist" method to estimate the parameters of multivariate stochastic volatility models with latent factor structures, which capture the time-varying variance–covariance of financial returns. These models alleviate the standard curse of dimensionality by allowing the number of parameters to increase only linearly with the number of series. Although theoretically very appealing, they have found only limited practical application due to their huge computational burden. Our estimation method is simple to implement and consists of two steps: first, we estimate the loadings and the unconditional variances by maximum likelihood; then, we use the efficient method of moments to estimate the parameters of the stochastic volatility structure, with generalised autoregressive conditional heteroskedasticity (GARCH) models serving as auxiliary models. In a comprehensive Monte Carlo study, we show that our method estimates the parameters of interest accurately. The simulation study and an application to daily returns on 148 stocks provide sound evidence of the computational feasibility and practical applicability of the proposed method.
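To illustrate the role of the auxiliary model in the second step, here is a minimal sketch in which a GARCH(1,1) is fitted by Gaussian quasi-maximum likelihood to returns simulated from a univariate stochastic volatility process. The data-generating parameters, the QML fit, and all names are our own illustration, not the paper's code:

```python
import numpy as np
from scipy.optimize import minimize

def garch11_negloglik(params, r):
    # Gaussian quasi-log-likelihood of GARCH(1,1): h_t = w + a*r_{t-1}^2 + b*h_{t-1}
    w, a, b = params
    if w <= 0 or a < 0 or b < 0 or a + b >= 1:
        return 1e10                      # crude barrier for invalid parameters
    h = np.empty_like(r)
    h[0] = np.var(r)
    for t in range(1, len(r)):
        h[t] = w + a * r[t - 1] ** 2 + b * h[t - 1]
    return 0.5 * np.sum(np.log(h) + r ** 2 / h)

def fit_garch11(r):
    res = minimize(garch11_negloglik, x0=[0.1 * np.var(r), 0.05, 0.85],
                   args=(r,), method="Nelder-Mead")
    return res.x

# Simulate returns from a log-normal SV process, then fit the auxiliary model
rng = np.random.default_rng(0)
T = 2000
logh = np.zeros(T)
for t in range(1, T):
    logh[t] = 0.95 * logh[t - 1] + 0.2 * rng.standard_normal()
r = np.exp(logh / 2) * rng.standard_normal(T)
w, a, b = fit_garch11(r)
```

The paper's second step then matches GARCH-implied moments via the efficient method of moments; the sketch above only shows the auxiliary-model side.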
Title: Sequential estimation of multivariate factor stochastic volatility models
AStA Advances in Statistical Analysis, 110(1), 41–63.
Pub Date: 2025-07-21
DOI: 10.1007/s10182-025-00534-5
Mariaelena Bottazzi Schenone, Maurizio Vichi
This paper explores the use of clustering to rank multivariate observations, linking ranking to clustering through the concept of a Linear Ordered Partition (LOP). A LOP allows optimal clustering into ordered "equivalence classes": unlike a simple ordering of units, cluster ranking identifies classes within which units are "incomparable". The aim is to partition units into clusters with statistically distinct centroids, leading to an optimally ranked total order of clusters in which units within each cluster are treated as "ties". The proposed model finds the best least-squares (LS) LOP together with a univariate transformation of the observed variables: it identifies the LS LOP by orthogonally projecting the multivariate units onto a line, thereby creating a composite indicator that summarizes the observed variables. The model's theoretical properties are discussed, and a large simulation study demonstrates its performance across different scenarios. Three real-data applications highlight the method's potential across different fields.
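The projection-then-cluster idea can be sketched as follows: units are projected onto the first principal axis to form a composite indicator, and a one-dimensional k-means then produces clusters that inherit a total order from the line. This is only a rough analogue of the LS LOP criterion, with invented data and parameter choices:

```python
import numpy as np

def lop_like_ranking(X, k, n_iter=50):
    # Project units on the first principal axis (a composite indicator),
    # then run 1-D k-means so clusters inherit a total order from the line.
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    score = Xc @ Vt[0]                      # univariate composite indicator
    centers = np.quantile(score, np.linspace(0.1, 0.9, k))
    for _ in range(n_iter):                 # Lloyd's algorithm in one dimension
        labels = np.argmin(np.abs(score[:, None] - centers[None, :]), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = score[labels == j].mean()
    order = np.argsort(centers)             # rank clusters along the line
    rank = np.empty(k, dtype=int)
    rank[order] = np.arange(k)
    return rank[labels], np.sort(centers)

# Three well-separated groups in three dimensions
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(m, 0.3, size=(30, 3)) for m in (0.0, 2.0, 4.0)])
labels, centers = lop_like_ranking(X, k=3)
```

Note the direction of the projection axis is arbitrary, so the cluster ranks may come out reversed; only the ordering along the line is meaningful.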
Title: Clustering for ranking multivariate data by Linear Ordered Partitions
AStA Advances in Statistical Analysis, 110(1), 117–148.
Pub Date: 2025-06-18
DOI: 10.1007/s10182-025-00531-8
Nihan Acar-Denizli, Pedro Delicado
Wearable devices and sensors have recently become a popular way to collect data, especially in the health sciences. Sensors allow patients to be monitored over a period of time at a high observation frequency. Due to the continuous-in-time structure of the data, novel statistical methods are recommended for the analysis of sensor data. One popular approach to analyzing wearable sensor data is functional data analysis. The main objective of this paper is to review functional data analysis methods applied to wearable device data, organized by type of sensor. In addition, we introduce several freely available software packages and open databases of wearable device data to facilitate access to sensor data in different fields.
Title: Functional data analysis for wearable sensor data: a systematic review
AStA Advances in Statistical Analysis, 109(3), 591–631.
Pub Date: 2025-06-10
DOI: 10.1007/s10182-025-00529-2
Christoph Muehlmann, Claudia Cappello, Sandra De Iaco, Klaus Nordhausen
This paper introduces a novel approach to spatial blind source separation (SBSS) that addresses the limitations of existing methods. Current SBSS techniques rely on the joint diagonalization of multiple local covariance matrices, all of which assume isotropy. To overcome this constraint, anisotropic local covariance matrices that relax the isotropy assumption are proposed. A simulation study and an application to real-world data demonstrate the performance improvement obtained by incorporating these anisotropic covariance matrices into the SBSS framework, and highlight the potential of this new approach for more accurate and flexible source separation in spatial data analysis.
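As a rough illustration of the local-covariance construction underlying SBSS, the sketch below computes a pairwise-weighted covariance matrix under an isotropic ring kernel and under an anisotropic variant that keeps only near-horizontal lags. The specific kernel forms and data are our own assumptions, not the paper's:

```python
import numpy as np

def local_cov(X, coords, kernel):
    # M = (1/n) * sum over pairs of kernel(s_i - s_j) * x_i x_j^T, symmetrized;
    # X is assumed centred, kernel maps a spatial lag vector to a weight.
    n, p = X.shape
    M = np.zeros((p, p))
    for i in range(n):
        for j in range(n):
            w = kernel(coords[i] - coords[j])
            if w:
                M += w * np.outer(X[i], X[j])
    M /= n
    return 0.5 * (M + M.T)

# Isotropic ring kernel versus an anisotropic kernel restricted to
# lags within 45 degrees of the x-axis (the relaxed-isotropy idea).
ring = lambda h: float(0.5 < np.linalg.norm(h) <= 1.5)
def horizontal(h):
    r = np.linalg.norm(h)
    if not (0.5 < r <= 1.5):
        return 0.0
    return float(abs(h[1]) <= abs(h[0]))

rng = np.random.default_rng(2)
coords = np.array([(i, j) for i in range(15) for j in range(15)], float)
X = rng.standard_normal((coords.shape[0], 2))
X -= X.mean(axis=0)
M_iso = local_cov(X, coords, ring)
M_aniso = local_cov(X, coords, horizontal)
```

In SBSS, several such matrices for different kernels are then jointly diagonalized; the anisotropic kernels add directional information that isotropic rings discard.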
Title: Anisotropic local covariance matrices for spatial blind source separation
AStA Advances in Statistical Analysis, 109(4), 753–770.
Pub Date: 2025-06-05
DOI: 10.1007/s10182-025-00528-3
Sanghun Cha, Joon Jin Song, Kyeong Eun Lee
When estimating treatment effects in observational studies, propensity score analysis (PSA) is commonly used to reduce the bias that arises when confounders interfere with causal inference. However, propensity score (PS) estimation is unstable when some confounders are densely measured and form high-dimensional data, which can ultimately result in a biased estimate of the treatment effect. We propose two two-stage analytic procedures to mitigate the high-dimensionality problem: ridge PSA and functional PSA. In addition, conventional variance estimation of treatment effect estimates in PSA methods tends to be biased, so we leverage the empirical bootstrap to develop a valid variance estimator. In a simulation study, we compare the bias and MSE of treatment effects estimated by ridge PSA and functional PSA under various confounding structures, including more densely measured confounders, and evaluate the performance of the bootstrap variance estimators. The proposed methods are applied in a case study of police shootings.
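A minimal sketch of ridge PSA with an empirical bootstrap variance estimator, assuming a ridge-penalized logistic propensity model and inverse-probability weighting; the penalty level, simulated design, and estimator details are our own illustration:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

def ridge_propensity(Xmat, treat, lam):
    # Ridge-penalized logistic regression for the propensity score
    n, p = Xmat.shape
    Z = np.hstack([np.ones((n, 1)), Xmat])
    def negloglik(beta):
        eta = Z @ beta
        return np.sum(np.logaddexp(0, eta) - treat * eta) + lam * beta[1:] @ beta[1:]
    def grad(beta):
        g = Z.T @ (expit(Z @ beta) - treat)
        g[1:] += 2 * lam * beta[1:]          # intercept left unpenalized
        return g
    beta = minimize(negloglik, np.zeros(p + 1), jac=grad, method="L-BFGS-B").x
    return expit(Z @ beta)

def ipw_ate(y, treat, ps):
    # Inverse-probability-weighted average treatment effect
    return np.mean(treat * y / ps) - np.mean((1 - treat) * y / (1 - ps))

rng = np.random.default_rng(3)
n, p = 400, 30                               # many, mostly irrelevant, confounders
Xmat = rng.standard_normal((n, p))
treat = rng.binomial(1, expit(0.5 * Xmat[:, 0])).astype(float)
y = 2.0 * treat + Xmat[:, 0] + rng.standard_normal(n)   # true effect = 2

ps = ridge_propensity(Xmat, treat, lam=5.0)
ate = ipw_ate(y, treat, ps)

# Empirical bootstrap: re-estimate the propensity score in every resample
boot = []
for _ in range(30):
    idx = rng.integers(0, n, n)
    ps_b = ridge_propensity(Xmat[idx], treat[idx], lam=5.0)
    boot.append(ipw_ate(y[idx], treat[idx], ps_b))
se = np.std(boot, ddof=1)
```

Re-estimating the PS inside each resample is what lets the bootstrap capture the estimation uncertainty that plug-in variance formulas miss.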
Title: High-dimensional confounding adjustment in causal inference
AStA Advances in Statistical Analysis, 109(3), 463–481.
Pub Date: 2025-05-30
DOI: 10.1007/s10182-025-00527-4
S. K. Ghoreishi, Jingjing Wu, Qingrun Zhang, Ghazal S. Ghoreishi
In this paper, we define a penalized-distance likelihood function. This function is much more flexible than available likelihood functions and can be used in many disciplines. Based on it, we introduce a statistic for hypothesis testing and derive its asymptotic distribution. This statistic can be used to test a partial hypothesis in the parameter space for both non-sparse and sparse high-dimensional data. We also discuss a relevant Bayesian analysis using the Markov chain Monte Carlo (MCMC) method. Finally, we carry out a simulation study and apply our model to a real dataset.
Title: Using penalized-distance likelihood functions to analyze high-dimensional sparse/non-sparse data
AStA Advances in Statistical Analysis, 109(3), 509–528.
Pub Date: 2025-04-28
DOI: 10.1007/s10182-025-00526-5
Michael Balzer, Elisabeth Bergherr, Swen Hutter, Tobias Hepp
In various real-world applications, researchers often work with compositional data, which appear as proportions, amounts, or rates. Dirichlet regression models have been introduced as a framework for dealing with the unique nature of compositional data. In this article, we propose a novel model-based gradient boosting approach for Dirichlet regression models embedded in the framework of generalized additive models for location, scale and shape. This approach allows for data-driven variable selection in low- as well as high-dimensional data settings. Moreover, the implementation enables the direct calculation of marginal effects for different predictor variables. It thus provides an alternative estimation procedure to the well-established approach based on the maximum likelihood principle. After detailed simulation studies evaluating prediction accuracy and variable selection in low- and high-dimensional settings, we present a real-world application concerning changes in election results during the Great Recession, using a large-scale European dataset. With the proposed approach, we investigate the effect of protests on the voting proportions of distinct party families while identifying important socioeconomic variables and their effects on those voting proportions via variable selection.
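The variable-selection mechanism of component-wise gradient boosting can be illustrated with squared-error loss in place of the Dirichlet likelihood (a deliberate simplification; data and tuning values are invented):

```python
import numpy as np

def componentwise_boost(X, y, mstop=200, nu=0.1):
    # Component-wise L2 boosting: each iteration fits every single-variable
    # least-squares base learner to the current residuals and updates only
    # the best-fitting one, which yields built-in variable selection.
    n, p = X.shape
    coef = np.zeros(p)
    offset = y.mean()
    resid = y - offset
    for _ in range(mstop):
        slopes = X.T @ resid / (X ** 2).sum(axis=0)      # per-variable OLS slopes
        rss = ((resid[:, None] - X * slopes[None, :]) ** 2).sum(axis=0)
        j = np.argmin(rss)                               # best base learner
        coef[j] += nu * slopes[j]                        # damped update
        resid = y - offset - X @ coef
    return offset, coef

rng = np.random.default_rng(4)
n, p = 200, 20
X = rng.standard_normal((n, p))
y = 1.5 * X[:, 0] - 2.0 * X[:, 3] + 0.1 * rng.standard_normal(n)
offset, coef = componentwise_boost(X, y)
selected = np.flatnonzero(np.abs(coef) > 1e-8)
```

In the paper's setting the squared-error base fit is replaced by negative-gradient fits of the Dirichlet log-likelihood, one set of base learners per distributional parameter, but the selection mechanism is the same.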
Title: Gradient boosting for Dirichlet regression models
AStA Advances in Statistical Analysis, 110(1), 149–189.
Pub Date: 2025-03-31
DOI: 10.1007/s10182-025-00525-6
M. F. S. S. Sousa, J. M. Vasconcelos, A. D. C. Nascimento
Synthetic aperture radar (SAR) systems are highly efficient tools for addressing remote sensing challenges. They offer several advantages, such as operating independently of atmospheric conditions and producing images with high spatial resolution. However, SAR images are often contaminated by a type of interference called speckle noise, which complicates their analysis and processing. Proposing statistical methods, such as regression models, that account for speckle behavior is therefore an important step for users of SAR systems. In [ISPRS J. Photogramm. Remote Sens., 213, 1–13, 2024], the \(\mathcal{G}^{0}_{I}\) regression model (\(\mathcal{R}\mathcal{G}^{0}_{I}\) for short) was proposed as an interpretable tool to relate SAR intensity features to other physical properties. The authors employed maximum likelihood estimators (MLEs), known for their good asymptotic properties but prone to considerable bias in small and medium sample sizes. In this paper, we propose a matrix expression for the second-order bias of the MLEs of the \(\mathcal{R}\mathcal{G}^{0}_{I}\) parameters, based on the Cox and Snell method. This proposal is motivated by the need to use small and moderate windows when processing SAR images, for instance for classification and filtering. We compare bias-corrected MLEs with their uncorrected counterparts using both Monte Carlo experiments and an application to SAR data from a Brazilian region. Numerical evidence demonstrates the effectiveness of our proposal.
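For intuition about Cox and Snell corrections, a classical textbook case: for the exponential rate, the first-order bias of the MLE is lambda/n, so the plug-in corrected estimator is the MLE times (1 - 1/n). This is not the \(\mathcal{R}\mathcal{G}^{0}_{I}\) model, just the same correction principle in its simplest setting:

```python
import numpy as np

def mle_rate(x):
    # MLE of the exponential rate: 1 / sample mean
    return 1.0 / np.mean(x)

def cox_snell_corrected(x):
    # The first-order (Cox-Snell) bias of the rate MLE is lambda/n;
    # plugging in the MLE gives the corrected estimator mle * (1 - 1/n).
    return mle_rate(x) * (1 - 1.0 / len(x))

rng = np.random.default_rng(5)
lam, n, reps = 2.0, 10, 20000            # small n makes the bias visible
samples = rng.exponential(scale=1 / lam, size=(reps, n))
raw = np.array([mle_rate(s) for s in samples])
corr = np.array([cox_snell_corrected(s) for s in samples])
```

The raw MLE overestimates the rate on average (its exact expectation is n*lambda/(n-1)), while the corrected estimator is centred much closer to the true value, which mirrors the small-window motivation of the paper.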
Title: Bias-corrected estimation for \(\mathcal{G}^{0}_{I}\) regression with applications
AStA Advances in Statistical Analysis, 109(3), 557–589.
Pub Date: 2025-02-22
DOI: 10.1007/s10182-025-00524-7
Maria Felice Arezzo, Giuseppina Guagnano, Domenico Vitale
Quasi-formal employment refers to a situation in which a formal employer and a formal employee agree to declare only part of the wage, with the rest paid in cash to avoid tax liabilities. This phenomenon is detrimental to government budgets and to worker protection, so it is crucial to understand its extent and its drivers. The Eurobarometer survey no. 498 provides the information needed to achieve these objectives. However, several issues in the data must be addressed with credible solutions before the inferences drawn can be relied upon. These issues primarily concern the reliability of the data, which is compromised by social desirability bias: the tendency of respondents to provide false information when answering sensitive questions about socially stigmatized behaviors, such as tax evasion. In this work, we present a unified framework for modeling such survey data that overcomes the problems raised by social desirability bias and accommodates the structure of the variables of interest. In particular, we propose a two-part beta regression model in which part one models participation in quasi-formal employment, while part two models the share of annual gross income earned under the table. We allow the dependent variables of both parts to be mismeasured, to handle social desirability bias, and we cast part two in a beta regression framework suitable for the limited dependent variable representing the share of the wage paid in cash. The performance of the estimators is evaluated through a Monte Carlo simulation study and compared with that of a standard procedure that ignores social desirability bias. An application to the Eurobarometer survey no. 498 on quasi-formal employment is provided.
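A stripped-down sketch of the two-part likelihood (logistic participation plus a beta regression with logit mean link for the positive shares), ignoring the misclassification component that is central to the paper; the design, parameter values, and names are our own:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit, gammaln

def twopart_negloglik(params, z, x, y):
    # Part one: logistic model for participation (y > 0).
    # Part two: beta regression with logit mean link and precision phi
    # for the positive shares. Mismeasurement is NOT modeled here.
    k = z.shape[1]
    g, b, phi = params[:k], params[k:-1], np.exp(params[-1])
    d = (y > 0).astype(float)
    eta1 = z @ g
    ll1 = np.sum(d * eta1 - np.logaddexp(0, eta1))
    pos = y > 0
    mu = expit(x[pos] @ b)
    a1, a2 = mu * phi, (1 - mu) * phi
    ll2 = np.sum(gammaln(phi) - gammaln(a1) - gammaln(a2)
                 + (a1 - 1) * np.log(y[pos]) + (a2 - 1) * np.log(1 - y[pos]))
    return -(ll1 + ll2)

rng = np.random.default_rng(6)
n = 1000
z = np.column_stack([np.ones(n), rng.standard_normal(n)])
x = z.copy()
part = rng.binomial(1, expit(z @ np.array([-0.5, 1.0])))
mu = expit(x @ np.array([0.2, 0.8]))
share = rng.beta(mu * 20.0, (1 - mu) * 20.0)        # true precision = 20
y = np.where(part == 1, share, 0.0)                 # zeros for non-participants

res = minimize(twopart_negloglik, np.zeros(5), args=(z, x, y), method="BFGS")
g_hat, b_hat, phi_hat = res.x[:2], res.x[2:4], np.exp(res.x[-1])
```

The paper's contribution is precisely what is omitted here: allowing the observed participation indicator and share to be mismeasured versions of the true ones, which modifies both likelihood contributions.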
Title: A two-part beta regression with mismeasured dependent variable for modeling quasi-formal employment in Europe
AStA Advances in Statistical Analysis, 110(1), 191–218.