Pub Date: 2024-07-26 | DOI: 10.1177/0282423x241246617
Farley Ishaak, P. Ouwehand, Hilde Remøy
Constructing price indices for commercial real estate (CPPIs) is challenging due to heterogeneous and limited observations. Common price index methods often result in volatile index series. Attempts to reduce volatility often lead to frequent revisions of the entire index series and a loss of methodological index properties. When it comes to CPPIs in official statistics, both volatility and frequent revisions are undesirable. Revisions could compromise the confidence of users if indicators are allowed to change indefinitely, while unstable indices insufficiently reflect underlying structural developments. In this study, a combination of hedonic imputation, multilateral calculations, time series analysis, and window splicing is introduced. The result is a method that produces stable, limited-revisable indices with the ability to detect turning points at an early stage. Commercial real estate transactions in the Netherlands are used to empirically test the method. The resulting CPPIs appear suitable for monitoring financial stability and, therefore, seem appropriate for use in official statistics.
Title: Constructing Limited-Revisable and Stable CPPIs for Small Domains (Journal of Official Statistics)
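The paper's exact combination of methods is not reproduced here, but the window-splicing idea it builds on can be sketched: extend a published index with only the latest movement from a newly computed window, so earlier values are never revised. A minimal sketch (function name and toy figures are illustrative, not from the paper):

```python
def movement_splice(published, new_window):
    """Extend a published index with the latest movement from a newly
    computed index window, leaving all earlier values unrevised.

    published:  index values up to period T (base = 100).
    new_window: index values from a window ending at T+1; its last two
                entries cover periods T and T+1.
    """
    movement = new_window[-1] / new_window[-2]   # growth from T to T+1
    return published + [published[-1] * movement]

# Toy example: the new window implies 2% growth in the latest period.
series = [100.0, 101.0, 103.0]
window = [99.0, 101.0, 103.02]   # last movement: 103.02 / 101.0 = 1.02
extended = movement_splice(series, window)
```

Because only the final movement is spliced on, the published history stays fixed, which is the "limited-revisable" property the abstract emphasizes.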
Pub Date: 2024-07-26 | DOI: 10.1177/0282423x241249116
Kaike Wang, Qiang He, Wuyi Zeng, Chunyun Wang
From the perspective of factor input in the production process, this paper puts forward the issue of accounting and capitalization of generalized data output, brings the output of data reprocessing into the scope of statistical accounting, and proposes a systematic evolution chain of data factor forms together with a method for dividing the composition and sources of data factor value. Then, based on the GDP accounting platform, a three-in-one theoretical framework covering “output-investment-assets” is built for the capitalization accounting of the data factor, and an accounting path covering “cost → input → output → capital formation → data assets” is designed to improve and highlight the cost accounting method with “value appreciation” as its core. Taking China as an example, its data capital formation and asset size are measured by matching data-intensive industries with data professionals and by synthesizing data from multiple sources. The rationality and self-consistency of the theoretical and methodological research are verified by the empirical results. This study can provide a theoretical reference for bringing the value accounting of the data factor into the basic framework of the System of National Accounts (SNA). Moreover, its empirical research paradigm can also serve as a reference for countries carrying out capitalization accounting of the data factor (CADF).
Title: Capitalization Accounting of Data Factor: Theoretical Mechanism, Methodological Path, and Statistical Measurement
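The paper's cost-based accounting path is not fully specified in the abstract, but the standard SNA device for turning investment flows into an asset stock, the perpetual inventory method, can be sketched as a stand-in for the "capital formation → data assets" step (depreciation rate and flows below are illustrative assumptions):

```python
def perpetual_inventory(investments, depreciation_rate, initial_stock=0.0):
    """Accumulate annual investment flows into an end-of-year net
    capital stock via K_t = (1 - d) * K_{t-1} + I_t."""
    stock = initial_stock
    stocks = []
    for inv in investments:
        stock = (1.0 - depreciation_rate) * stock + inv
        stocks.append(stock)
    return stocks

# Toy annual flows of data investment with 20% geometric depreciation.
k = perpetual_inventory([10.0, 10.0, 10.0], depreciation_rate=0.2)
```

The actual depreciation profile of data assets is an open measurement question that the paper's framework addresses in far more detail.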
Pub Date: 2024-05-24 | DOI: 10.1177/0282423x241240366
Laura Bisio
ISTAT has recently released an updated version of short-term statistics on hours worked in Italy, which are used in labor input estimates by the Quarterly National Accounts (QNA). The coverage of these statistics has been expanded from firms with more than ten workers to the entire universe of Italian private firms. To include the updated indicator in the QNA estimates, the series must be reconstructed back to the first quarter of 1995 (1995q1) due to methodological requirements of the QNA. In this paper, we first reconstruct the updated indicator using the Kalman filter and smoother algorithms applied to a state-space representation of a multivariate structural model (SUTSE). Next, we comparatively assess the performance of the new indicator against the non-updated one. This assessment is based on estimates of quarterly per-employee hours worked using temporal disaggregation methods for seven economic sections spanning the non-agricultural private business economy over the period 1995q1 to 2020q4. Compared to the previous indicator, the reconstructed indicator (i) improves temporal disaggregation model fitting in the majority of economic sections considered; (ii) returns smaller forecast errors in 64.3% of the estimations, based on the mean absolute error (MAE); and (iii) ensures a higher correlation between the estimated quarterly series and the indicator in 71.4% of the estimates.
Title: Reconstructing a Short-Term Indicator by State-Space Models: An Application to Estimate Hours Worked by Quarterly National Accounts
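The paper applies the Kalman filter to a multivariate SUTSE model; as a much-reduced illustration of the same machinery, a univariate local level model can be filtered in a few lines, including the handling of missing observations that arises when a series is reconstructed backward (variances and data below are illustrative assumptions):

```python
def local_level_filter(y, q, r, a0=0.0, p0=1e7):
    """Kalman filter for the local level model
        y_t  = mu_t + eps_t,      eps_t ~ N(0, r)
        mu_t = mu_{t-1} + eta_t,  eta_t ~ N(0, q)
    Missing observations (None) are skipped: the state is only
    propagated, which is how gaps in a reconstructed series are
    bridged. Returns the filtered state means."""
    a, p = a0, p0   # diffuse-ish prior on the initial level
    filtered = []
    for obs in y:
        p = p + q                       # prediction step
        if obs is not None:
            k = p / (p + r)             # Kalman gain
            a = a + k * (obs - a)       # update step
            p = (1.0 - k) * p
        filtered.append(a)
    return filtered

# Toy series with a gap, as when reconstructing an indicator back in time.
est = local_level_filter([10.0, None, 12.0, 11.5], q=0.1, r=1.0)
```

The full SUTSE setup adds correlated disturbances across series and a smoother pass; both are omitted here for brevity.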
Pub Date: 2024-05-23 | DOI: 10.1177/0282423x241246517
James O. Chipperfield, Randall Chu, Li-Chun Zhang, Bernard Baffour
The first key use of a nation’s census is to count its resident population. A census will have counting errors, often referred to as over-coverage and under-coverage, so it is common practice in many countries to conduct an independent count of residents, a so-called coverage survey, and to estimate or adjust for these counting errors within the capture-recapture framework. In recent times, many censuses and coverage surveys have faced challenges in counting the population efficiently and effectively due to rising costs, declining response rates, and respondent burden. This has led to a shift toward exploring the role that administrative registers could play in counting the population within the capture-recapture framework. Administrative registers are relatively inexpensive and can have high coverage of a nation’s population. This paper explores methods to overcome common problems with the use of administrative registers within this framework, including linking errors and scoping the register to capture only residents. These methods are empirically assessed in the context of the Australian population.
Title: Robust Statistical Estimation for Capture-Recapture Using Administrative Data
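The paper's robust extensions are not reproduced here, but the classical dual-system baseline that census coverage estimation builds on can be sketched with Chapman's nearly unbiased version of the Lincoln-Petersen estimator (the toy counts are illustrative, not from the paper):

```python
def chapman_estimate(n_census, n_survey, n_matched):
    """Chapman's version of the Lincoln-Petersen dual-system
    estimator of population size:
        N_hat = (n1 + 1)(n2 + 1) / (m + 1) - 1
    n_census:  persons counted in the census (list 1)
    n_survey:  persons counted in the coverage survey (list 2)
    n_matched: persons linked between the two lists
    """
    return (n_census + 1) * (n_survey + 1) / (n_matched + 1) - 1

# Toy numbers: 900 census records, 100 survey records, 90 links.
n_hat = chapman_estimate(900, 100, 90)
```

The estimator assumes independent lists and error-free linkage; relaxing exactly those assumptions (linking errors, register scope) is the subject of the paper.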
Pub Date: 2024-05-23 | DOI: 10.1177/0282423x241248010
Özlem Yiğit
In recent years, researchers and statisticians have increasingly used econometric techniques to generate timely, high-quality, and detailed official statistics. This study presents a modified form of a regression-based temporal disaggregation model to compile the index of production in construction for Türkiye, employing a state-space modeling approach. The model incorporates the twelve-month moving sum of deflated turnover as an observed variable and the number of employees as an exogenous variable. Finally, four alternative models—three assuming constant labor productivity and one assuming time-varying labor productivity—are compared.
Title: State-Space Modeling Approach to Exploring the Index of Production in Construction for Türkiye
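The abstract does not spell out the construction of the observed variable, but a twelve-month moving sum of deflated turnover can be sketched as follows (alignment of the deflator and the handling of the initial incomplete window are assumptions):

```python
def deflated_moving_sum(turnover, deflator, window=12):
    """Deflate monthly turnover by a price index (same length,
    base = 1.0) and take the trailing moving sum over `window`
    months. Entries before a full window is available are None."""
    real = [t / d for t, d in zip(turnover, deflator)]
    out = []
    for i in range(len(real)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(real[i + 1 - window:i + 1]))
    return out

# Toy data: 14 months of flat real turnover of 10 per month gives a
# trailing twelve-month sum of 120 once the window fills.
ms = deflated_moving_sum([10.0] * 14, [1.0] * 14, window=12)
```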
Pub Date: 2024-05-15 | DOI: 10.1177/0282423x241244671
Paul A. Parker
Small area estimation models are critical for dissemination and understanding of important population characteristics within sub-domains that often have limited sample size. The classic Fay-Herriot model is perhaps the most widely used approach to generate such estimates. However, a limiting assumption of this approach is that the latent true population quantity has a linear relationship with the given covariates. Through the use of random weight neural networks, we develop a Bayesian hierarchical extension of this framework that allows for estimation of nonlinear relationships between the true population quantity and the covariates. We illustrate our approach through an empirical simulation study as well as an analysis of median household income for census tracts in the state of California.
Title: Nonlinear Fay-Herriot Models for Small Area Estimation Using Random Weight Neural Networks
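The paper embeds random weight neural networks in a Bayesian hierarchical Fay-Herriot model; that machinery is not reproduced here, but the core "random weight" idea, a hidden layer whose weights are drawn at random and never trained, with only the output layer estimated, can be sketched with a frequentist ridge stand-in (all names and toy data are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def random_weight_features(x, hidden=50):
    """Map covariates through a single hidden layer whose weights are
    random and fixed; only the output layer is estimated later."""
    w = rng.normal(size=(x.shape[1], hidden))
    b = rng.normal(size=hidden)
    return np.tanh(x @ w + b)

def ridge_fit(h, y, lam=0.1):
    """Closed-form ridge solution for the output-layer weights."""
    return np.linalg.solve(h.T @ h + lam * np.eye(h.shape[1]), h.T @ y)

# Toy nonlinear signal the linear Fay-Herriot link could not capture.
x = np.linspace(-3, 3, 200).reshape(-1, 1)
y = np.sin(x).ravel() + rng.normal(scale=0.1, size=200)
h = random_weight_features(x, hidden=50)
beta = ridge_fit(h, y)
pred = h @ beta
```

Because the hidden weights are fixed, estimation stays linear in the output weights, which is what makes the approach tractable inside a hierarchical model.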
Pub Date: 2024-05-01 | DOI: 10.1177/0282423x241240739
A. Nigri, Susanna Levantesi, Salvatore Scognamiglio
Reliable estimates of age-specific vital rates are crucial in demographic studies, yet ages are commonly grouped into five-year bins. Public health and national systems require single-age data to achieve accurate social planning. This paper introduces a deep learning approach for splitting abridged death rates, providing a more comprehensive perspective on the indirect estimation of age-specific vital rates from grouped data. Additionally, we contribute to the existing literature by introducing a multi-population (countries and genders) approach, providing reliable estimates that account for the heterogeneity of longevity dynamics over age, over time, and across populations. We also contribute to the state of the art in indirect estimation by introducing, for the first time, a multi-population indirect estimation leveraging subnational data. Our model accurately captures mortality dynamics by age over time and among different populations. We demonstrate the model’s ability to produce reliable predictions of age-specific mortality rates by studying how the choice of hyperparameters affects model reliability and by analyzing the age-specific relative differences between the real and the estimated mortality rates.
Title: Disaggregating Death Rates of Age-Groups Using Deep Learning Algorithms
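The paper's deep learning model is not reproduced here, but the coherence constraint any such disaggregation must satisfy, single-age counts that are non-negative and sum exactly to the five-year group total, can be enforced with a softmax split (the per-age scores below are hand-set for illustration; in the paper they would be learned by the network):

```python
import math

def split_age_group(group_count, scores):
    """Split an abridged (5-year) death count into single ages with a
    softmax over per-age scores, guaranteeing non-negative single-age
    counts that sum exactly to the group total."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [group_count * e / total for e in exps]

# Toy: split 500 deaths in ages 60-64 with mildly increasing scores,
# reflecting mortality rising with age within the group.
single = split_age_group(500.0, [0.0, 0.1, 0.2, 0.3, 0.4])
```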
Pub Date: 2024-03-01 | DOI: 10.1177/0282423x241235265
Piet Daas, Wolter Hassink, Bart Klijs
A statistical classification model was developed to identify online platform organizations based on the texts on their website. The model was subsequently used to identify all (potential) platform organizations with a website included in the Dutch Business Register. The empirical outcomes of the statistical model were plausible in terms of the words and the bimodal distribution of fitted probabilities, but the results indicated an overestimation of the number of platform organizations. Next, the external validity of the outcomes was investigated through a survey of the organizations that were identified as a platform organization by the statistical classification model. The response by the organizations to the survey confirmed a substantial number of type-I errors. Furthermore, it revealed a positive association between the fitted probability of the text-based classification model and the organization’s response to the survey question on being an online platform organization. The survey results indicated that the text-based classification model can be used to obtain a subpopulation of potential platform organizations from the entire population of businesses with a website. This subpopulation may form a good starting point to study platform organizations in more detail.
Title: On the Validity of Using Webpage Texts to Identify the Target Population of a Survey: An Application to Detect Online Platforms
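One natural use of the survey follow-up described above is to correct the classifier's overestimate: estimate precision on the surveyed subsample and scale the flagged total. A minimal sketch, ignoring nonresponse and stratification, with purely illustrative numbers:

```python
def survey_adjusted_count(n_flagged, n_surveyed, n_confirmed):
    """Correct a classifier's count of platform organizations for
    type-I errors: estimate precision from a follow-up survey of
    flagged organizations and scale the flagged total by it."""
    precision = n_confirmed / n_surveyed
    return precision, precision * n_flagged

# Toy: 2,000 flagged websites; 300 surveyed, 180 confirmed platforms.
prec, adjusted = survey_adjusted_count(2000, 300, 180)
```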
Pub Date: 2024-03-01 | DOI: 10.1177/0282423x241236275
Tom Wilson
It is widely appreciated that population forecasts are inherently uncertain. Researchers have responded by quantifying uncertainty using probabilistic forecasting methods. Yet despite several decades of development, probabilistic forecasts have gained little traction outside the academic sector. This article therefore suggests an alternative and simpler approach to estimating and communicating uncertainty which might be helpful for population forecast practitioners and users. Drawing on Alho’s idea of naïve forecasts, it suggests creating “synthetic historical forecast errors” by running a regular deterministic projection model many times over recent decades. Then, borrowing from perishable food terminology, the “shelf life” of forecast variables, the number of years into the future the forecast is likely to remain “safe for consumption” (within a specified error tolerance), is estimated from the “historical” errors. The shelf lives are then applied to a current set of forecasts and presented simply in color-coded graphs and tables of forecasts. The approach is illustrated through a case study of 2021-based population forecasts for Australia. It is concluded that the approach offers a relatively straightforward way of estimating and communicating population forecast uncertainty.
Title: Visualizing the Shelf Life of Population Forecasts: A Simple Approach to Communicating Forecast Uncertainty
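The article's exact definition of shelf life may differ in detail, but one reading of the idea, the largest horizon at which most synthetic historical forecast errors stay within tolerance, can be sketched as follows (coverage threshold and error figures are illustrative assumptions):

```python
def shelf_life(errors_by_horizon, tolerance, coverage=0.8):
    """Estimate a forecast's shelf life: the largest horizon h (years
    ahead) at which at least `coverage` of the historical absolute
    percentage errors stay within `tolerance`.

    errors_by_horizon: list indexed by horizon-1; each entry holds the
    absolute percentage errors from synthetic historical forecasts."""
    life = 0
    for h, errs in enumerate(errors_by_horizon, start=1):
        within = sum(1 for e in errs if e <= tolerance) / len(errs)
        if within >= coverage:
            life = h
        else:
            break
    return life

# Toy errors that grow with horizon; 5% tolerance, 80% coverage.
errors = [
    [1.0, 2.0, 1.5, 3.0, 2.5],   # horizon 1: all within 5%
    [2.0, 4.0, 3.0, 4.5, 6.0],   # horizon 2: 4 of 5 within 5%
    [4.0, 7.0, 6.0, 8.0, 9.0],   # horizon 3: only 1 of 5 within 5%
]
life = shelf_life(errors, tolerance=5.0)
```

The resulting integer lends itself directly to the color-coded tables the article proposes: horizons up to the shelf life marked safe, beyond it flagged.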
Pub Date: 2024-03-01 | DOI: 10.1177/0282423x241235252
T. Savitsky, Luis G. León-Novelo, Helen Engle
Survey data are often randomly drawn from an underlying population of inferential interest under a multistage, complex sampling design. A sampling weight proportional to the number of individuals in the population that each sampled individual represents is released. The sampling design is informative with respect to a response variable of interest if the variable correlates with the sampling weights. The distribution of the variables of interest then differs between the sample and the population, requiring a correction to the sample distribution to approximate the population. We focus on model-based Bayesian inference for repeated (continuous) measures associated with each sampled individual. We devise a model for the joint estimation of the response variable(s) of interest and the sampling weights that accounts for the informative sampling design in a formulation capturing the association of measures taken on the same individual through individual-specific random effects. We show that our approach yields correct population inference on the observed sample of units and compare its performance with a competing method via simulation. Methods are compared using bias, mean square error, coverage, and length of credible intervals. We demonstrate our approach using a National Health and Nutrition Examination Survey dietary dataset, modeling daily protein consumption.
Title: Bayesian Inference for Repeated Measures Under Informative Sampling
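The paper's Bayesian joint model is far richer than anything shown here, but the basic effect of informative sampling, and the design-weighting that undoes it, can be illustrated with a weighted versus unweighted mean (all figures are toy assumptions):

```python
def weighted_mean(values, weights):
    """Hajek-type weighted mean: sampling weights, proportional to the
    number of population members each unit represents, undo the
    informative design so the sample approximates the population."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Toy informative design: units with high responses are oversampled,
# so each carries a small weight (it represents few population members),
# while the one low-response unit represents nine members.
values  = [10.0, 10.0, 10.0, 2.0]
weights = [1.0,  1.0,  1.0,  9.0]

naive = sum(values) / len(values)           # ignores the design
corrected = weighted_mean(values, weights)  # pulled toward the low value
```

The unweighted mean overstates the population mean because the design favored high responses; the weighted mean corrects exactly this, which is the bias the paper's model-based formulation targets for repeated measures.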