
Journal of Official Statistics: Latest Articles

Constructing Limited-Revisable and Stable CPPIs for Small Domains
IF 0.5 Zone 4 (Mathematics) Q4 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date: 2024-07-26 DOI: 10.1177/0282423x241246617
Farley Ishaak, P. Ouwehand, Hilde Remøy
Constructing price indices for commercial real estate (CPPIs) is challenging due to heterogeneous and limited observations. Common price index methods often result in volatile index series. Attempts to reduce volatility often lead to frequent revisions of the entire index series and a loss of methodological index properties. When it comes to CPPIs in official statistics, both volatility and frequent revisions are undesirable. Revisions could compromise the confidence of users if indicators are allowed to change indefinitely, while unstable indices insufficiently reflect structural underlying developments. In this study, a combination of hedonic imputation, multilateral calculations, time series analysis, and window splicing is introduced. The result is a method that produces stable and limited-revisable indices with the ability to detect turning points at an early stage. Commercial real estate transactions in the Netherlands are used to empirically test the method. The resulting CPPIs appear suitable for monitoring financial stability and, therefore, seem appropriate for use in official statistics.
Citations: 0
Capitalization Accounting of Data Factor: Theoretical Mechanism, Methodological Path, and Statistical Measurement
IF 0.5 Zone 4 (Mathematics) Q4 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date: 2024-07-26 DOI: 10.1177/0282423x241249116
Kaike Wang, Qiang He, Wuyi Zeng, Chunyun Wang
From the perspective of factor input in the production process, this paper raises the issue of accounting for and capitalizing generalized data output, brings the output of data reprocessing into the scope of statistical accounting, and proposes a systematic evolution chain of data factor forms and a method for dividing the value composition and sources of the data factor. Then, based on the GDP accounting platform, a three-in-one theoretical framework covering “output-investment-assets” is built for capitalization accounting of the data factor, and an accounting path covering “cost → input → output → capital formation → data assets” is designed to improve and highlight the cost accounting method with “value appreciation” as the core. Taking China as an example, its data capital formation and asset size are measured by matching data-intensive industries with data professionals, and by synthesizing data from multiple sources. The rationality and self-consistency of the theoretical and methodological research are verified by the empirical results. This study can provide a theoretical reference for bringing the value accounting of the data factor into the basic framework of the System of National Accounts (SNA). Moreover, its empirical research paradigm can also provide a reference for relevant countries to carry out capitalization accounting for data factor (CADF).
Citations: 0
Reconstructing a Short-Term Indicator by State-Space Models: An Application to Estimate Hours Worked by Quarterly National Accounts
IF 1.1 Zone 4 (Mathematics) Q3 Mathematics Pub Date: 2024-05-24 DOI: 10.1177/0282423x241240366
Laura Bisio
ISTAT has recently released an updated version of short-term statistics on hours worked in Italy, which are used in labor input estimates by the Quarterly National Accounts (QNA). The coverage of these statistics has been expanded from firms with more than ten workers to the entire universe of Italian private firms. To include the updated indicator within QNA estimates, the series must first be reconstructed back to the first quarter of 1995 (1995q1) due to methodological requirements of QNA. In this paper, we first reconstruct the updated indicator using the Kalman filter and smoother algorithms applied to a state-space representation of a multivariate structural model (SUTSE). Next, we comparatively assess the performance of the new indicator against the non-updated one. This assessment is based on estimates of quarterly per-employee hours worked using temporal disaggregation methods for seven economic sections spanning the non-agricultural private business economy over the period 1995q1 to 2020q4. Compared to the previous indicator, the reconstructed indicator (i) implies improvements in temporal disaggregation model fitting in the majority of economic sections considered; (ii) returns smaller forecast errors in 64.3% of the estimations, based on MAE; (iii) ensures a higher correlation between the estimated quarterly series and the indicator in 71.4% of the estimates.
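As a rough illustration of the filtering idea mentioned in this abstract, the following is a minimal sketch of a Kalman filter for a univariate local level model. It is not the multivariate SUTSE model or the smoother used in the paper; the function name, the variance parameters, and the near-diffuse prior are assumptions chosen for illustration. Missing observations (given as `None`) skip the update step, so the level is carried forward with growing variance.

```python
def local_level_filter(y, sigma2_eps, sigma2_eta, a0=0.0, p0=1e7):
    """Kalman filter for y_t = mu_t + eps_t, mu_t = mu_{t-1} + eta_t.

    y may contain None for missing observations; for those periods the
    update step is skipped and the predicted level is carried forward.
    All parameter values are hypothetical inputs, not estimates.
    """
    a, p = a0, p0  # level estimate and its variance (near-diffuse prior)
    levels, variances = [], []
    for obs in y:
        p = p + sigma2_eta  # prediction: random-walk level, variance grows
        if obs is not None:
            k = p / (p + sigma2_eps)  # Kalman gain
            a = a + k * (obs - a)     # update level with the prediction error
            p = (1.0 - k) * p         # variance after the update
        levels.append(a)
        variances.append(p)
    return levels, variances


levels, variances = local_level_filter([5.0, None, 5.0, 5.0], 1.0, 0.1)
```

In practice the variances would be estimated (for example by maximum likelihood) rather than fixed by hand, and a smoother pass would be needed to reconstruct values backward in time.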
Citations: 0
Robust Statistical Estimation for Capture-Recapture Using Administrative Data
IF 1.1 Zone 4 (Mathematics) Q3 Mathematics Pub Date: 2024-05-23 DOI: 10.1177/0282423x241246517
James O. Chipperfield, Randall Chu, Li-Chun Zhang, Bernard Baffour
The first key use of a nation’s Census is to count its resident population. A Census will have counting errors, often referred to as over-coverage and under-coverage. So it is common practice in many countries to conduct an independent count of their residents, a so-called coverage survey, and estimate or adjust for these counting errors within the capture-recapture framework. In recent times, many censuses and coverage surveys have faced challenges in counting the population efficiently and effectively due to rising costs, declining response rates, and respondent burden. This has led to a shift toward exploring the role that administrative registers could play in counting the population within the capture-recapture framework. Administrative registers are relatively inexpensive and can have high coverage of a nation’s population. This paper explores methods to overcome common problems with the use of administrative registers within this framework, including linking errors and scoping the register to only capture residents. These methods are empirically assessed in the context of the Australian population.
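For readers unfamiliar with the capture-recapture framework named above, the textbook two-list estimators can be sketched as follows. This is the classical Lincoln-Petersen estimator and Chapman's bias-corrected variant, not the robust method developed in the paper; the function and argument names are hypothetical, and the sketch assumes independent lists with error-free linkage, which are exactly the assumptions the abstract says are hard to meet with administrative registers.

```python
def lincoln_petersen(n_list1, n_list2, n_matched):
    """Classical dual-system estimate of population size.

    n_list1:   people counted on the first list (e.g. a census)
    n_list2:   people counted on the second list (e.g. a coverage survey)
    n_matched: people found on both lists after record linkage
    """
    if n_matched <= 0:
        raise ValueError("at least one matched record is required")
    return n_list1 * n_list2 / n_matched


def chapman(n_list1, n_list2, n_matched):
    """Chapman's variant, which reduces small-sample bias."""
    return (n_list1 + 1) * (n_list2 + 1) / (n_matched + 1) - 1


estimate = lincoln_petersen(900, 100, 90)
```

With 900 people on the first list, 100 on the second, and 90 matched, the second list suggests the first covers 90% of the population, so the estimate is 900 / 0.9 = 1000.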
Citations: 0
State-Space Modeling Approach to Exploring the Index of Production in Construction for Türkiye
IF 1.1 Zone 4 (Mathematics) Q3 Mathematics Pub Date: 2024-05-23 DOI: 10.1177/0282423x241248010
Özlem Yiğit
In recent years, researchers and statisticians have increasingly used econometric techniques to generate timely, high-quality, and detailed official statistics. This study presents a modified form of regression-based temporal disaggregation model to compile the index of production in construction for Türkiye, employing a state-space modeling approach. The model incorporates the twelve-month moving sum of deflated turnover as an observed variable and the number of employees as an exogenous variable. Finally, four alternative models—three assuming constant labor productivity and one assuming time-varying labor productivity—are compared.
Citations: 0
Nonlinear Fay-Herriot Models for Small Area Estimation Using Random Weight Neural Networks
IF 1.1 Zone 4 (Mathematics) Q3 Mathematics Pub Date: 2024-05-15 DOI: 10.1177/0282423x241244671
Paul A. Parker
Small area estimation models are critical for dissemination and understanding of important population characteristics within sub-domains that often have limited sample size. The classic Fay-Herriot model is perhaps the most widely used approach to generate such estimates. However, a limiting assumption of this approach is that the latent true population quantity has a linear relationship with the given covariates. Through the use of random weight neural networks, we develop a Bayesian hierarchical extension of this framework that allows for estimation of nonlinear relationships between the true population quantity and the covariates. We illustrate our approach through an empirical simulation study as well as an analysis of median household income for census tracts in the state of California.
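For context, the classic Fay-Herriot model referred to above is usually written as a sampling model plus a linear linking model. The notation below is the standard textbook form, not taken from the paper; the final line only indicates where a nonlinear function would replace the linear term, and the generic function $f$ is an assumption, since the abstract does not give the exact specification.

```latex
% Sampling model: direct estimate for area i with known sampling variance D_i
\hat{\theta}_i = \theta_i + e_i, \qquad e_i \sim N(0, D_i)

% Linking model (classic Fay-Herriot): linear in the covariates x_i
\theta_i = x_i^{\top}\beta + v_i, \qquad v_i \sim N(0, \sigma_v^2)

% Nonlinear extension (schematic): replace the linear term by a function f
\theta_i = f(x_i) + v_i
```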
Citations: 0
Disaggregating Death Rates of Age-Groups Using Deep Learning Algorithms
IF 1.1 Zone 4 (Mathematics) Q3 Mathematics Pub Date: 2024-05-01 DOI: 10.1177/0282423x241240739
A. Nigri, Susanna Levantesi, Salvatore Scognamiglio
Reliable estimates of age-specific vital rates are crucial in demographic studies, while ages are, in most cases, commonly grouped in bins of five years. Indeed, public health and national systems require single age-specific data to achieve accurate social planning. This paper introduces a deep learning approach for splitting the abridged death rates, providing a more comprehensive perspective on the indirect age-specific vital rates estimation from grouped data. Additionally, we contribute to the existing literature by introducing a multi-population (countries and genders) approach, providing reliable estimates considering the heterogeneity of longevity dynamics over age, years, and across populations. We also contribute to the state of the art in indirect estimation by introducing, for the first time, a multi-population indirect estimation leveraging subnational data. Our model accurately captures mortality dynamics by age over time and among different populations. We prove the model’s ability to estimate reliable predictions of age-specific mortality rates by also studying how the hyperparameters’ choice affects the model reliability and analyzing the age-specific relative differences between the real and the estimated mortality rates.
Citations: 0
On the Validity of Using Webpage Texts to Identify the Target Population of a Survey: An Application to Detect Online Platforms
IF 1.1 Zone 4 (Mathematics) Q3 Mathematics Pub Date: 2024-03-01 DOI: 10.1177/0282423x241235265
Piet Daas, Wolter Hassink, Bart Klijs
A statistical classification model was developed to identify online platform organizations based on the texts on their website. The model was subsequently used to identify all (potential) platform organizations with a website included in the Dutch Business Register. The empirical outcomes of the statistical model were plausible in terms of the words and the bimodal distribution of fitted probabilities, but the results indicated an overestimation of the number of platform organizations. Next, the external validity of the outcomes was investigated through a survey of the organizations that were identified as a platform organization by the statistical classification model. The response by the organizations to the survey confirmed a substantial number of type-I errors. Furthermore, it revealed a positive association between the fitted probability of the text-based classification model and the organization’s response to the survey question on being an online platform organization. The survey results indicated that the text-based classification model can be used to obtain a subpopulation of potential platform organizations from the entire population of businesses with a website. This subpopulation may form a good starting point to study platform organizations in more detail.
Citations: 0
Visualizing the Shelf Life of Population Forecasts: A Simple Approach to Communicating Forecast Uncertainty
IF 1.1 Zone 4 (Mathematics) Q3 Mathematics Pub Date: 2024-03-01 DOI: 10.1177/0282423x241236275
Tom Wilson
It is widely appreciated that population forecasts are inherently uncertain. Researchers have responded by quantifying uncertainty using probabilistic forecasting methods. Yet despite several decades of development, probabilistic forecasts have gained little traction outside the academic sector. Therefore, this article suggests an alternative and simpler approach to estimating and communicating uncertainty which might be helpful for population forecast practitioners and users. Drawing on the naïve forecasts idea of Alho, it suggests creating “synthetic historical forecast errors” by running a regular deterministic projection model many times over recent decades. Then, borrowing from perishable food terminology, the “shelf life” of forecast variables, the number of years into the future the forecast is likely to remain “safe for consumption” (within a specified error tolerance), is estimated from the “historical” errors. The shelf lives are then applied to a current set of forecasts and presented in a simple manner in graphs and tables of forecasts using color-coding. The approach is illustrated through a case study of 2021-based population forecasts for Australia. It is concluded that the approach offers a relatively straightforward way of estimating and communicating population forecast uncertainty.
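The "shelf life" idea described above can be sketched in a few lines. The abstract does not state how "likely to remain within tolerance" is operationalised, so the rule used here (a chosen quantile of past absolute percentage errors must stay within the tolerance) is an assumption, as are the function name, the default tolerance of 5%, and the default quantile of 0.8.

```python
import math


def shelf_life(abs_pct_errors_by_horizon, tolerance=5.0, quantile=0.8):
    """Longest forecast horizon whose historical errors stay within tolerance.

    abs_pct_errors_by_horizon: dict mapping horizon (years ahead) to a list
    of absolute percentage errors from past ("synthetic historical") forecasts.
    A horizon is "safe" if the given quantile of its errors is within
    tolerance; the shelf life is the last horizon before the first unsafe one.
    """
    life = 0
    for horizon in sorted(abs_pct_errors_by_horizon):
        errors = sorted(abs_pct_errors_by_horizon[horizon])
        # nearest-rank empirical quantile
        rank = max(0, math.ceil(quantile * len(errors)) - 1)
        if errors[rank] > tolerance:
            break
        life = horizon
    return life


errors = {
    1: [1.0, 2.0, 3.0, 4.0, 2.0],
    2: [2.0, 3.0, 4.0, 4.5, 3.0],
    3: [3.0, 5.0, 6.0, 7.0, 4.0],
    4: [1.0, 1.0, 1.0, 1.0, 1.0],
}
years_safe = shelf_life(errors)
```

In this made-up example the 80th-percentile error first exceeds 5% at the three-year horizon, so the shelf life is two years. Stopping at the first unsafe horizon means a later horizon with small errors (here, year 4) does not extend the shelf life.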
Citations: 0
Bayesian Inference for Repeated Measures Under Informative Sampling
IF 1.1 Zone 4 (Mathematics) Q3 Mathematics Pub Date: 2024-03-01 DOI: 10.1177/0282423x241235252
T. Savitsky, Luis G. León-Novelo, Helen Engle
Survey data are often randomly drawn from an underlying population of inferential interest under a multistage, complex sampling design. A sampling weight proportional to the number of individuals in the population that each sampled individual represents is released. The sampling design is informative with respect to a response variable of interest if the variable correlates with the sampling weights. The distribution for the variables of interest differs in the sample and in the population, requiring correction to the sample distribution to approximate the population. We focus on model-based Bayesian inference for repeated (continuous) measures associated with each sampled individual. We devise a model for the joint estimation of response variable(s) of interest and sampling weights to account for the informative sampling design in a formulation that captures the association of the measures taken on the same individual incorporating individual-specific random effects. We show that our approach yields correct population inference on the observed sample of units and compare its performance with competing methods via simulation. Methods are compared using bias, mean square error, coverage, and length of credible intervals. We demonstrate our approach using a National Health and Nutrition Examination Survey dietary dataset modeling daily protein consumption.
Citations: 0