Asta-Advances in Statistical Analysis: Latest Articles

Bayesian generalized additive model selection including a fast variational option

IF 1.4 Zone 4 (Mathematics) Q2 STATISTICS & PROBABILITY

Asta-Advances in Statistical Analysis

Pub Date : 2023-12-15 DOI: 10.1007/s10182-023-00490-y

Virginia X. He, Matt P. Wand

We use Bayesian model selection paradigms, such as group least absolute shrinkage and selection operator priors, to facilitate generalized additive model selection. Our approach allows for the effects of continuous predictors to be categorized as either zero, linear or non-linear. Employment of carefully tailored auxiliary variables results in Gibbsian Markov chain Monte Carlo schemes for practical implementation of the approach. In addition, mean field variational algorithms with closed form updates are obtained. Whilst not as accurate, this fast variational option enhances scalability to very large data sets. A package in the R language aids use in practice.

Citations: 0

A note on sufficient dimension reduction with post dimension reduction statistical inference

IF 1.4 Zone 4 (Mathematics) Q2 Social Sciences

Asta-Advances in Statistical Analysis

Pub Date : 2023-12-13 DOI: 10.1007/s10182-023-00491-x

Kyongwon Kim

Sufficient dimension reduction is a widely used tool to extract core information hidden in high-dimensional data for classifying, clustering, and predicting response variables. Various dimension reduction methods and their applications have been introduced in the past decades. Data analysis using sufficient dimension reduction involves two steps: dimension reduction and model estimation. However, when we implement the two-step modeling process, we consider the estimated sufficient predictor as a true predictor variable and proceed to the model development step, which includes statistical inference such as estimating confidence intervals and performing hypothesis tests. However, the outcome obtained using this method is by no means complete because it contains errors only from the model estimation step. Therefore, post dimension reduction inference is an important topic because it is essential to consider errors from sufficient dimension reduction. In this paper, we review the fundamentals of sufficient dimension reduction methods. Then, we introduce an intuitive and heuristic approach for the recently developed post dimension reduction statistical inference.

Citations: 0

Zero-modified count time series modeling with an application to influenza cases

IF 1.4 Zone 4 (Mathematics) Q2 STATISTICS & PROBABILITY

Asta-Advances in Statistical Analysis

Pub Date : 2023-11-27 DOI: 10.1007/s10182-023-00488-6

Marinho G. Andrade, Katiane S. Conceição, Nalini Ravishanker

The past few decades have seen considerable interest in modeling time series of counts, with applications in many domains. Classical and Bayesian modeling have primarily focused on conditional Poisson sampling distributions at each time. There is very little research on modeling time series involving Zero-Modified (i.e., Zero Deflated or Inflated) distributions. This paper aims to fill this gap and develop models for count time series involving Zero-Modified distributions, which belong to the Power Series family and are suitable for time series exhibiting both zero-inflation and zero-deflation. A full Bayesian approach via the Hamiltonian Monte Carlo (HMC) technique enables accurate modeling and inference. The paper illustrates our approach using time series on the number of deaths from the influenza virus in the city of São Paulo, Brazil.
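
The idea of a zero-modified distribution can be illustrated with a minimal sketch for the Poisson case. It assumes the common parameterization P(Y = y) = (1 - p) f(y) + p 1{y = 0}, where f is the Poisson mass function and p may be negative (zero deflation) down to -f(0) / (1 - f(0)). The abstract does not state the parameterization the authors use, so the function names and the form below are illustrative assumptions, not the paper's model.

```python
import math


def poisson_pmf(y, lam):
    """Ordinary Poisson probability mass function."""
    return math.exp(-lam) * lam ** y / math.factorial(y)


def zero_mod_lower_bound(lam):
    """Smallest admissible p; at this value the mass at zero vanishes."""
    f0 = math.exp(-lam)
    return -f0 / (1.0 - f0)


def zero_modified_poisson_pmf(y, lam, p):
    """Zero-modified Poisson pmf (assumed parameterization, see lead-in).

    p > 0 inflates the mass at zero, p < 0 deflates it,
    and p = 0 recovers the ordinary Poisson distribution.
    """
    if not (zero_mod_lower_bound(lam) <= p <= 1.0):
        raise ValueError("p is outside the admissible range for this lam")
    base = (1.0 - p) * poisson_pmf(y, lam)
    return base + p if y == 0 else base
```

For example, `zero_modified_poisson_pmf(0, 2.0, 0.3)` is larger than the plain Poisson mass at zero, while `zero_modified_poisson_pmf(0, 2.0, -0.1)` is smaller; in both cases the probabilities still sum to one.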

Citations: 0

Mixtures of generalized normal distributions and EGARCH models to analyse returns and volatility of ESG and traditional investments

IF 1.4 Zone 4 (Mathematics) Q2 Social Sciences

Asta-Advances in Statistical Analysis

Pub Date : 2023-11-18 DOI: 10.1007/s10182-023-00487-7

Pierdomenico Duttilo, Stefano Antonio Gattone, Barbara Iannone

Environmental, social and governance (ESG) criteria are increasingly integrated into investment process to contribute to overcoming global sustainability challenges. Focusing on the reaction to turmoil periods, this work analyses returns and volatility of several ESG indices and makes a comparison with their traditional counterparts from 2016 to 2022. These indices comprise the following markets: Global, the US, Europe and emerging markets. Firstly, the two-component mixture of generalized normal distribution was exploited to objectively detect financial market turmoil periods with the Naïve Bayes’ classifier. Secondly, the EGARCH-in-mean model with exogenous dummy variables was applied to capture the turmoil period impact. Results show that returns and volatility are both affected by turmoil periods. The return–risk performance differs by index type and market: the European ESG index is less volatile than its traditional market benchmark, while in the other markets, the estimated volatility is approximately the same. Moreover, ESG and non-ESG indices differ in terms of turmoil periods impact, risk premium and leverage effect.
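
The leverage effect mentioned in the abstract can be seen in a minimal sketch of the log-variance recursion of a plain EGARCH(1,1) model. This is not the authors' specification: their model is an EGARCH-in-mean with exogenous dummy variables, whereas the sketch below omits the in-mean term and the dummies and assumes standard normal shocks, so that E|z| = sqrt(2/pi). All parameter names are illustrative.

```python
import math

# E|z| for a standard normal shock (an assumption of this sketch).
E_ABS_Z = math.sqrt(2.0 / math.pi)


def egarch_log_variance(shocks, omega, alpha, gamma, beta, log_var0):
    """Return log sigma_t^2 for t = 0..len(shocks), given standardized shocks.

    Recursion: log s2[t+1] = omega + beta * log s2[t]
                             + alpha * (|z[t]| - E|z|) + gamma * z[t]
    A negative gamma makes negative shocks raise volatility more than
    positive shocks of the same size (the leverage effect).
    """
    log_vars = [log_var0]
    for z in shocks:
        nxt = (omega + beta * log_vars[-1]
               + alpha * (abs(z) - E_ABS_Z) + gamma * z)
        log_vars.append(nxt)
    return log_vars
```

With `gamma < 0`, calling the function once with a shock of -1.5 and once with +1.5 from the same starting log-variance gives a higher next-period log-variance after the negative shock; the gap equals `-2 * gamma * 1.5`.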

Citations: 0

Mixture of experts distributional regression: implementation using robust estimation with adaptive first-order methods

IF 1.4 Zone 4 (Mathematics) Q2 STATISTICS & PROBABILITY

Asta-Advances in Statistical Analysis

Pub Date : 2023-11-15 DOI: 10.1007/s10182-023-00486-8

David Rügamer, Florian Pfisterer, Bernd Bischl, Bettina Grün

In this work, we propose an efficient implementation of mixtures of experts distributional regression models which exploits robust estimation by using stochastic first-order optimization techniques with adaptive learning rate schedulers. We take advantage of the flexibility and scalability of neural network software and implement the proposed framework in mixdistreg, an R software package that allows for the definition of mixtures of many different families, estimation in high-dimensional and large sample size settings and robust optimization based on TensorFlow. Numerical experiments with simulated and real-world data applications show that optimization is as reliable as estimation via classical approaches in many different settings and that results may be obtained for complicated scenarios where classical approaches consistently fail.

Citations: 0

A Bayesian approach to modeling topic-metadata relationships

IF 1.4 Zone 4 (Mathematics) Q2 STATISTICS & PROBABILITY

Asta-Advances in Statistical Analysis

Pub Date : 2023-11-03 DOI: 10.1007/s10182-023-00485-9

Patrick Schulze, Simon Wiegrebe, Paul W. Thurner, Christian Heumann, Matthias Aßenmacher

The objective of advanced topic modeling is not only to explore latent topical structures, but also to estimate relationships between the discovered topics and theoretically relevant metadata. Methods used to estimate such relationships must take into account that the topical structure is not directly observed, but instead being estimated itself in an unsupervised fashion, usually by common topic models. A frequently used procedure to achieve this is the method of composition, a Monte Carlo sampling technique performing multiple repeated linear regressions of sampled topic proportions on metadata covariates. In this paper, we propose two modifications of this approach: First, we substantially refine the existing implementation of the method of composition from the R package stm by replacing linear regression with the more appropriate Beta regression. Second, we provide a fundamental enhancement of the entire estimation framework by substituting the current blending of frequentist and Bayesian methods with a fully Bayesian approach. This allows for a more appropriate quantification of uncertainty. We illustrate our improved methodology by investigating relationships between Twitter posts by German parliamentarians and different metadata covariates related to their electoral districts, using the structural topic model to estimate topic proportions.

Citations: 0

GPS data on tourists: a spatial analysis on road networks

IF 1.4 Zone 4 (Mathematics) Q2 STATISTICS & PROBABILITY

Asta-Advances in Statistical Analysis

Pub Date : 2023-11-03 DOI: 10.1007/s10182-023-00484-w

Nicoletta D’Angelo, Antonino Abbruzzo, Mauro Ferrante, Giada Adelfio, Marcello Chiodi

This paper proposes a spatial point process model on a linear network to analyse cruise passengers’ stop activities. It identifies and models tourists’ stop intensity at the destination as a function of their main determinants. For this purpose, we consider data collected on cruise passengers through the integration of traditional questionnaire-based survey methods and GPS tracking data in two cities, namely Palermo (Italy) and Dubrovnik (Croatia). Firstly, the density-based spatial clustering of applications with noise algorithm is applied to identify stop locations from GPS tracking data. The influence of individual-related variables and itinerary-related characteristics is considered within a framework of a Gibbs point process model. The proposed model describes spatial stop intensity at the destination, accounting for the geometry of the underlying road network, individual-related variables, contextual-level information, and the spatial interaction amongst stop points. The analysis succeeds in quantifying the influence of both individual-related variables and trip-related characteristics on stop intensity. An interaction parameter allows for measuring the degree of dependence amongst cruise passengers in stop location decisions.

Citations: 0

Conditional sum of squares estimation of k-factor GARMA models

IF 1.4 Zone 4 (Mathematics) Q2 STATISTICS & PROBABILITY

Asta-Advances in Statistical Analysis

Pub Date : 2023-10-31 DOI: 10.1007/s10182-023-00482-y

Paul M. Beaumont, Aaron D. Smallwood

We analyze issues related to estimation and inference for the constrained sum of squares estimator (CSS) of the k-factor Gegenbauer autoregressive moving average (GARMA) model. We present theoretical results for the estimator and show that the parameters that determine the cycle lengths are asymptotically independent, converging at rate T, the sample size, for finite cycles. The remaining parameters lack independence and converge at the standard rate. Analogous with existing literature, some challenges exist for testing the hypothesis of non-cyclical long memory, since the associated parameter lies on the boundary of the parameter space. We present simulation results to explore small sample properties of the estimator, which support most distributional results, while also highlighting areas that merit additional exploration. We demonstrate the applicability of the theory and estimator with an application to IBM trading volume.

Citations: 0

Measures of interrater agreement for quantitative data

Zone 4 (Mathematics) Q2 Social Sciences

Asta-Advances in Statistical Analysis

Pub Date : 2023-10-10 DOI: 10.1007/s10182-023-00483-x

Daniela Marella, Giuseppe Bove

In this paper, measures of interrater absolute agreement for quantitative measurements based on the standard deviation are proposed. Such indices allow (i) to overcome the limits affecting the intraclass correlation index; (ii) to measure the interrater agreement on single targets. Estimators of the proposed measures are introduced and their sampling properties are investigated for normal and non-normal data. Simulated data are employed to demonstrate the accuracy and practical utility of the new indices for assessing agreement. Finally, an application to assess the consistency of measurements performed by radiologists evaluating tumor size of lung cancer is presented.

Citations: 0

Calibrated imputation for multivariate categorical data

IF 1.4 Zone 4 (Mathematics) Q2 STATISTICS & PROBABILITY

Asta-Advances in Statistical Analysis

Pub Date : 2023-10-05 DOI: 10.1007/s10182-023-00481-z

Ton de Waal, Jacco Daalmans

Non-response is a major problem for anyone collecting and processing data. A commonly used technique to deal with missing data is imputation, where missing values are estimated and filled in into the dataset. Imputation can become challenging if the variable to be imputed has to comply with a known total. Even more challenging is the case where several variables in the same dataset need to be imputed and, in addition to known totals, logical restrictions between variables have to be satisfied. In our paper, we develop an approach for a broad class of imputation methods for multivariate categorical data such that previously published totals are preserved while logical restrictions on the data are satisfied. The developed approach can be used in combination with any imputation model that estimates imputation probabilities, i.e. the probability that imputation of a certain category for a variable in a certain unit leads to the correct value for this variable and unit.

Citations: 0
