
Statistical Modeling: latest articles

Negative binomial loglinear mixed models
Pub Date : 2003-10-01 DOI: 10.1191/1471082X03st058oa
J. Booth, G. Casella, H. Friedl, J. Hobert
The Poisson loglinear model is a common choice for explaining variability in counts. However, in many practical circumstances the restriction that the mean and variance are equal is not realistic. Overdispersion with respect to the Poisson distribution can be modeled explicitly by integrating with respect to a mixture distribution, and use of the conjugate gamma mixing distribution leads to a negative binomial loglinear model. This paper extends the negative binomial loglinear model to the case of dependent counts, where dependence among the counts is handled by including linear combinations of random effects in the linear predictor. If we assume that the vector of random effects is multivariate normal, then complex forms of dependence can be modelled by appropriate specification of the covariance structure. Although the likelihood function for the resulting model is not tractable, maximum likelihood estimates (and standard errors) can be found using the NLMIXED procedure in SAS or, in more complicated examples, using a Monte Carlo EM algorithm. An alternate approach is to leave the random effects completely unspecified and attempt to estimate them using nonparametric maximum likelihood. The methodologies are illustrated with several examples.
Citations: 121
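As a rough illustration of the gamma mixing step mentioned in the Booth et al. abstract, the derivation below shows how a Poisson count with a gamma-distributed mean becomes negative binomial. The notation (shape \alpha, mean \mu_i, fixed effects \beta, random effects u with covariance \Sigma, design vectors x_i and z_i) is assumed here for the sketch and is not taken from the paper.

```latex
% Poisson count with a conjugate gamma mixing distribution (shape \alpha, mean \mu_i):
%   Y_i \mid \lambda_i \sim \mathrm{Poisson}(\lambda_i), \qquad
%   \lambda_i \sim \mathrm{Gamma}(\text{shape } \alpha,\ \text{rate } \alpha/\mu_i).
\begin{align*}
P(Y_i = y)
  &= \int_0^\infty \frac{e^{-\lambda}\lambda^{y}}{y!}\,
     \frac{(\alpha/\mu_i)^{\alpha}}{\Gamma(\alpha)}\,
     \lambda^{\alpha-1} e^{-\alpha\lambda/\mu_i}\, d\lambda \\
  &= \frac{\Gamma(y+\alpha)}{\Gamma(\alpha)\, y!}
     \left(\frac{\alpha}{\alpha+\mu_i}\right)^{\alpha}
     \left(\frac{\mu_i}{\alpha+\mu_i}\right)^{y},
     \qquad y = 0, 1, 2, \dots \\
E(Y_i) &= \mu_i, \qquad
\operatorname{Var}(Y_i) = \mu_i + \frac{\mu_i^{2}}{\alpha} .
\end{align*}
% Loglinear mixed-model link, with multivariate normal random effects:
\[
\log \mu_i = x_i^{\top}\beta + z_i^{\top}u, \qquad u \sim N(0, \Sigma).
\]
```

The variance exceeds the mean whenever \alpha is finite, which is the overdispersion the abstract refers to; letting \alpha grow without bound recovers the Poisson case.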
Use of fractional polynomials for dose-response modelling and quantitative risk assessment in developmental toxicity studies
Pub Date : 2003-07-01 DOI: 10.1191/1471082X03st051oa
C. Faes, H. Geys, M. Aerts, G. Molenberghs
Developmental toxicity studies are designed to assess the potential adverse effects of an exposure on developing fetuses. Safe dose levels can be determined using dose-response modelling. To this end, it is important to investigate the effect of misspecifying the dose-response model on the safe dose. Since classical polynomial predictors are often of poor quality, there is a clear need for alternative specifications of the predictors, such as fractional polynomials. By means of simulations, we will show how fractional polynomial predictors may resolve possible model misspecifications and may thus yield more reliable estimates of the benchmark doses.
Citations: 17
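To make the idea of a fractional polynomial predictor in the Faes et al. abstract concrete, here is a minimal sketch of first-degree fractional polynomial selection in the usual Royston-Altman sense: the covariate enters as x^p for p in a small fixed set, with p = 0 read as log(x). It assumes a strictly positive dose and an ordinary least-squares fit to a continuous response, which is a simplification; the paper itself concerns developmental toxicity data. The names `fp_term` and `best_fp1` are hypothetical.

```python
import numpy as np

# Conventional candidate powers for fractional polynomials; 0 stands for log(x).
POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)


def fp_term(x, p):
    """Return the fractional polynomial transform of x (x must be > 0)."""
    return np.log(x) if p == 0 else x ** p


def best_fp1(x, y):
    """Pick the first-degree power with the smallest residual sum of squares.

    Returns (power, rss, coefficients) for the model y = b0 + b1 * fp_term(x, power).
    """
    best = None
    for p in POWERS:
        design = np.column_stack([np.ones_like(x), fp_term(x, p)])
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        rss = float(np.sum((y - design @ beta) ** 2))
        if best is None or rss < best[1]:
            best = (p, rss, beta)
    return best
```

Calling `best_fp1(dose, response)` on positive doses returns the selected power together with its fit; a second-degree version would search over pairs of powers in the same way.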
Extensions of the Bartlett-Lewis model for rainfall processes
Pub Date : 2003-07-01 DOI: 10.1191/1471082X02st049oa
A. Salim, Y. Pawitan
While the Bartlett-Lewis model has been widely used for modelling rainfall processes at a fixed point in space over time, there are observed features, such as longer-scale dependence, which are not well fitted by the model. In this paper, we study an extension where we put an extra layer in the clustered Poisson process of storm origins. We also investigate the Pareto inter-arrival time for the storm origins, which has been used to model web-traffic data. We derive the theoretical first and second-order properties of the multi-layer clustered Poisson processes, but generally we have to rely on Monte Carlo techniques. The models are fitted to hourly rainfall data from Valentia observatory in southwest Ireland, where the extensions are shown to improve on the standard models. We generalize these models further by allowing some parameters of the models to be a function of some covariates. An application using data from Valentia observatory and Belmullet shows how to use this class of models to analyze the association between the rainfall pattern and the North Atlantic Oscillation (NAO) index.
Citations: 9
Likelihood-based analysis of longitudinal count data using a generalized Poisson model
Pub Date : 2003-07-01 DOI: 10.1191/1471082X03st050oa
P. Toscas, M. Faddy
Models based on a generalization of the simple Poisson process are discussed and illustrated with an analysis of some longitudinal count data on frequencies of epileptic fits. The models enable a broad class of discrete distributions to be constructed, which cover a variety of dispersion properties that can be characterized in an intuitive and appealing way by a simple parameterization. This class includes the Poisson and negative binomial distributions as well as other distributions with greater dispersion than Poisson, and also distributions underdispersed relative to the Poisson distribution. Comparing a number of analyses of the data shows that some covariates have a more significant effect using this modelling than from using mixed Poisson models. It is argued that this could be due to the mixed Poisson models used in the other analyses not providing an appropriate description of the residual variation, with the greater flexibility of the generalized Poisson modelling generally enabling more critical assessment of covariate effects than more standard mixed Poisson modelling.
Citations: 21
Multilevel ordinal models for examination grades
Pub Date : 2003-07-01 DOI: 10.1191/1471082X03st052oa
A. Fielding, Min Yang, H. Goldstein
In multilevel situations graded category responses are often converted to points scores and linear models for continuous normal responses fitted. This is particularly prevalent in educational research. Generalized multilevel ordinal models for response categories are developed and contrasted in some respects with these normal models. Attention is given to the analysis of a large database of the General Certificate of Education Advanced Level examinations in England and Wales. Ordinal models appear to have advantages in facilitating the study of institutional differences in more detail. Of particular importance is the flexibility offered by logit models with nonproportionally changing odds. Examples are given of the richer contrasts of institutional and subgroup differences that may be evaluated. Appropriate widely available software for this approach is also discussed.
Citations: 60
Modelling ranks using the inverse hypergeometric distribution
Pub Date : 2003-04-01 DOI: 10.1191/1471082X03st047oa
Angela D'Elia
A statistical model for ranks is presented, and some results on its parameter are discussed. In particular, maximum likelihood inference is developed, with and without covariates; thus, a statistical model for rank data is introduced in order to link the expressed ranks to the main features of the raters. Some empirical evidence from a marketing survey confirms the usefulness of the proposal in the study of the preferences.
Citations: 21
A simulation-based method for model evaluation
Pub Date : 2003-04-01 DOI: 10.1191/1471082X03st044oa
D. Allcroft, C. Glasbey
We wish to evaluate and compare models that are non-nested and fit to data using different fitting criteria. We first estimate parameters in all models by optimizing goodness-of-fit to a dataset. Then, to assess a candidate model, we simulate a population of datasets from it and evaluate the goodness-of-fit of all the models, without re-estimating parameter values. Finally, we see whether the vector of goodness-of-fit criteria for the original data is compatible with the multivariate distribution of these criteria for the simulated datasets. By simulating from each model in turn, we determine whether any, or several, models are consistent with the data. We apply the method to compare three models, fit at different temporal resolutions to binary time series of animal behaviour data, concluding that a semi-Markov model gives a better fit than latent Gaussian and hidden Markov models.
Citations: 18
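The procedure outlined in the Allcroft and Glasbey abstract (fit all models once, simulate from each candidate, score every simulated dataset under all models without refitting, then compare the observed score vector with the simulated ones) can be sketched on a toy problem. Everything specific below is an assumption made for illustration: the two candidate models (Poisson and geometric counts), the closed-form moment fits used in place of criterion-specific optimisation, the squared-frequency-gap criterion, and the Mahalanobis-distance check of compatibility. The paper's own models and criteria differ.

```python
import math
import numpy as np

K = 10  # compare empirical and model frequencies on counts 0..K


def poisson_pmf(lam):
    k = np.arange(K + 1)
    log_fact = np.array([math.lgamma(i + 1) for i in k])
    return np.exp(-lam + k * np.log(lam) - log_fact)


def geometric_pmf(p):
    # Geometric distribution on {0, 1, 2, ...} with success probability p.
    return p * (1 - p) ** np.arange(K + 1)


def gof(y, pmf):
    """Sum of squared gaps between empirical and model frequencies (smaller is better)."""
    freq = np.bincount(np.minimum(y, K + 1), minlength=K + 2)[: K + 1] / len(y)
    return float(np.sum((freq - pmf) ** 2))


def gof_vector(y, params):
    """Goodness-of-fit of both models at fixed (not re-estimated) parameters."""
    return np.array([gof(y, poisson_pmf(params["lam"])),
                     gof(y, geometric_pmf(params["p"]))])


def evaluate(y, n_sim=500, seed=0):
    """Return, per candidate model, an empirical p-value for the observed criteria."""
    rng = np.random.default_rng(seed)
    # Step 1: fit every model once to the observed data (assumes y.mean() > 0).
    params = {"lam": y.mean(), "p": 1.0 / (1.0 + y.mean())}
    observed = gof_vector(y, params)
    simulators = {
        "poisson": lambda: rng.poisson(params["lam"], size=len(y)),
        "geometric": lambda: rng.geometric(params["p"], size=len(y)) - 1,
    }
    p_values = {}
    for name, simulate in simulators.items():
        # Step 2: simulate datasets from the candidate and score them under all models.
        sims = np.array([gof_vector(simulate(), params) for _ in range(n_sim)])
        centre = sims.mean(axis=0)
        cov_inv = np.linalg.pinv(np.cov(sims, rowvar=False))

        def dist(v):
            d = v - centre
            return float(d @ cov_inv @ d)

        # Step 3: is the observed criteria vector typical of the simulated ones?
        sim_d = np.array([dist(s) for s in sims])
        p_values[name] = float(np.mean(sim_d >= dist(observed)))
    return p_values
```

A small p-value for a candidate suggests the observed goodness-of-fit vector is unlike what that model would produce, so the candidate is not consistent with the data; several candidates may survive, or none.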
Reduced-rank vector generalized linear models
Pub Date : 2003-04-01 DOI: 10.1191/1471082X03st045oa
T. Yee, T. Hastie
Reduced-rank regression is a method with great potential for dimension reduction but has found few applications in applied statistics. To address this, reduced-rank regression is proposed for the class of vector generalized linear models (VGLMs), which is very large. The resulting class, which we call reduced-rank VGLMs (RR-VGLMs), enables the benefits of reduced-rank regression to be conveyed to a wide range of data types, including categorical data. RR-VGLMs are illustrated by focussing on models for categorical data, and especially the multinomial logit model. General algorithmic details are provided and software written by the first author is described. The reduced-rank multinomial logit model is illustrated with real data in two contexts: a regression analysis of workforce data and a classification problem.
Citations: 67
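For readers unfamiliar with the reduced-rank regression that the Yee and Hastie abstract builds on, the sketch below shows the classical Gaussian, identity-weighted case only: the ordinary least-squares coefficient matrix is projected onto the leading right singular vectors of the fitted values. This is not the RR-VGLM algorithm or the first author's software; the function name `reduced_rank_regression` is hypothetical.

```python
import numpy as np


def reduced_rank_regression(X, Y, rank):
    """Least-squares coefficient matrix for Y ~ X constrained to the given rank.

    X is (n, p), Y is (n, q); the result is (p, q) with rank at most `rank`.
    """
    # Unconstrained multivariate least-squares fit.
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    # Leading right singular vectors of the fitted values span the response
    # directions that the predictors explain best.
    _, _, Vt = np.linalg.svd(X @ B_ols, full_matrices=False)
    V_r = Vt[:rank].T
    # Project the OLS coefficients onto that subspace.
    return B_ols @ V_r @ V_r.T
```

With `rank` equal to the number of response columns the projection is the identity and the OLS solution is returned unchanged; smaller ranks trade fit for fewer effective parameters.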
Modelling data from inside the Earth: local smoothing of mean and dispersion structure in deep drill data
Pub Date : 2003-04-01 DOI: 10.1191/1471082X03st048oa
G. Kauermann, H. Küchenhoff
The paper describes the analysis of data originating from the German Deep Drill Program. The amount of ‘cataclastic rocks’ is modelled with data resulting from a series of measurements taken from deep drill samples ranging from 1000 up to 5000 m depth. The measurements thereby describe the amount of strongly deformed rock particles and serve as an indicator for the occurrence of cataclastic shear zones, which are areas of severely ‘ground’ stones due to movements of different layers in the earth crust. The data represent a ‘depth series’ as an analogue of a ‘time series’, with mean, dispersion and correlation structure varying in depth. The general smooth structure is thereby disturbed by peaks and outliers so that robust procedures have to be applied for estimation. In terms of statistical modelling technology, three different peculiarities of the data have to be tackled simultaneously, that is estimation of the correlation structure, local bandwidth selection and robust smoothing. To do so, existing routines are adapted and combined in new ‘two-stage’ estimation procedures.
Citations: 1
Flexible smoothing with P-splines: a unified approach
Pub Date : 2002-12-01 DOI: 10.1191/1471082x02st039ob
I. Currie, M. Durbán
We consider the application of P-splines (Eilers and Marx, 1996) to three classes of models with smooth components: semiparametric models, models with serially correlated errors, and models with heteroscedastic errors. We show that P-splines provide a common approach to these problems. We set out a simple nonparametric strategy for the choice of the P-spline parameters (the number of knots, the degree of the P-spline, and the order of the penalty) and use mixed model (REML) methods for smoothing parameter selection. We give an example of a model in each of the three classes and analyse appropriate data sets.
Citations: 155
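The three P-spline choices named in the Currie and Durbán abstract (number of knots, degree of the spline, order of the penalty) map directly onto a short penalised least-squares fit in the style of Eilers and Marx. The sketch below is a minimal version under stated assumptions: equally spaced knots, a B-spline basis built from differences of truncated power functions, and a smoothing parameter `lam` fixed by hand rather than chosen by the mixed model (REML) methods the abstract describes. The function names `bbase` and `pspline_fit` and the default values are assumptions.

```python
import math
import numpy as np


def bbase(x, xl, xr, nseg, deg):
    """B-spline basis on [xl, xr] with nseg equal segments and degree deg.

    Built from differences of truncated power functions; returns an
    array of shape (len(x), nseg + deg).
    """
    dx = (xr - xl) / nseg
    knots = xl + dx * np.arange(-deg, nseg + deg + 1)
    # Truncated powers (x - t)_+^deg for every observation/knot pair.
    P = (x[:, None] - knots[None, :]) ** deg * (x[:, None] > knots[None, :])
    # (deg + 1)-th order differences turn truncated powers into B-splines.
    D = np.diff(np.eye(len(knots)), n=deg + 1, axis=0)
    D = D / (math.factorial(deg) * dx ** deg)
    return (-1) ** (deg + 1) * P @ D.T


def pspline_fit(x, y, nseg=20, deg=3, pord=2, lam=1.0):
    """Penalised least squares: minimise |y - B a|^2 + lam * |D a|^2.

    D is the difference matrix of order pord on the coefficients a.
    Returns (fitted values, coefficients).
    """
    B = bbase(x, x.min(), x.max(), nseg, deg)
    D = np.diff(np.eye(B.shape[1]), n=pord, axis=0)
    coef = np.linalg.solve(B.T @ B + lam * D.T @ D, B.T @ y)
    return B @ coef, coef
```

Larger values of `lam` pull the fit towards a polynomial of degree `pord - 1`, while `lam` near zero approaches an unpenalised B-spline regression.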