Interpolation of nonstationary air pollution processes: a spatial spectral approach
M. Fuentes
Statistical Modelling | Pub Date: 2002-12-01 | DOI: 10.1191/1471082x02st034oa

Spatial processes are important models for many environmental problems. Classical geostatistics and Fourier spectral methods are powerful tools for studying the spatial structure of stationary processes. However, it is widely recognized that in real applications spatial processes are rarely stationary and isotropic. Consequently, it is important to extend these spectral methods to processes that are nonstationary. In this work, we present some new spectral approaches and tools to estimate the spatial structure of a nonstationary process. More specifically, we propose an approach for the spectral analysis of nonstationary spatial processes that is based on the concept of spatial spectra, i.e., spectral functions that are space-dependent. This notion of spatial spectra generalizes the definition of spectra for stationary processes, and under certain conditions, the spatial spectrum at each location can be estimated from a single realization of the spatial process. The motivation for this work is the modeling and prediction of ozone concentrations over different geopolitical boundaries for assessment of compliance with ambient air quality standards.
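The idea that a space-dependent spectrum can be read off a single realization can be sketched with a local tapered periodogram: estimate the spectrum separately inside a window around each location. This is a generic illustration of the concept, not the paper's estimator; the Hann taper and the window size are arbitrary choices.

```python
import numpy as np

def local_spectrum(field, center, half_width):
    """Estimate a space-dependent ('spatial') spectrum at one location by
    computing a tapered periodogram over a window centred there."""
    r, c = center
    w = half_width
    patch = field[r - w:r + w, c - w:c + w]
    patch = patch - patch.mean()                       # remove the local mean
    taper = np.hanning(2 * w)[:, None] * np.hanning(2 * w)[None, :]
    f_hat = np.fft.fftshift(np.fft.fft2(patch * taper))
    return np.abs(f_hat) ** 2 / patch.size             # local periodogram

rng = np.random.default_rng(0)
z = rng.standard_normal((128, 128))                    # one realization
spec_a = local_spectrum(z, (40, 40), 16)
spec_b = local_spectrum(z, (90, 90), 16)               # differs when the process is nonstationary
```

For a stationary field the two local spectra agree up to sampling noise; systematic differences between them are exactly the nonstationarity this approach is designed to capture.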
Calibrated spatial moving average simulations
N. Cressie, M. Pavlicova
Statistical Modelling | Pub Date: 2002-12-01 | DOI: 10.1191/1471082x02st035oa

The spatial moving average (SMA) is a very natural type of spatial process that involves integrals or sums of independent and identically distributed random variables. Consequently, the mean and covariance function of the SMAs can be written down immediately in terms of their integrand or summand. Moreover, simulation from them is straightforward, and it does not require any large-matrix inversions. Although the SMAs generate a large class of spatial covariance functions, can we find easy-to-use SMAs, calibrated to be ‘like’ some of the usual covariance functions used in geostatistics? For example, is there an SMA that is straightforward to simulate from, whose covariance function is like the spherical covariance function? This article will derive such an SMA.
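The construction is easy to sketch in one dimension: convolve iid noise with a smoothing kernel, and read the covariance straight off the kernel's autocorrelation, with no matrix factorization or inversion. The Gaussian-shaped kernel below is an arbitrary stand-in, not the paper's spherical-calibrated kernel.

```python
import numpy as np

def sma_simulate(kernel, n, rng):
    """Simulate a 1D spatial moving average: convolve iid N(0,1) noise with
    a smoothing kernel. The covariance of the result is the autocorrelation
    of the kernel, so it is known in closed form before simulating."""
    m = len(kernel)
    white = rng.standard_normal(n + m - 1)
    return np.convolve(white, kernel, mode="valid")    # exactly n values

rng = np.random.default_rng(1)
kernel = np.exp(-0.5 * np.linspace(-3, 3, 31) ** 2)    # illustrative smoothing kernel
kernel /= np.sqrt(np.sum(kernel ** 2))                 # unit marginal variance
z = sma_simulate(kernel, 500, rng)

# implied covariance at lag h is sum_j k(j) k(j + h):
lag5_cov = np.sum(kernel[:-5] * kernel[5:])
```

Calibration, in this setting, amounts to choosing the kernel so that this implied autocorrelation matches a target geostatistical covariance as closely as possible.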
The Spatial Moving Average Workshop
D. Higdon
Statistical Modelling | Pub Date: 2002-12-01 | DOI: 10.1191/1471082x02st042ed

Because of the increasing use of spatial moving average models in a range of applications from biodiversity to oceanography, Jay Ver Hoef and I organized a workshop that focused on such models to bring together researchers in this area. Spatial moving average models are formed by convolving a simple, underlying process with a smoothing kernel. This simple construction device can lead to some very interesting spatial processes and appealing computational approaches for estimation. Clearly, the use of such moving average constructions to create spatial processes is hardly new – the idea of smoothing out a spatial Poisson process is mentioned in Matérn (1960). However, recent advances in computing and a renewed focus on challenging applications have brought new life to the spatial moving average. This workshop was a chance to see the state of the art in such models. The workshop was hosted and supported by the National Research Center for Statistics and the Environment and took place May 20–22, 2001 at the University of Washington in Seattle. The format consisted of ten hour-long talks, each followed by a half hour of lively floor discussion. The workshop participants included: Ron Barry, Julian Besag, Nicky Best, Noel Cressie, Montserrat Fuentes, Peter Guttorp, Mark Handcock, Dave Higdon, Katja Ickstadt, Konstantin Krivoruchko, Doug Nychka, Paul Sampson, Michael Stein, Jean Thiebeaux, Jay Ver Hoef, Chris Wikle and Robert Wolpert. A number of innovative developments – both theoretical and applied – were presented and discussed at the workshop. Some of these developments are contained in the following four papers. Enjoy!
Multiresolution models for nonstationary spatial covariance functions
D. Nychka, C. Wikle, J. Andrew Royle
Statistical Modelling | Pub Date: 2002-12-01 | DOI: 10.1191/1471082x02st037oa

Many geophysical and environmental problems depend on estimating a spatial process that has nonstationary structure. A nonstationary model is proposed based on the spatial field being a linear combination of multiresolution (wavelet) basis functions and random coefficients. The key is to allow for a limited number of correlations among coefficients and also to use a wavelet basis that is smooth. When approximately 6% nonzero correlations are enforced, this representation gives a good approximation to a family of Matérn covariance functions. This sparseness is important not only for model parsimony but also has implications for the efficient analysis of large spatial data sets. The covariance model is successfully applied to ozone model output and results in a nonstationary but smooth estimate.
A kernel-based spectral model for non-Gaussian spatio-temporal processes
C. Wikle
Statistical Modelling | Pub Date: 2002-12-01 | DOI: 10.1191/1471082x02st036oa

Spatio-temporal processes can often be written as hierarchical state-space processes. In situations with complicated dynamics such as wave propagation, it is difficult to parameterize state transition functions for high-dimensional state processes. Although in some cases prior understanding of the physical process can be used to formulate models for the state transition, this is not always possible. Alternatively, for processes where one considers discrete time and continuous space, complicated dynamics can be modeled by stochastic integro-difference equations in which the associated redistribution kernel is allowed to vary with space and/or time. By considering a spectral implementation of such models, one can formulate a spatio-temporal model with relatively few parameters that can accommodate complicated dynamics. This approach can be developed in a hierarchical framework for non-Gaussian processes, as demonstrated on cloud intensity data.
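The spectral implementation can be sketched on a periodic 1D grid: one step of the integro-difference equation is a convolution with the redistribution kernel, which becomes pointwise multiplication in Fourier space. The shifted-Gaussian kernel below is an invented example whose offset produces propagation; the noise term is switched off so the dynamics are visible.

```python
import numpy as np

def ide_step(u, kernel_fft, noise):
    """One step of a stochastic integro-difference equation on a periodic grid:
    u_{t+1}(s) = integral of k(s - r) u_t(r) dr + noise, evaluated spectrally
    (convolution in space = multiplication in Fourier space)."""
    return np.real(np.fft.ifft(kernel_fft * np.fft.fft(u))) + noise

n = 256
x = np.arange(n)
# redistribution kernel: a Gaussian offset by 3 grid cells -> rightward drift
k = np.roll(np.exp(-0.5 * ((x - n // 2) / 4.0) ** 2), n // 2 + 3)
k /= k.sum()                                   # mass-preserving redistribution
k_fft = np.fft.fft(k)

u = np.exp(-0.5 * ((x - 60) / 6.0) ** 2)       # initial pulse at grid cell 60
for _ in range(20):
    u = ide_step(u, k_fft, 0.0)                # pulse drifts 3 cells per step
```

The parametric economy the abstract mentions shows up here: the whole dynamic operator is encoded by the handful of parameters describing the kernel, rather than by a full n-by-n transition matrix.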
Generalized estimating equations: A hybrid approach for mean parameters in multivariate regression models
C. Lange, J. Whittaker, A. Macgregor
Statistical Modelling | Pub Date: 2002-10-01 | DOI: 10.1191/1471082x02st031oa

We propose an extension of the generalized estimating equation approach to multivariate regression models (Liang and Zeger, 1986) which allows the estimation of dispersion and association parameters in the covariance matrix partly using estimating equations as in Prentice and Zhao (1991), and partly by the direct use of consistent estimators. The advantages of this hybrid approach over that of Prentice and Zhao (1991) are a reduction in the number of fourth moment assumptions that must be made, and the consequent reduction in numerical complexity. We show that the type of estimation used for covariance parameters does not affect the asymptotic efficiency of the mean parameter estimates. The advantages of the hybrid model are illustrated by a simulation study. This work was motivated by problems in statistical genetics, and we illustrate our approach using a twin study examining association between the osteocalcin receptor and various osteoporosis-related traits.
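The flavour of the hybrid idea can be sketched as follows: solve estimating equations for the mean parameters while plugging in a direct moment (consistent) estimator for the association parameter, rather than giving the association its own estimating equation. This is a Gaussian identity-link sketch with an exchangeable working correlation, invented for illustration and not the authors' exact estimator.

```python
import numpy as np

def gee_exchangeable(y, X, n_iter=20):
    """GEE for clustered Gaussian responses with identity link.
    y : (m, t) responses, m clusters of size t; X : (m, t, p) covariates.
    beta solves the usual mean estimating equations; the exchangeable
    correlation alpha is plugged in via a moment estimator."""
    m, t = y.shape
    p = X.shape[2]
    beta = np.zeros(p)
    for _ in range(n_iter):
        resid = y - np.einsum("mtp,p->mt", X, beta)
        sigma2 = np.mean(resid ** 2)
        # direct moment estimator of the within-cluster correlation
        cross = sum(resid[i, a] * resid[i, b]
                    for i in range(m) for a in range(t) for b in range(t) if a != b)
        alpha = cross / (m * t * (t - 1) * sigma2)
        R = np.full((t, t), alpha) + (1 - alpha) * np.eye(t)
        Vinv = np.linalg.inv(sigma2 * R)
        A = sum(X[i].T @ Vinv @ X[i] for i in range(m))
        b = sum(X[i].T @ Vinv @ y[i] for i in range(m))
        beta = np.linalg.solve(A, b)                  # GLS update for the mean
    return beta, alpha

rng = np.random.default_rng(3)
m, t = 300, 4
X = np.dstack([np.ones((m, t)), rng.standard_normal((m, t))])
u = rng.standard_normal((m, 1))                       # shared cluster effect
y = 1.0 + 2.0 * X[:, :, 1] + u + 0.5 * rng.standard_normal((m, t))
beta, alpha = gee_exchangeable(y, X)                  # beta near (1, 2)
```

The point the paper proves in generality is visible here in miniature: how alpha is estimated does not enter the asymptotic efficiency of beta, which depends only on alpha's value being consistent.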
Specification issues in stratified variance component ordinal response models
L. Grilli, C. Rampichini
Statistical Modelling | Pub Date: 2002-10-01 | DOI: 10.1191/1471082x02st041oa

The paper presents some criteria for the specification of ordinal variance component models when the units are grouped in a limited number of strata. The base model is specified using a latent variable approach, allowing the first level variance, the second level variance, and the thresholds to vary according to the strata. However, this model is not identifiable. The paper discusses some alternative assumptions that overcome the identification problem and illustrates a general strategy for model selection. The proposed methodology is applied to the analysis of course programme evaluations based on student ratings, referring to three different schools of the University of Florence. The adopted model takes into account both the ordinal scale of the ratings and the hierarchical nature of the phenomenon. In this framework, the specification of the latent variable distributions is crucial, since a different first level variance among the schools would substantially change the interpretation of model parameters, as confirmed by the limited simulation study presented in the paper.
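The identification problem is easy to demonstrate by simulation from the latent variable formulation: multiplying the thresholds and both standard deviations by the same constant leaves the observed ordinal data exactly unchanged, so the latent scale cannot be recovered without further constraints. This is a generic one-stratum sketch; the stratum-specific parameters of the paper are omitted.

```python
import numpy as np

def simulate_ordinal(n_groups, group_size, thresholds, sigma_u, sigma_e, rng):
    """Ordinal responses from a variance component model: latent
    y* = u_group + e, cut into categories at the given thresholds."""
    u = rng.normal(0.0, sigma_u, size=n_groups)               # second level
    e = rng.normal(0.0, sigma_e, size=(n_groups, group_size)) # first level
    latent = u[:, None] + e
    return np.digitize(latent, thresholds)                    # categories 0..K

cuts = np.array([-1.0, 0.0, 1.0])
rng1 = np.random.default_rng(4)
y1 = simulate_ordinal(50, 20, cuts, 0.7, 1.0, rng1)

# rescale the whole latent scale by c: same observed data, so the
# scale is not identified from the ordinal responses alone
c = 2.0
rng2 = np.random.default_rng(4)
y2 = simulate_ordinal(50, 20, c * cuts, c * 0.7, c * 1.0, rng2)
```

This is why a stratum-varying first level variance needs identifying restrictions: only ratios of the thresholds and variance components to the first level scale reach the data.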
Semiparametric Bayesian models for human brain mapping
L. Fahrmeir, C. Gössl
Statistical Modelling | Pub Date: 2002-10-01 | DOI: 10.1191/1471082x02st040oa

Functional magnetic resonance imaging (fMRI) has led to enormous progress in human brain mapping. Adequate analysis of the massive spatiotemporal data sets generated by this imaging technique, combining parametric and non-parametric components, imposes challenging problems in statistical modelling. Complex hierarchical Bayesian models in combination with computer-intensive Markov chain Monte Carlo inference are promising tools. The purpose of this paper is twofold. First, it provides a review of general semiparametric Bayesian models for the analysis of fMRI data. Most approaches focus on important but separate temporal or spatial aspects of the overall problem, or they proceed by stepwise procedures. Therefore, as a second aim, we suggest a complete spatiotemporal model for analysing fMRI data within a unified semiparametric Bayesian framework. An application to data from a visual stimulation experiment illustrates our approach and demonstrates its computational feasibility.
Size distribution of geological faults: Model choice and parameter estimation
H. Borgos, H. Omre, C. Townsend
Statistical Modelling | Pub Date: 2002-10-01 | DOI: 10.1191/1471082x02st039oa

Geological faults are important in reservoir characterization, since they influence fluid flow in the reservoir. Both the number of faults, or the fault intensity, and the fault sizes are of importance. Fault sizes are often represented by maximum displacements, which can be interpreted from seismic data. Owing to limitations in seismic resolution only faults of relatively large size can be observed, and the observations are biased. In order to make inference about the overall fault population, a proper model must be chosen for the fault size distribution. A fractal (Pareto) distribution is commonly used in the geophysics literature, but the exponential distribution has also been suggested. In this work we compare the two models statistically. A Bayesian model is defined for the fault size distributions under the two competing models, where the prior distributions are given as the Pareto and the exponential pdfs, respectively, and the likelihood function describes the sampling errors associated with seismic fault observations. The Bayes factor is used as the criterion for the model choice, and is estimated using MCMC sampling. The MCMC algorithm is constructed using pseudopriors to sample jointly the two models. The statistical procedure is applied to a fault size data set from the Gullfaks Field in the North Sea. For this data set we find that the fault sizes are best described by the exponential distribution.
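The model-choice question can be sketched far more crudely than with a Bayes factor: fit both candidate tail models above a detection threshold by maximum likelihood and compare the maximized log-likelihoods. The threshold and data below are invented, and this ignores the observation errors that the paper's likelihood models explicitly.

```python
import numpy as np

def compare_tail_models(sizes, threshold):
    """MLE fits of a Pareto and a shifted exponential model to sizes
    observed above a detection threshold; returns the two maximized
    log-likelihoods (a crude stand-in for a Bayes factor comparison)."""
    x = sizes[sizes >= threshold]
    n = len(x)
    # Pareto on [threshold, inf): f(x) = a * threshold^a / x^(a+1)
    a_hat = n / np.sum(np.log(x / threshold))
    ll_pareto = (n * np.log(a_hat) + n * a_hat * np.log(threshold)
                 - (a_hat + 1) * np.sum(np.log(x)))
    # shifted exponential: f(x) = (1/b) * exp(-(x - threshold) / b)
    b_hat = np.mean(x - threshold)
    ll_exp = -n * np.log(b_hat) - n
    return ll_pareto, ll_exp

rng = np.random.default_rng(5)
data = 0.1 + rng.exponential(0.5, size=2000)   # synthetic, truly exponential sizes
ll_p, ll_e = compare_tail_models(data, 0.1)    # exponential fit should win here
```

The Bayes factor of the paper plays the same comparative role, but integrates over the parameter uncertainty and the seismic sampling errors instead of plugging in point estimates.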
Assessing uncertainty about parameter estimates with incomplete repeated ordinal data
Claudio J. Verzilli, J. Carpenter
Statistical Modelling | Pub Date: 2002-10-01 | DOI: 10.1191/1471082x02st033oa

Data collected in clinical trials involving follow-up of patients over a period of time will almost inevitably be incomplete. Patients will fail to turn up at some of the intended measurement times or will not complete the study, giving rise to various patterns of missingness. In these circumstances, the validity of the conclusions drawn from an analysis of available cases depends crucially on the mechanism driving the missing data process; this in turn cannot be known for certain. For incomplete categorical data, various authors have recently proposed taking into account in a systematic way the ignorance caused by incomplete data. In particular, the idea of intervals of ignorance has been introduced, whereby point estimates for parameters of interest are replaced by intervals or regions of ignorance (Vansteelandt and Goetghebeur, 2001; Kenward et al., 2001; Molenberghs et al., 2001). These are identified by the set of estimates corresponding to possible outcomes for the missing data under little or no assumptions about the missing data mechanism. Here we extend this idea to incomplete repeated ordinal data. We describe a modified version of standard algorithms used for fitting marginal models to longitudinal categorical data, which enables calculation of intervals of ignorance for the parameters of interest. The ideas are illustrated using dental pain measurements from a longitudinal clinical trial.
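In its simplest form the idea is easy to compute: for a single proportion with missing binary outcomes, let the missing values run over their possible outcomes and report the resulting range of estimates instead of a single point. The counts below are invented for illustration; the paper's contribution is extending this to marginal models for repeated ordinal data.

```python
def ignorance_interval(successes, failures, missing):
    """Interval of ignorance for a success probability with missing binary
    outcomes: the range of estimates obtained as the missing values run
    over their possible outcomes, with no assumption on the missingness
    mechanism."""
    n = successes + failures + missing
    low = successes / n                  # every missing outcome was a failure
    high = (successes + missing) / n     # every missing outcome was a success
    return low, high

lo, hi = ignorance_interval(successes=30, failures=50, missing=20)
# -> (0.3, 0.5): without assumptions, the data cannot pin the rate down further
```

Any point estimate inside this interval corresponds to some missingness mechanism, which is exactly why reporting the whole interval is more honest than reporting one point under one assumed mechanism.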