Pub Date: 2020-12-18, DOI: 10.1177/1471082X20963254
A. Bar-Hen, P. Barbillon, S. Donnet
Generalized multipartite networks consist of several jointly observed networks that involve some common, pre-specified groups of individuals. Such complex networks arise commonly in the social sciences, biology and ecology. We propose a flexible probabilistic model, the Multipartite Block Model (MBM), that unravels the topology of multipartite networks by identifying clusters (blocks) of nodes sharing the same patterns of connectivity across the collection of networks they are involved in. The model parameters are estimated through a variational version of the Expectation–Maximization algorithm. The numbers of blocks are chosen using an Integrated Completed Likelihood criterion specifically designed for our model. A simulation study illustrates the robustness of the inference strategy. Finally, two datasets, one from ecology and one from ethnobiology, are analysed with the MBM to illustrate its flexibility and its relevance for the analysis of real datasets. The inference procedure is implemented in the R package GREMLIN, available on GitHub (https://github.com/Demiperimetre/GREMLIN).
Block models for generalized multipartite networks: Applications in ecology and ethnobiology. Statistical Modelling, vol. 22, pp. 273–296. https://doi.org/10.1177/1471082X20963254
Pub Date: 2020-12-15, DOI: 10.1177/1471082X20967158
R. Bivand, V. Gómez‐Rubio
Zhou and Hanson (2015, Nonparametric Bayesian Inference in Biostatistics, pages 215–46. Cham: Springer; 2018, Journal of the American Statistical Association, 113, 571–81; 2020, spBayesSurv: Bayesian Modeling and Analysis of Spatially Correlated Survival Data. R package version 1.1.4) and Zhou et al. (2020, Journal of Statistical Software, Articles, 92, 1–33) present methods for estimating spatial survival models using areal data. This article applies their methods to a dataset recording New Orleans business decisions to re-open after Hurricane Katrina; the data were included in LeSage et al. (2011b, Journal of the Royal Statistical Society: Series A (Statistics in Society), 174, 1007–27). In two articles (LeSage et al., 2011a, Significance, 8, 160–63; 2011b, Journal of the Royal Statistical Society: Series A (Statistics in Society), 174, 1007–27), spatial probit models are used to model spatial dependence in this dataset, with decisions to re-open aggregated to the first 90, 180 and 360 days. We re-cast the problem as one of examining the time-to-event records in the data, right-censored as observations ceased before 175 businesses had re-opened; we omit businesses already re-opened when observations began on Day 41. We are interested in checking whether the conclusions about the covariates drawn from aspatial and spatial probit models change when survival and spatial survival models, estimated using MCMC and INLA, are applied instead. In general, we find that the same covariates are associated with re-opening decisions in both modelling approaches. We do, however, find that data collected from three streets differ substantially, and that the streets are probably better handled separately or that the street effect should be included explicitly.
Spatial survival modelling of business re-opening after Katrina: Survival modelling compared to spatial probit modelling of re-opening within 3, 6 or 12 months. Statistical Modelling, vol. 21, pp. 137–160. https://doi.org/10.1177/1471082X20967158
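To make the notion of right-censored time-to-event records concrete, here is a minimal sketch of the Kaplan–Meier estimator in plain Python. It is not the spatial survival modelling used in the article (which relies on MCMC and INLA); it only illustrates how censored records stay in the risk set until their censoring time. The function name and the toy durations in the usage note are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct observed event time.

    times  : observed durations (event time or censoring time)
    events : 1 if the event (e.g. re-opening) was observed, 0 if right-censored
    """
    records = sorted(zip(times, events))
    at_risk = len(records)
    survival = 1.0
    curve = []
    i = 0
    while i < len(records):
        t = records[i][0]
        observed = 0   # events observed exactly at time t
        leaving = 0    # all records (events and censored) leaving the risk set at t
        while i < len(records) and records[i][0] == t:
            observed += records[i][1]
            leaving += 1
            i += 1
        if observed > 0:
            survival *= 1.0 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve
```

For example, `kaplan_meier([60, 90, 120, 200], [1, 1, 0, 1])` treats the third record as censored at day 120: it contributes to the risk set at days 60 and 90 but never triggers a drop in the curve.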
Pub Date: 2020-12-13, DOI: 10.1177/1471082X20967168
Zahra Mahdiyeh, I. Kazemi, G. Verbeke
This article introduces a flexible modelling strategy that extends the familiar mixed-effects models for analysing longitudinal responses to the multivariate setting. By introducing a flexible multivariate multimodal distribution, this strategy relaxes the normality assumption usually imposed on the random effects. We use copulas to construct a multimodal form of elliptical distributions, which can deal with the multimodality of responses and the non-linearity of the dependence structure. Moreover, the proposed model can flexibly accommodate clustered subject effects for multiple longitudinal measurements. This is particularly useful when several subpopulations exist but cannot be directly identified. Since the implied marginal distribution is not available in closed form, we approximate the associated likelihood functions with a computational methodology based on Gauss–Hermite quadrature, which in turn enables us to apply standard optimization techniques. We conduct a simulation study to highlight the main properties of the theoretical part and to compare the approach with regular mixture distributions. Results confirm that the new strategy deserves attention in practice. We illustrate the usefulness of our model by analysing a real-life dataset taken from a low back pain study.
A copula-based approach to joint modelling of multiple longitudinal responses with multimodal structures. Statistical Modelling, vol. 22, pp. 327–348. https://doi.org/10.1177/1471082X20967168
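The abstract above relies on Gauss–Hermite quadrature to approximate a likelihood that has no closed form. The sketch below shows the basic mechanism with the three-point rule, under the simplifying assumption of a single normal random intercept in a logistic model; the copula-based multimodal distribution of the article is not reproduced here, and all function names are illustrative.

```python
import math

# Three-point Gauss-Hermite rule for integrals of exp(-x^2) * f(x):
# nodes 0 and +/- sqrt(3/2), weights 2*sqrt(pi)/3 and sqrt(pi)/6.
_ROOT = math.sqrt(1.5)
GH_NODES = (-_ROOT, 0.0, _ROOT)
GH_WEIGHTS = (
    math.sqrt(math.pi) / 6.0,
    2.0 * math.sqrt(math.pi) / 3.0,
    math.sqrt(math.pi) / 6.0,
)


def normal_expectation(g, mu=0.0, sigma=1.0):
    """Approximate E[g(b)] for b ~ N(mu, sigma^2).

    The change of variable b = mu + sqrt(2) * sigma * x maps the normal
    density onto the Gauss-Hermite weight function exp(-x^2).
    """
    total = 0.0
    for x, w in zip(GH_NODES, GH_WEIGHTS):
        total += w * g(mu + math.sqrt(2.0) * sigma * x)
    return total / math.sqrt(math.pi)


def marginal_success_prob(eta, sigma):
    """Marginal P(y = 1) when logit P(y = 1 | b) = eta + b, b ~ N(0, sigma^2)."""
    return normal_expectation(lambda b: 1.0 / (1.0 + math.exp(-(eta + b))), 0.0, sigma)
```

A three-point rule integrates polynomials up to degree five exactly, so it recovers the second and fourth moments of a normal variable; in practice more nodes would usually be used for likelihood evaluation.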
Pub Date: 2020-12-13, DOI: 10.1177/1471082X20966919
Amani Almohaimeed, J. Einbeck
Random effect models have been a mainstream statistical technique for several decades, and the same can be said for response transformation models such as the Box–Cox transformation. The latter aims to ensure that the assumptions of normality and homoscedasticity of the response distribution are fulfilled, which are essential conditions for inference based on a linear model or a linear mixed model. However, methodology that combines response transformation with random effects has rarely been developed and implemented, and is so far restricted to Gaussian random effects. We develop such methodology without requiring parametric assumptions on the distribution of the random effects. This is achieved by extending the ‘Nonparametric Maximum Likelihood’ technique towards a ‘Nonparametric profile maximum likelihood’ technique, which makes it possible to deal with overdispersion as well as two-level data scenarios.
Response transformations for random effect and variance component models. Statistical Modelling, vol. 22, pp. 297–326. https://doi.org/10.1177/1471082X20966919
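As a reminder of what the Box–Cox transformation and a profile likelihood for its parameter look like, here is a minimal sketch for the simplest possible case: i.i.d. observations assumed normal on the transformed scale, with no random effects. This is not the nonparametric profile maximum likelihood procedure of the article; the function names and the grid search are illustrative assumptions.

```python
import math


def box_cox(y, lam):
    """Box-Cox transform of strictly positive values y for parameter lam."""
    if lam == 0:
        return [math.log(v) for v in y]
    return [(v ** lam - 1.0) / lam for v in y]


def profile_loglik(y, lam):
    """Profile log-likelihood of lam (additive constants dropped).

    Mean and variance are profiled out under an i.i.d. normal model for the
    transformed data; the last term is the log-Jacobian of the transformation.
    """
    z = box_cox(y, lam)
    n = len(z)
    mean = sum(z) / n
    var = sum((v - mean) ** 2 for v in z) / n
    log_jacobian = (lam - 1.0) * sum(math.log(v) for v in y)
    return -0.5 * n * math.log(var) + log_jacobian


def best_lambda(y, grid):
    """Pick the grid value of lam with the highest profile log-likelihood."""
    return max(grid, key=lambda lam: profile_loglik(y, lam))
```

With a random-effects model, the inner maximization over mean and variance would be replaced by a fit of the mixed model for each candidate value of the transformation parameter.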
Pub Date: 2020-12-01, DOI: 10.1177/1471082X19870331
F. Finazzi, L. Paci
Localizing people across space and over time is a relevant and challenging problem in many modern applications. The ubiquity of smartphones makes it possible to collect useful individual data as never before. In this work, the focus is on location data collected by smartphone applications. We propose a kernel-based density estimation approach that exploits people's cyclical spatio-temporal patterns to estimate the individual location density at any time, uncertainty included. Model parameters are estimated by maximum likelihood cross-validation. Unlike classic tracking methods designed for high spatio-temporal resolution data, the approach is suitable when location data are sparse in time and affected by non-negligible errors. The approach is applied to location data collected by the Earthquake Network citizen science project, which operates a worldwide earthquake early warning system based on smartphones. The approach is parsimonious and is suitable for modelling location data gathered by any location-aware smartphone application.
Kernel-based estimation of individual location densities from smartphone data. Statistical Modelling, vol. 20, pp. 617–633. https://doi.org/10.1177/1471082X19870331
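One way to picture a kernel density estimate that exploits cyclical patterns is to weight each past observation by how close its time of day is to the query time, and then smooth in space. The sketch below does this with a Gaussian spatial kernel and a von Mises-type temporal weight. The kernel choices, the 24-hour period, and the parameters `bandwidth` and `kappa` are all assumptions made for illustration; the abstract does not specify the kernels used in the article.

```python
import math


def temporal_weights(t, obs_hours, kappa=2.0, period=24.0):
    """Normalized weights favouring observations at a similar phase of the cycle."""
    raw = [
        math.exp(kappa * math.cos(2.0 * math.pi * (t - h) / period))
        for h in obs_hours
    ]
    total = sum(raw)
    return [r / total for r in raw]


def location_density(x, y, t, obs, bandwidth=1.0, kappa=2.0):
    """Estimated density at location (x, y) at hour t.

    obs is a list of (x_i, y_i, hour_i) tuples. Because the temporal weights
    sum to one and each spatial kernel integrates to one, the result is a
    proper density over space for every t.
    """
    weights = temporal_weights(t, [o[2] for o in obs], kappa)
    norm = 2.0 * math.pi * bandwidth ** 2
    density = 0.0
    for w, (xi, yi, _) in zip(weights, obs):
        d2 = (x - xi) ** 2 + (y - yi) ** 2
        density += w * math.exp(-0.5 * d2 / bandwidth ** 2) / norm
    return density
```

In this toy setting, `bandwidth` and `kappa` play the role of the model parameters that the article estimates by maximum likelihood cross-validation.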
Pub Date: 2020-10-28, DOI: 10.1177/1471082x211065785
S. Mews, R. Langrock, Marius Otting, Houda Yaqine, Jost Reinecke
Continuous-time state-space models (SSMs) are flexible tools for analysing irregularly sampled sequential observations that are driven by an underlying state process. Corresponding applications typically involve restrictive assumptions concerning linearity and Gaussianity to facilitate inference on the model parameters via the Kalman filter. In this contribution, we provide a general continuous-time SSM framework, allowing both the observation and the state process to be non-linear and non-Gaussian. Statistical inference is carried out by maximum approximate likelihood estimation, where multiple numerical integration within the likelihood evaluation is performed via a fine discretization of the state process. The corresponding reframing of the SSM as a continuous-time hidden Markov model, with structured state transitions, enables us to apply the associated efficient algorithms for parameter estimation and state decoding. We illustrate the modelling approach in a case study using data from a longitudinal study on delinquent behaviour of adolescents in Germany, revealing temporal persistence in the deviation of an individual's delinquency level from the population mean.
Maximum approximate likelihood estimation of general continuous-time state-space models. Statistical Modelling. https://doi.org/10.1177/1471082x211065785
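The idea of evaluating the likelihood through a fine discretization of the state process can be sketched as a forward recursion over grid cells. The example below assumes, purely for illustration, an Ornstein–Uhlenbeck state process with mean zero (parameters `theta`, `sigma`) and Poisson observations with rate `exp(state)`; these modelling choices, the grid range `bound` and the function names are not taken from the article.

```python
import math


def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def poisson_pmf(k, rate):
    return math.exp(k * math.log(rate) - rate - math.lgamma(k + 1))


def approx_loglik(times, counts, theta, sigma, m=50, bound=4.0):
    """Approximate log-likelihood using m grid cells on [-bound, bound]."""
    h = 2.0 * bound / m
    grid = [-bound + (j + 0.5) * h for j in range(m)]  # cell midpoints
    stat_var = sigma ** 2 / (2.0 * theta)              # stationary OU variance

    # Start from the stationary distribution of the state process.
    phi = [h * normal_pdf(b, 0.0, stat_var) for b in grid]
    loglik = 0.0
    for k in range(len(times)):
        if k > 0:
            # OU transition over the (possibly irregular) time gap dt.
            dt = times[k] - times[k - 1]
            decay = math.exp(-theta * dt)
            var = stat_var * (1.0 - decay ** 2)
            phi = [
                sum(phi[i] * h * normal_pdf(bj, decay * grid[i], var) for i in range(m))
                for bj in grid
            ]
        # Weight each cell by the observation density at its midpoint.
        phi = [p * poisson_pmf(counts[k], math.exp(b)) for p, b in zip(phi, grid)]
        # Rescale to avoid numerical underflow and accumulate the log-likelihood.
        scale = sum(phi)
        loglik += math.log(scale)
        phi = [p / scale for p in phi]
    return loglik
```

This is the scaled forward algorithm of a hidden Markov model whose states are the grid cells; the time gap enters only through the transition density, which is what allows irregular sampling. Maximizing `approx_loglik` over `theta` and `sigma` with a numerical optimizer would give approximate maximum likelihood estimates.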
Pub Date: 2020-10-23, DOI: 10.1177/1471082X21993603
Luca Merlo, A. Maruotti, L. Petrella
This article develops a two-part finite mixture quantile regression model for semi-continuous longitudinal data. The proposed methodology allows heterogeneity sources that influence the model for the binary response variable to also influence the distribution of the positive outcomes. As is common in the quantile regression literature, estimation and inference on the model parameters are based on the asymmetric Laplace distribution. Maximum likelihood estimates are obtained through the EM algorithm without parametric assumptions on the random effects distribution. In addition, a penalized version of the EM algorithm is presented to tackle the problem of variable selection. The proposed statistical method is applied to the well-known RAND Health Insurance Experiment dataset, which gives further insight into its empirical behaviour.
Two-part quantile regression models for semi-continuous longitudinal data: A finite mixture approach. Statistical Modelling, vol. 22, pp. 485–508. https://doi.org/10.1177/1471082X21993603
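The link between quantile regression and the asymmetric Laplace distribution (ALD) is that maximizing an ALD likelihood in the location parameter is the same as minimizing the quantile check loss. The sketch below shows this for a location-only model with no covariates, random effects or mixture components; the function names are illustrative and the search over observed values is a simplification.

```python
import math


def check_loss(u, tau):
    """Quantile check function: u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def ald_logpdf(y, mu, sigma, tau):
    """Log-density of the asymmetric Laplace distribution with location mu,
    scale sigma and skewness (quantile level) tau."""
    return math.log(tau * (1.0 - tau) / sigma) - check_loss((y - mu) / sigma, tau)


def ald_location_mle(data, tau, sigma=1.0):
    """Maximize the ALD log-likelihood in mu over the observed values.

    A minimizer of the summed check loss is always attained at a data point,
    so searching the sample is enough in this location-only setting.
    """
    return max(data, key=lambda mu: sum(ald_logpdf(y, mu, sigma, tau) for y in data))
```

For `tau = 0.5` the check loss is half the absolute error, so the estimate is a sample median.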
Pub Date: 2020-10-22, DOI: 10.1177/1471082X20949710
L. Grilli, Maria Francesca Marino, O. Paccagnella, C. Rampichini
The article is motivated by the analysis of the relationship between university student ratings and teacher practices and attitudes, which are measured via a set of binary and ordinal items collected by an innovative survey. The analysis is conducted through a two-level random intercept model, where student ratings are nested within teachers. The analysis must face two issues about the items measuring teacher practices and attitudes, which are level 2 predictors: (a) the items are severely affected by missingness due to teacher non-response and (b) there is redundancy in both the number of items and the number of categories of their measurement scale. We tackle the missing data issue by considering a multiple imputation strategy exploiting information at both student and teacher levels. For the redundancy issue, we rely on regularization techniques for ordinal predictors, also accounting for the multilevel data structure. The proposed solution addresses the problem at hand in an original way, and it can be applied whenever it is required to select level 2 predictors affected by missing values. The results obtained with the final model indicate that ratings on teacher ability to motivate students are related to certain teacher practices and attitudes.
Multiple imputation and selection of ordinal level 2 predictors in multilevel models: An analysis of the relationship between student ratings and teacher practices and attitudes. Statistical Modelling, vol. 22, pp. 221–238. https://doi.org/10.1177/1471082X20949710
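In a multiple imputation analysis, estimates from the imputed datasets are commonly combined with Rubin's rules. The abstract does not say how the article pools results after variable selection, so the sketch below is only the standard pooling rule for a single scalar parameter, with a hypothetical function name.

```python
def pool_rubin(estimates, variances):
    """Pool m point estimates and their squared standard errors.

    Returns (pooled estimate, total variance), where
    total = within + (1 + 1/m) * between.
    """
    m = len(estimates)
    pooled = sum(estimates) / m
    within = sum(variances) / m
    between = sum((q - pooled) ** 2 for q in estimates) / (m - 1)
    total = within + (1.0 + 1.0 / m) * between
    return pooled, total
```

The between-imputation term is what carries the extra uncertainty due to the missing level 2 items; ignoring it would understate the standard errors.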
Pub Date: 2020-09-28, DOI: 10.1177/1471082X20947222
D. Costantin, Andrea Sottosanti, A. Brazzale, D. Bastieri, J. Fan
Identifying as yet undetected high-energy sources in the γ-ray sky is one of the declared objectives of the Fermi Large Area Telescope (LAT) Collaboration. We develop a Bayesian mixture model which is capable of disentangling the high-energy extra-galactic sources present in a given sky region from the pervasive background radiation. We achieve this by combining two model components. The first component models the emission activity of the single sources and incorporates the instrument response function of the Fermi γ-ray space telescope. The second component reliably reflects the current knowledge of the physical phenomena which underlie the γ-ray background. The model parameters are estimated using a reversible jump MCMC algorithm, which simultaneously returns the number of detected sources, their locations and relative intensities, and the background component. Our proposal is illustrated using a sample of the Fermi LAT data. In the analysed sky region, our model correctly identifies 116 sources out of the 132 present. The detection rate and the estimated directions and intensities of the identified sources are largely unaffected by the number of detected sources.
Bayesian mixture modelling of the high-energy photon counts collected by the Fermi Large Area Telescope. Statistical Modelling, vol. 22, pp. 175–198. https://doi.org/10.1177/1471082X20947222
Pub Date: 2020-09-28, DOI: 10.1177/1471082X20945069
K. Mauff, N. Erler, I. Kardys, D. Rizopoulos
Multiple longitudinal outcomes are theoretically easily modelled via extension of the generalized linear mixed effects model. However, due to computational limitations in high dimensions, in practice these models are applied only in situations with relatively few outcomes. We adapt the solution proposed by Fieuws and Verbeke (2006) to the Bayesian setting: fitting all pairwise bivariate models instead of a single multivariate model, and combining the Markov Chain Monte Carlo (MCMC) realizations obtained for each pairwise bivariate model for the relevant parameters. We explore importance sampling as a method to more closely approximate the correct multivariate posterior distribution. Simulation studies show satisfactory results in terms of bias, RMSE and coverage of the 95% credible intervals for multiple longitudinal outcomes, even in scenarios with more limited information and non-continuous outcomes, although the use of importance sampling is not successful. We further examine the incorporation of a time-to-event outcome, proposing the use of Bayesian pairwise estimation of a multivariate GLMM in an adaptation of the corrected two-stage estimation procedure for the joint model for multiple longitudinal outcomes and a time-to-event outcome (Mauff et al., 2020, Statistics and Computing). The method does not work as well in the case of the corrected two-stage joint model; however, the results are promising and should be explored further.
Pairwise estimation of multivariate longitudinal outcomes in a Bayesian setting with extensions to the joint model. Statistical Modelling, vol. 21, pp. 115–136. https://doi.org/10.1177/1471082X20945069