The present review surveys basic models of ordered data and censoring mechanisms, with a focus on lifetime data, failure data, and reliability applications. Throughout, illustrations of the data generation process and of the censoring mechanisms are used to visualize these procedures. By way of example, we present basic results assuming a life testing model with independent and identically distributed measurements and focus on selected inferential results for exponentially distributed lifetimes. In particular, we aim to illustrate similarities between the models and to highlight some interesting exact statistical results. It is not intended to survey all possible model assumptions, probabilistic results, and inferential methods used in this framework. For this purpose, as well as for further reading, we provide an extensive bibliography.
{"title":"Ordered and censored lifetime data in reliability: An illustrative review","authors":"E. Cramer","doi":"10.1002/wics.1571","DOIUrl":"https://doi.org/10.1002/wics.1571","url":null,"PeriodicalId":47779,"journal":{"name":"Wiley Interdisciplinary Reviews-Computational Statistics","volume":"15 1","pages":""},"PeriodicalIF":3.2,"publicationDate":"2021-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42744277","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
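As an illustration of the kind of exact result available for exponentially distributed lifetimes, the sketch below assumes the classical Type-II censoring scheme: n units are put on test and the experiment stops at the r-th failure. Under that assumption the maximum likelihood estimate of the mean lifetime is the total time on test divided by r. The function names `type2_censored_mle` and `simulate_type2` are hypothetical and are not taken from the review.

```python
import random


def type2_censored_mle(failures, n):
    """MLE of the exponential mean from the first r ordered failure times of n units.

    Assumes Type-II censoring: the n - r unfailed units are all removed
    at the time of the r-th (largest observed) failure.
    """
    r = len(failures)
    if r == 0 or r > n:
        raise ValueError("need between 1 and n observed failures")
    ordered = sorted(failures)
    # Total time on test: observed lifetimes plus the running time of censored units.
    total_time_on_test = sum(ordered) + (n - r) * ordered[-1]
    return total_time_on_test / r


def simulate_type2(n, r, mean, rng):
    """Draw n exponential lifetimes with the given mean and keep the r smallest."""
    lifetimes = sorted(rng.expovariate(1.0 / mean) for _ in range(n))
    return lifetimes[:r]


if __name__ == "__main__":
    rng = random.Random(1)
    sample = simulate_type2(n=20, r=8, mean=10.0, rng=rng)
    print(type2_censored_mle(sample, n=20))
```

With a complete sample (r = n) the estimate reduces to the ordinary sample mean, which is one way to see how the censored and uncensored models relate.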
Data are often collected from multiple heterogeneous sources and subsequently combined. In combining data, record linkage is an essential task for linking records in datasets that refer to the same entity. Record linkage is generally not error‐free; there is a possibility that records belonging to different entities are linked or that records belonging to the same entity are missed. It is not advisable to simply ignore such errors because they can lead to data contamination and introduce bias in sample selection or estimation, which, in turn, can lead to misleading statistical results and conclusions. For a long while, this problem was not properly recognized, but in recent years a growing number of researchers have developed methodology for dealing with linkage errors in regression analysis with linked datasets. The main goal of this overview is to give an account of those developments, with an emphasis on recent approaches and their connection to the so‐called “Broken Sample” problem. We also provide a short empirical study that illustrates the efficacy of corrective methods in different scenarios.
{"title":"Regression with linked datasets subject to linkage error","authors":"Zhenbang Wang, E. Ben-David, G. Diao, M. Slawski","doi":"10.1002/wics.1570","DOIUrl":"https://doi.org/10.1002/wics.1570","url":null,"PeriodicalId":47779,"journal":{"name":"Wiley Interdisciplinary Reviews-Computational Statistics","volume":" ","pages":""},"PeriodicalIF":3.2,"publicationDate":"2021-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46320164","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stable distributions are a class of probability distributions that generalize the normal distribution. They are the only possible limits of normalized sums of independent, identically distributed terms, so sums of a large number of such terms have to approach a stable law. The non‐Gaussian stable distributions have heavy tails with infinite variance, and can be skewed. In most cases, there are no known formulas for the density or cumulative distribution function of these laws, so using them in practice requires significant computational methods. This paper explains some of the computations used to make stable laws useful in practical problems.
{"title":"Computational aspects of stable distributions","authors":"J. P. Nolan","doi":"10.1002/wics.1569","DOIUrl":"https://doi.org/10.1002/wics.1569","url":null,"PeriodicalId":47779,"journal":{"name":"Wiley Interdisciplinary Reviews-Computational Statistics","volume":" ","pages":""},"PeriodicalIF":3.2,"publicationDate":"2021-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1002/wics.1569","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43746919","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
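To make concrete why stable laws call for numerical work, the following sketch evaluates the density of a standardized symmetric stable law by inverting its characteristic function exp(-|t|^alpha), that is, f(x) = (1/pi) * integral from 0 to infinity of exp(-t^alpha) cos(x t) dt. This is only an illustration under simplifying assumptions (no skewness, unit scale, plain Simpson quadrature on a truncated range); it is not the algorithm described in the paper, and the name `symmetric_stable_pdf` is hypothetical. Accuracy degrades for small alpha and for large |x|, where the integrand decays slowly or oscillates rapidly.

```python
import math


def symmetric_stable_pdf(x, alpha, steps=20000, tail_eps=1e-12):
    """Density at x of the symmetric stable law with characteristic function exp(-|t|**alpha).

    Uses composite Simpson quadrature on [0, upper], where upper is chosen
    so that exp(-upper**alpha) equals tail_eps.
    """
    if not 0.0 < alpha <= 2.0:
        raise ValueError("alpha must lie in (0, 2]")
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of subintervals
    upper = (-math.log(tail_eps)) ** (1.0 / alpha)
    h = upper / steps

    def integrand(t):
        return math.exp(-(t ** alpha)) * math.cos(x * t)

    total = integrand(0.0) + integrand(upper)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return total * h / (3.0 * math.pi)


if __name__ == "__main__":
    # alpha = 1 is the Cauchy law; alpha = 2 is a normal law with variance 2.
    print(symmetric_stable_pdf(0.0, 1.0), 1.0 / math.pi)
    print(symmetric_stable_pdf(0.0, 2.0), 1.0 / (2.0 * math.sqrt(math.pi)))
```

The two special cases with closed-form densities (Cauchy for alpha = 1, Gaussian for alpha = 2) give a quick sanity check on the quadrature.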
Mike K. P. So, Amanda M. Y. Chu, Cliff C. Y. Lo, Chun Yin Ip
Since the introduction of ARCH models close to 40 years ago, a wide range of models for volatility estimation and prediction have been developed and integrated into asset allocation, financial derivative pricing, and financial risk management. Research has also been very active in extending volatility modeling to dependence modeling and in developing our understanding of risk and uncertainty in financial systems. This paper presents a review of the statistical modeling of volatility and dynamic dependence of financial returns. In addition, we present a real data example using a time‐varying copula model to estimate the dynamic dependence of stock returns. Research on volatility and dynamic dependence modeling will continue to encounter statistical and computational challenges; it is necessary to persist in dealing with the 3H (high dimension, high frequency, high complexity) paradigm in modeling.
{"title":"Volatility and dynamic dependence modeling: Review, applications, and financial risk management","authors":"Mike K. P. So, Amanda M. Y. Chu, Cliff C. Y. Lo, Chun Yin Ip","doi":"10.1002/wics.1567","DOIUrl":"https://doi.org/10.1002/wics.1567","url":null,"PeriodicalId":47779,"journal":{"name":"Wiley Interdisciplinary Reviews-Computational Statistics","volume":" ","pages":""},"PeriodicalIF":3.2,"publicationDate":"2021-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1002/wics.1567","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43155204","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
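For readers unfamiliar with the ARCH models mentioned in the abstract above, a minimal simulation of the simplest member of the family, ARCH(1) with Gaussian innovations, may help: the conditional variance is sigma2_t = omega + alpha * r_{t-1}^2 and the return is r_t = sqrt(sigma2_t) * z_t. The function names, parameter values, and the choice of starting the recursion from a zero return are assumptions made for illustration only; the review itself covers far richer models.

```python
import math
import random


def simulate_arch1(n, omega, alpha, rng):
    """Simulate n returns and conditional variances from a Gaussian ARCH(1) model."""
    if omega <= 0.0 or not 0.0 <= alpha < 1.0:
        raise ValueError("need omega > 0 and 0 <= alpha < 1")
    returns, variances = [], []
    previous_return = 0.0  # assumed start value, so the first variance equals omega
    for _ in range(n):
        sigma2 = omega + alpha * previous_return ** 2
        previous_return = math.sqrt(sigma2) * rng.gauss(0.0, 1.0)
        variances.append(sigma2)
        returns.append(previous_return)
    return returns, variances


def arch1_unconditional_variance(omega, alpha):
    """Long-run variance omega / (1 - alpha) of a stationary ARCH(1) process."""
    return omega / (1.0 - alpha)


if __name__ == "__main__":
    rets, _ = simulate_arch1(5000, omega=0.2, alpha=0.5, rng=random.Random(42))
    sample_variance = sum(r * r for r in rets) / len(rets)
    print(sample_variance, arch1_unconditional_variance(0.2, 0.5))
```

Large returns raise the next period's variance, which is the volatility clustering that the ARCH family was introduced to capture.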
Taeho Kim, Benjamin Lieberman, G. Luta, Edsel A. Peña
This paper provides a review of the literature regarding methods for constructing prediction intervals for counting variables, with particular focus on those whose distributions are Poisson or derived from Poisson and with an over‐dispersion property. Independent and identically distributed models and regression models are both considered. The motivating problem for this review is that of predicting the number of daily and cumulative cases or deaths attributable to COVID‐19 at a future date.
{"title":"Prediction intervals for Poisson‐based regression models","authors":"Taeho Kim, Benjamin Lieberman, G. Luta, Edsel A. Peña","doi":"10.1002/wics.1568","DOIUrl":"https://doi.org/10.1002/wics.1568","url":null,"PeriodicalId":47779,"journal":{"name":"Wiley Interdisciplinary Reviews-Computational Statistics","volume":" ","pages":""},"PeriodicalIF":3.2,"publicationDate":"2021-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1002/wics.1568","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44356294","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
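As a baseline for the prediction problem described in the abstract above, the sketch below computes a naive plug-in prediction interval for a future count under an independent and identically distributed Poisson model: the rate is replaced by the sample mean and the interval endpoints are Poisson quantiles. This is deliberately the simplest possible construction, not a method attributed to the review; it ignores both the uncertainty in the estimated rate and any over-dispersion, so its actual coverage can fall below the nominal level. The function names are hypothetical.

```python
import math


def poisson_quantile(p, lam):
    """Smallest integer k with P(Y <= k) >= p for Y ~ Poisson(lam)."""
    if not 0.0 < p < 1.0 or lam <= 0.0:
        raise ValueError("need 0 < p < 1 and lam > 0")
    k = 0
    pmf = math.exp(-lam)  # P(Y = 0); underflows for very large lam
    cdf = pmf
    while cdf < p:
        k += 1
        pmf *= lam / k  # recursion P(Y = k) = P(Y = k - 1) * lam / k
        cdf += pmf
    return k


def plugin_prediction_interval(counts, level=0.95):
    """Equal-tailed plug-in prediction interval for one future Poisson count."""
    lam_hat = sum(counts) / len(counts)
    tail = (1.0 - level) / 2.0
    return poisson_quantile(tail, lam_hat), poisson_quantile(1.0 - tail, lam_hat)


if __name__ == "__main__":
    print(plugin_prediction_interval([3, 5, 4, 4]))
```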
Identifying and tracking community structures in complex networks is one of the cornerstones of network studies, spanning multiple disciplines, from statistics to machine learning to social sciences, and involving an even broader range of application areas, from biology to politics to blockchain. This survey paper aims to provide an overview of some of the most popular approaches in statistical network community detection as well as newly emerging research directions such as community extraction with higher‐order features and community discovery in multilayer and multiscale networks. Our goal is to offer a unified view of the methodological interconnections and the wide spectrum of interdisciplinary data science applications of network community analysis.
{"title":"Community detection in complex networks: From statistical foundations to data science applications","authors":"A. K. Dey, Yahui Tian, Y. Gel","doi":"10.1002/wics.1566","DOIUrl":"https://doi.org/10.1002/wics.1566","url":null,"PeriodicalId":47779,"journal":{"name":"Wiley Interdisciplinary Reviews-Computational Statistics","volume":"14 1","pages":""},"PeriodicalIF":3.2,"publicationDate":"2021-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1002/wics.1566","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41631557","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this article, we present an overview of recent advances in Bayesian modeling and analysis of multivariate time series of counts. We discuss basic modeling strategies, including integer‐valued autoregressive processes, multivariate Poisson time series, and dynamic latent factor models. In so doing, we make a connection with univariate modeling frameworks such as dynamic generalized models and Poisson state space models with gamma evolution, and present Bayesian approaches that extend these frameworks to the multivariate setting. During our development, recent Bayesian approaches to the analysis of integer‐valued autoregressive processes and multivariate Poisson models are highlighted, and concepts such as “decouple/recouple” and “common random environment” are presented. The role that these concepts play in Bayesian modeling and analysis of multivariate time series is discussed. Computational issues associated with Bayesian inference and forecasting from these models are also considered.
{"title":"Bayesian modeling of multivariate time series of counts","authors":"R. Soyer, Di Zhang","doi":"10.1002/wics.1559","DOIUrl":"https://doi.org/10.1002/wics.1559","url":null,"PeriodicalId":47779,"journal":{"name":"Wiley Interdisciplinary Reviews-Computational Statistics","volume":" ","pages":""},"PeriodicalIF":3.2,"publicationDate":"2021-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1002/wics.1559","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47699602","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The global minimum variance portfolio (GMVP) is the starting point of the Markowitz mean‐variance efficient frontier. The estimation of the GMVP weights is therefore of great importance for financial investors. The GMVP weights depend only on the inverse covariance matrix of returns on risky financial assets; for this reason, the estimated GMVP weights are subject to substantial estimation risk, especially in high‐dimensional portfolio settings. In this paper we review the recent literature on traditional sample estimators for the unconditional GMVP weights, which are typically based on daily asset returns, as well as on modern realized estimators for the conditional GMVP weights based on intraday high‐frequency returns. We present various types of GMVP estimators with the corresponding stochastic results, discuss statistical tests, and point to some directions for further research. Our empirical application illustrates selected properties of realized GMVP weights.
{"title":"Sample and realized minimum variance portfolios: Estimation, statistical inference, and tests","authors":"Vasyl Golosnoy, Bastian Gribisch, M. Seifert","doi":"10.1002/wics.1556","DOIUrl":"https://doi.org/10.1002/wics.1556","url":null,"PeriodicalId":47779,"journal":{"name":"Wiley Interdisciplinary Reviews-Computational Statistics","volume":" ","pages":""},"PeriodicalIF":3.2,"publicationDate":"2021-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1002/wics.1556","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45011827","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
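The statement that the GMVP weights depend only on the inverse covariance matrix can be made concrete with the standard closed form w = S^{-1} 1 / (1' S^{-1} 1), where S is the covariance matrix of returns and 1 is a vector of ones. The sketch below computes these weights for a given covariance matrix by solving S z = 1 and normalizing z to sum to one. It treats S as known, so it says nothing about the estimation risk the paper is concerned with; the helper names `solve_linear` and `gmvp_weights` are hypothetical.

```python
def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[row][j] -= factor * aug[col][j]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][j] * x[j] for j in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def gmvp_weights(cov):
    """Global minimum variance portfolio weights for a covariance matrix."""
    z = solve_linear(cov, [1.0] * len(cov))  # z = S^{-1} 1
    total = sum(z)                           # 1' S^{-1} 1
    return [value / total for value in z]


if __name__ == "__main__":
    # Two uncorrelated assets with variances 1 and 4.
    print(gmvp_weights([[1.0, 0.0], [0.0, 4.0]]))
```

In the uncorrelated two-asset case the weights are inversely proportional to the variances, so the lower-variance asset receives the larger share.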