Pub Date: 2024-08-21 | DOI: 10.1146/annurev-statistics-112723-034555
Naomi Altman, Martin Krzywinski
Points of Significance is an ongoing series of short articles about statistics in Nature Methods that started in 2013. Its aim is to provide clear explanations of essential concepts in statistics for a nonspecialist audience. The articles favor heuristic explanations and make extensive use of simulated examples and graphical explanations, while maintaining mathematical rigor. Topics range from basic, but often misunderstood, such as uncertainty and p-values, to relatively advanced, but often neglected, such as the error-in-variables problem and the curse of dimensionality. More recent articles have focused on timely topics such as modeling of epidemics, machine learning, and neural networks. In this article, we discuss the evolution of topics and details behind some of the story arcs, our approach to crafting statistical explanations and narratives, and our use of figures and numerical simulations as props for building understanding.
Title: Crafting 10 Years of Statistics Explanations: Points of Significance. Journal: Annual Review of Statistics and Its Application. URL: https://doi.org/10.1146/annurev-statistics-112723-034555
Pub Date: 2024-08-19 | DOI: 10.1146/annurev-statistics-112723-034507
Susan M. Paddock, Carolina Franco, F. Jay Breidt, Brenda Betancourt
Health policy evidence-building requires data sources such as health care claims, electronic health records, probability and nonprobability survey data, epidemiological surveillance databases, administrative data, and more, all of which have strengths and limitations for a given policy analysis. Data integration techniques leverage the relative strengths of input sources to obtain a blended source that is richer, more informative, and more fit for use than any single input component. This review notes the expansion of opportunities to use data integration for health policy analyses, reviews key methodological approaches to expand the number of variables in a data set or to increase the precision of estimates, and provides directions for future research. As data quality improvement motivates data integration, key data quality frameworks are provided to structure assessments of candidate input data sources.
Title: Statistical Data Integration for Health Policy Evidence-Building. Journal: Annual Review of Statistics and Its Application. URL: https://doi.org/10.1146/annurev-statistics-112723-034507
Pub Date: 2024-04-24 | DOI: 10.1146/annurev-statistics-040522-114848
Serge Aleshin-Guendel, Rebecca C. Steorts
Entity resolution is the process of merging and removing duplicate records from multiple data sources, often in the absence of unique identifiers. Bayesian models for entity resolution allow one to include a priori information, quantify uncertainty in important applications, and directly estimate a partition of the records. Markov chain Monte Carlo (MCMC) sampling is the primary computational method for approximate posterior inference in this setting, but due to the high dimensionality of the space of partitions, there are no agreed upon standards for diagnosing nonconvergence of MCMC sampling. In this article, we review Bayesian entity resolution, with a focus on the specific challenges that it poses for the convergence of a Markov chain. We review prior methods for convergence diagnostics, discussing their weaknesses. We provide recommendations for using MCMC sampling for Bayesian entity resolution, focusing on the use of modern diagnostics that are commonplace in applied Bayesian statistics. Using simulated data, we find that a commonly used Gibbs sampler performs poorly compared with two alternatives.
Title: Convergence Diagnostics for Entity Resolution. Journal: Annual Review of Statistics and Its Application. URL: https://doi.org/10.1146/annurev-statistics-040522-114848
Pub Date: 2024-04-24 | DOI: 10.1146/annurev-statistics-040522-101020
Colin Aitken, Franco Taroni, Silvia Bozza
The use of the Bayes factor as a metric for the assessment of the probative value of forensic scientific evidence is largely supported by recommended standards in different disciplines. The application of Bayesian networks enables the consideration of problems of increasing complexity. The lack of a widespread consensus concerning key aspects of evidence evaluation and interpretation, such as the adequacy of a probabilistic framework for handling uncertainty or the method by which conclusions regarding the strength of the evidence should be reported to a court, has meant the role of the Bayes factor in the administration of criminal justice has come under increasing challenge in recent years. We review the many advantages the Bayes factor has as an approach to the evaluation and interpretation of evidence.
Title: The Role of the Bayes Factor in the Evaluation of Evidence. Journal: Annual Review of Statistics and Its Application. URL: https://doi.org/10.1146/annurev-statistics-040522-101020
Pub Date: 2023-11-29 | DOI: 10.1146/annurev-statistics-040522-022138
Zheng Tracy Ke, Pengsheng Ji, Jiashun Jin, Wanshan Li
Text analysis is an interesting research area in data science and has various applications, such as in artificial intelligence, biomedical research, and engineering. We review popular methods for text analysis, ranging from topic modeling to the recent neural language models. In particular, we review Topic-SCORE, a statistical approach to topic modeling, and discuss how to use it to analyze the Multi-Attribute Data Set on Statisticians (MADStat), a data set on statistical publications that we collected and cleaned. The application of Topic-SCORE and other methods to MADStat leads to interesting findings. For example, we identified 11 representative topics in statistics. For each journal, the evolution of topic weights over time can be visualized, and these results are used to analyze the trends in statistical research. In particular, we propose a new statistical model for ranking the citation impacts of 11 topics, and we also build a cross-topic citation graph to illustrate how research results on different topics spread to one another. The results on MADStat provide a data-driven picture of the statistical research from 1975 to 2015, from a text analysis perspective. Expected final online publication date for the Annual Review of Statistics and Its Application, Volume 11 is March 2024. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
Title: Recent Advances in Text Analysis. Journal: Annual Review of Statistics and Its Application. URL: https://doi.org/10.1146/annurev-statistics-040522-022138
Pub Date: 2023-11-29 | DOI: 10.1146/annurev-statistics-040522-115238
Marina Meilă, Hanyu Zhang
Manifold learning (ML), also known as nonlinear dimension reduction, is a set of methods to find the low-dimensional structure of data. Dimension reduction for large, high-dimensional data is not merely a way to reduce the data; the new representations and descriptors obtained by ML reveal the geometric shape of high-dimensional point clouds and allow one to visualize, denoise, and interpret them. This review presents the underlying principles of ML, its representative methods, and their statistical foundations, all from a practicing statistician's perspective. It describes the trade-offs and what theory tells us about the parameter and algorithmic choices we make in order to obtain reliable conclusions.
Title: Manifold Learning: What, How, and Why. Journal: Annual Review of Statistics and Its Application. URL: https://doi.org/10.1146/annurev-statistics-040522-115238
Pub Date: 2023-11-29 | DOI: 10.1146/annurev-statistics-040722-052011
Claudia R. Schneider, John R. Kerr, Sarah Dryhurst, John A.D. Aston
This review provides an overview of concepts relating to the communication of statistical and empirical evidence in times of crisis, with a special focus on COVID-19. In it, we consider topics relating to both the communication of numbers, such as the role of format, context, comparisons, and visualization, and the communication of evidence more broadly, such as evidence quality, the influence of changes in available evidence, transparency, and repeated decision-making. A central focus is on the communication of the inherent uncertainties in statistical analysis, especially in rapidly changing informational environments during crises. We present relevant literature on these topics and draw connections to the communication of statistics and empirical evidence during the COVID-19 pandemic and beyond. We finish by suggesting some considerations for those faced with communicating statistics and evidence in times of crisis.
Title: Communication of Statistics and Evidence in Times of Crisis. Journal: Annual Review of Statistics and Its Application. URL: https://doi.org/10.1146/annurev-statistics-040722-052011
Pub Date: 2023-11-29 | DOI: 10.1146/annurev-statistics-032921-040851
Lance A. Waller
Maps provide a data framework for the statistical analysis of georeferenced data observations. Since the middle of the twentieth century, the field of spatial statistics has evolved to address key inferential questions relating to spatially defined data, yet many central statistical properties do not translate to spatially indexed and spatially correlated data, and the development of statistical inference for mapped data remains an active area of research. Rather than review statistical techniques, we review the different ways the maps of georeferenced data can influence statistical analysis, focusing especially on maps as data visualization, maps as data structures, and maps as statistics themselves, i.e., summaries of underlying patterns with accompanying uncertainty. The categories provide connections to disparate literatures addressing spatial analysis including data visualization, cartography, spatial statistics, and geography. We find maps integrate spatial analysis from motivating questions, informing analytic methods, and providing context for results.
Title: Maps: A Statistical View. Journal: Annual Review of Statistics and Its Application. URL: https://doi.org/10.1146/annurev-statistics-032921-040851
Pub Date: 2023-11-28 | DOI: 10.1146/annurev-statistics-040522-020722
Sean L. Simpson, Heather M. Shappell, Mohsen Bahrami
The recent fusion of network science and neuroscience has catalyzed a paradigm shift in how we study the brain and led to the field of brain network analysis. Brain network analyses hold great potential in helping us understand normal and abnormal brain function by providing profound clinical insight into links between system-level properties and health and behavioral outcomes. Nonetheless, methods for statistically analyzing networks at the group and individual levels have lagged behind. We have attempted to address this need by developing three complementary statistical frameworks: a mixed modeling framework, a distance regression framework, and a hidden semi-Markov modeling framework. These tools serve as synergistic fusions of statistical approaches with network science methods, providing needed analytic foundations for whole-brain network data. Here we delineate these approaches, briefly survey related tools, and discuss potential future avenues of research. We hope this review catalyzes further statistical interest and methodological development in the field.
Title: Statistical Brain Network Analysis. Journal: Annual Review of Statistics and Its Application. URL: https://doi.org/10.1146/annurev-statistics-040522-020722
Pub Date: 2023-11-28 | DOI: 10.1146/annurev-statistics-040722-060248
Federica Bianchi, Edoardo Filippi-Mazzola, Alessandro Lomi, Ernst C. Wit
Advances in information technology have increased the availability of time-stamped relational data, such as those produced by email exchanges or interaction through social media. Whereas the associated information flows could be aggregated into cross-sectional panels, the temporal ordering of the events frequently contains information that requires new models for the analysis of continuous-time interactions, subject to both endogenous and exogenous influences. The introduction of the relational event model (REM) has been a major development that has stimulated new questions and led to further methodological developments. In this review, we track the intellectual history of the REM, define its core properties, and discuss why and how it has been considered useful in empirical research. We describe how the demands of novel applications have stimulated methodological, computational, and inferential advancements.
Title: Relational Event Modeling. Journal: Annual Review of Statistics and Its Application. URL: https://doi.org/10.1146/annurev-statistics-040722-060248