Multivariate differential association analysis
Pub Date: 2024-06-01 | Epub Date: 2024-06-07 | DOI: 10.1002/sta4.704
Hoseung Song, Michael C Wu
Identifying how dependence relationships vary across different conditions plays a significant role in many scientific investigations. For example, in comparing biological systems it is important to see whether relationships between genomic features differ between cases and controls. In this paper, we seek to evaluate whether the relationships between two sets of variables differ across two conditions. Specifically, we ask: do two sets of high-dimensional variables have similar dependence relationships across two conditions? We propose a new kernel-based test to capture such differential dependence: the test determines whether two measures that detect dependence relationships are similar or not under the two conditions. We introduce the asymptotic permutation null distribution of the test statistic and show that it works well in finite samples, making the test computationally efficient and significantly enhancing its usability for large datasets. We demonstrate through numerical studies that the proposed test has high power for detecting differential linear and non-linear relationships. The proposed method is implemented in the R package kerDAA.
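To make the recipe concrete, here is a minimal sketch in Python, assuming HSIC as the kernel dependence measure and brute-force label permutations for calibration; the paper's actual statistic and its asymptotic permutation null (as implemented in kerDAA) differ in the details, and all function names here are illustrative.

```python
import numpy as np

def rbf_gram(A, gamma=None):
    # Pairwise squared distances, then a Gaussian (RBF) kernel; the median
    # heuristic sets the bandwidth when gamma is not supplied.
    sq = np.sum(A ** 2, axis=1, keepdims=True)
    d2 = np.maximum(sq + sq.T - 2.0 * A @ A.T, 0.0)
    if gamma is None:
        gamma = 1.0 / np.median(d2[d2 > 0])
    return np.exp(-gamma * d2)

def hsic(X, Y):
    # Biased HSIC estimate: a kernel measure of dependence between X and Y.
    n = X.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    return np.trace(H @ rbf_gram(X) @ H @ rbf_gram(Y)) / (n - 1) ** 2

def differential_dependence_test(X1, Y1, X2, Y2, n_perm=500, seed=0):
    # Permutation p-value for the gap |HSIC(condition 1) - HSIC(condition 2)|.
    rng = np.random.default_rng(seed)
    obs = abs(hsic(X1, Y1) - hsic(X2, Y2))
    X, Y, n1 = np.vstack([X1, X2]), np.vstack([Y1, Y2]), X1.shape[0]
    hits = 0
    for _ in range(n_perm):
        idx = rng.permutation(X.shape[0])        # shuffle condition labels
        a, b = idx[:n1], idx[n1:]
        hits += abs(hsic(X[a], Y[a]) - hsic(X[b], Y[b])) >= obs
    return (1 + hits) / (1 + n_perm)
```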
{"title":"Multivariate differential association analysis.","authors":"Hoseung Song, Michael C Wu","doi":"10.1002/sta4.704","DOIUrl":"10.1002/sta4.704","url":null,"abstract":"<p><p>Identifying how dependence relationships vary across different conditions plays a significant role in many scientific investigations. For example, it is important for the comparison of biological systems to see if relationships between genomic features differ between cases and controls. In this paper, we seek to evaluate whether relationships between two sets of variables are different or not across two conditions. Specifically, we assess: <i>do two sets of high-dimensional variables have similar dependence relationships across two conditions?</i> We propose a new kernel-based test to capture the differential dependence. Specifically, the new test determines whether two measures that detect dependence relationships are similar or not under two conditions. We introduce the asymptotic permutation null distribution of the test statistic and it is shown to work well under finite samples such that the test is computationally efficient, significantly enhancing its usability in analyzing large datasets. We demonstrate through numerical studies that our proposed test has high power for detecting differential linear and non-linear relationships. The proposed method is implemented in an R package kerDAA.</p>","PeriodicalId":56159,"journal":{"name":"Stat","volume":"13 2","pages":""},"PeriodicalIF":0.7,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11661859/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142878013","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Model selection for generalized linear models with weak factors
Pub Date: 2024-05-29 | DOI: 10.1002/sta4.697
Xin Zhou, Yan Dong, Qin Yu, Zemin Zheng
The literature has witnessed an upsurge of interest in model selection across diverse fields and optimization applications. Despite substantial progress, model selection remains challenging when covariates are highly correlated, particularly in economic and financial datasets that exhibit cross‐sectional and serial dependence. In this paper, we introduce a novel methodology, factor augmented regularized model selection with weak factors (WeakFARM), for generalized linear models whose correlated covariates have a weak latent factor structure. By identifying weak latent factors and idiosyncratic components and employing them as predictors, WeakFARM converts model selection with highly correlated covariates into model selection with weakly correlated ones. Furthermore, we develop a variable screening method based on WeakFARM. Comprehensive theoretical guarantees, including estimation consistency, model selection consistency and the sure screening property, are also provided. We demonstrate the effectiveness of our approach through extensive simulation studies and a real data application in economic forecasting.
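For intuition, here is a rough sketch of the factor-augmentation step, assuming plain PCA-style factor extraction and an off-the-shelf l1-penalized logistic GLM from scikit-learn; WeakFARM's treatment of weak factors and its screening procedure are more refined than this, and the simulated data below is purely illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegressionCV

def factor_augment(X, k):
    # Split X into an estimated factor part and idiosyncratic residuals via SVD.
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    F = U[:, :k] * s[:k]            # estimated factors (n x k)
    E = Xc - F @ Vt[:k, :]          # weakly correlated idiosyncratic part (n x p)
    return F, E

# Simulated correlated design: covariates share k latent factors.
rng = np.random.default_rng(1)
n, p, k = 300, 50, 2
F_true = rng.normal(size=(n, k))
X = F_true @ rng.normal(size=(k, p)) + rng.normal(size=(n, p))
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(X[:, 0] - X[:, 1]))))

# Regularized GLM on the augmented, decorrelated design [F, E].
F, E = factor_augment(X, k)
fit = LogisticRegressionCV(penalty="l1", solver="saga", cv=5, max_iter=5000)
fit.fit(np.hstack([F, E]), y)
print(np.sum(np.abs(fit.coef_) > 1e-6), "selected predictors")
```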
{"title":"Model selection for generalized linear models with weak factors","authors":"Xin Zhou, Yan Dong, Qin Yu, Zemin Zheng","doi":"10.1002/sta4.697","DOIUrl":"https://doi.org/10.1002/sta4.697","url":null,"abstract":"The literature has witnessed an upsurge of interest in model selection in diverse fields and optimization applications. Despite the substantial progress, model selection remains a significant challenge when covariates are highly correlated, particularly within economic and financial datasets that exhibit cross‐sectional and serial dependency. In this paper, we introduce a novel methodology named factor augmented regularized model selection with weak factors (WeakFARM) for generalized linear models in the presence of correlated covariates with weak latent factor structure. By identifying weak latent factors and idiosyncratic components and employing them as predictors, WeakFARM converts the challenge from model selection with highly correlated covariates to that with weakly correlated ones. Furthermore, we develop a variable screening method based on the proposed WeakFARM method. Comprehensive theoretical guarantees including estimation consistency, model selection consistency and sure screening property are also provided. We demonstrate the effectiveness of our approach by extensive simulation studies and a real data application in economic forecasting.","PeriodicalId":56159,"journal":{"name":"Stat","volume":"48 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141196624","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tell your story: Metrics of success for academic data science collaboration and consulting programs
Pub Date: 2024-05-29 | DOI: 10.1002/sta4.686
Mara Rojeski Blake, Emily Griffith, Steven J. Pierce, Rachel Levy, Micaela Parker, Marianne Huebner
Measuring success plays a central role in justifying and advocating for a statistical or data science consulting or collaboration program (SDSP) within an academic institution. We present several specific metrics that can be reported to targeted audiences to tell the story of a robust and sustainable program's success. While gathering such metrics presents challenges, we discuss potential data sources and possible practices that SDSPs can use to inform their own approaches. Emphasizing essential metrics for reporting, we also share the metric-gathering and reporting practices of two programs in greater detail. New or existing SDSPs should evaluate their local environments and tailor their practices for gathering, analysing and reporting success metrics accordingly. This approach provides a strong foundation for using success metrics to tell compelling stories about the SDSP and enhance program sustainability. Success metrics also offer ample opportunity for future research projects that leverage qualitative methods and consider mechanisms for adapting to the changing landscape of data science.
{"title":"Tell your story: Metrics of success for academic data science collaboration and consulting programs","authors":"Mara Rojeski Blake, Emily Griffith, Steven J. Pierce, Rachel Levy, Micaela Parker, Marianne Huebner","doi":"10.1002/sta4.686","DOIUrl":"https://doi.org/10.1002/sta4.686","url":null,"abstract":"Measuring success plays a central role in justifying and advocating for a statistical or data science consulting or collaboration program (SDSP) within an academic institution. We present several specific metrics to report to targeted audiences to tell the story for success of a robust and sustainable program. While gathering such metrics includes challenges, we discuss potential data sources and possible practices for SDSPs to inform their own approaches. Emphasizing essential metrics for reporting, we also share the metric gathering and reporting practices of two programs in greater detail. New or existing SDSPs should evaluate their local environments and tailor their practice to gathering, analysing and reporting success metrics accordingly. This approach provides a strong foundation to use success metrics to tell compelling stories about the SDSP and enhance program sustainability. The area of success metrics provides ample opportunity for future research projects that leverage qualitative methods and consider mechanisms for adapting to the changing landscape of data science.","PeriodicalId":56159,"journal":{"name":"Stat","volume":"42 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141196627","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Robust Bayesian nonparametric variable selection for linear regression
Pub Date: 2024-05-28 | DOI: 10.1002/sta4.696
Alberto Cabezas, Marco Battiston, Christopher Nemeth
Spike‐and‐slab and horseshoe regressions are arguably the most popular Bayesian variable selection approaches for linear regression models. However, their performance can deteriorate when outliers and heteroskedasticity are present in the data, which are common features of many real‐world statistics and machine learning applications. This work proposes a Bayesian nonparametric approach to linear regression that performs variable selection while accounting for outliers and heteroskedasticity. Our proposed model is an instance of a Dirichlet process scale mixture model, with the advantage that the full conditional distributions of all parameters can be derived in closed form, yielding an efficient Gibbs sampler for posterior inference. Moreover, we show how to extend the model to account for heavy‐tailed response variables. The model's performance is tested against competing algorithms on synthetic and real‐world datasets.
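The closed-form conditionals are what make the Gibbs sampler efficient. As a flavour of how a scale-mixture model yields them, here is a minimal sketch, assuming a simplified Student-t error model (a normal scale mixture with Gamma mixing, a unit-normal prior on the coefficients and a weak inverse-gamma prior on the variance) rather than the paper's Dirichlet process mixture with variable selection; names and priors here are illustrative.

```python
import numpy as np

def gibbs_t_regression(X, y, nu=4.0, n_iter=2000, seed=0):
    # Student-t errors written as a normal scale mixture:
    #   y_i ~ N(x_i' beta, sigma2 / lam_i),  lam_i ~ Gamma(nu/2, rate nu/2).
    rng = np.random.default_rng(seed)
    n, p = X.shape
    beta, sigma2, lam = np.zeros(p), 1.0, np.ones(n)
    draws = []
    for _ in range(n_iter):
        # beta | rest: Gaussian, in weighted least-squares form
        # (the unit-normal prior on beta keeps the posterior proper).
        W = lam / sigma2
        V = np.linalg.inv((X.T * W) @ X + np.eye(p))
        beta = rng.multivariate_normal(V @ ((X.T * W) @ y), V)
        # lam_i | rest: conjugate Gamma update; large residuals get small
        # lam_i, which is how outliers are automatically downweighted.
        r2 = (y - X @ beta) ** 2
        lam = rng.gamma((nu + 1.0) / 2.0, 2.0 / (nu + r2 / sigma2))
        # sigma2 | rest: inverse-gamma update under a weak IG(1, 1) prior.
        sigma2 = 1.0 / rng.gamma(n / 2.0 + 1.0,
                                 1.0 / (np.sum(lam * r2) / 2.0 + 1.0))
        draws.append(beta.copy())
    return np.array(draws)
```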
{"title":"Robust Bayesian nonparametric variable selection for linear regression","authors":"Alberto Cabezas, Marco Battiston, Christopher Nemeth","doi":"10.1002/sta4.696","DOIUrl":"https://doi.org/10.1002/sta4.696","url":null,"abstract":"Spike‐and‐slab and horseshoe regressions are arguably the most popular Bayesian variable selection approaches for linear regression models. However, their performance can deteriorate if outliers and heteroskedasticity are present in the data, which are common features in many real‐world statistics and machine learning applications. This work proposes a Bayesian nonparametric approach to linear regression that performs variable selection while accounting for outliers and heteroskedasticity. Our proposed model is an instance of a Dirichlet process scale mixture model with the advantage that we can derive the full conditional distributions of all parameters in closed‐form, hence producing an efficient Gibbs sampler for posterior inference. Moreover, we present how to extend the model to account for heavy‐tailed response variables. The model's performance is tested against competing algorithms on synthetic and real‐world datasets.","PeriodicalId":56159,"journal":{"name":"Stat","volume":"47 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141196648","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Working well with statisticians: Perceptions of practicing statisticians on their most successful collaborations
Pub Date: 2024-05-27 | DOI: 10.1002/sta4.694
Ryan A. Peterson, Emily Slade, Gina‐Maria Pomann, Walter T. Ambrosius
Statistical collaboration requires statisticians to work and communicate effectively with nonstatisticians, which can be challenging for many reasons. To identify common themes and lessons for working smoothly with nonstatistician collaborators, two focus groups of primarily academic collaborative statisticians were held. We identified qualities of collaborations that tend to yield fruitful relationships and those that tend to yield nothing (or worse, leave one or both parties dissatisfied). The initial goal was to share helpful knowledge and individual experiences that can facilitate more successful collaborative relationships for statisticians who work within academic statistical collaboration units. These findings were used to design a follow‐up survey collecting perspectives from a wider set of practicing statisticians on important qualities to consider when assessing potential collaborations. In this survey, we found widespread agreement on many good and bad qualities to promote and discourage, respectively. Interestingly, some negative and positive collaboration qualities were less agreed upon, suggesting that in such cases a mix‐and‐match approach to pairing domain experts with statisticians could alleviate friction and statistician burnout in team science settings. The perceived importance of some collaboration characteristics differed between faculty and staff, while others depended on experience.
{"title":"Working well with statisticians: Perceptions of practicing statisticians on their most successful collaborations","authors":"Ryan A. Peterson, Emily Slade, Gina‐Maria Pomann, Walter T. Ambrosius","doi":"10.1002/sta4.694","DOIUrl":"https://doi.org/10.1002/sta4.694","url":null,"abstract":"Statistical collaboration requires statisticians to work and communicate effectively with nonstatisticians, which can be challenging for many reasons. To identify common themes and lessons for working smoothly with nonstatistician collaborators, two focus groups of primarily academic collaborative statisticians were held. We identified qualities of collaborations that tend to yield fruitful relationships and those that tend to yield nothing (or worse, with one or both parties being dissatisfied). The initial goal was to share helpful knowledge and individual experiences that can facilitate more successful collaborative relationships for statisticians who work within academic statistical collaboration units. These findings were used to design a follow‐up survey to collect perspectives from a wider set of practicing statisticians on important qualities to consider when assessing potential collaborations. In this survey of practicing statisticians, we found widespread agreement on many good and bad qualities to promote and discourage, respectively. Interestingly, some negative and positive collaboration qualities were less agreed upon, suggesting that in such cases, a mix‐and‐match approach of domain experts to statisticians could alleviate friction and statistician burnout in team science settings. The perceived importance of some collaboration characteristics differed between faculty and staff, while others depended on experience.","PeriodicalId":56159,"journal":{"name":"Stat","volume":"51 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141167300","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Do good: Strategies for leading an inclusive data science or statistics consulting team
Pub Date: 2024-05-13 | DOI: 10.1002/sta4.687
Christina Maimone, Julia L. Sharp, Ofira Schwartz‐Soicher, Jeffrey C. Oliver, Lencia Beltran
Leading a data science or statistical consulting team in an academic environment can present many challenges, including institutional infrastructure, funding and technical expertise. Even in the most challenging environment, however, leading such a team with inclusive practices can be rewarding for the leader, the team members and collaborators. We describe nine leadership and management practices that are especially relevant to the dynamics of data science or statistics consulting teams in an academic environment: ensuring people get credit, making tacit knowledge explicit, establishing clear performance review processes, championing career development, empowering team members to work autonomously, learning from diverse experiences, supporting team members in navigating power dynamics, having difficult conversations and developing foundational management skills. Active engagement in these areas will help those who lead data science or statistics consulting groups – whether faculty or staff, regardless of title – create and support inclusive teams.
{"title":"Do good: Strategies for leading an inclusive data science or statistics consulting team","authors":"Christina Maimone, Julia L. Sharp, Ofira Schwartz‐Soicher, Jeffrey C. Oliver, Lencia Beltran","doi":"10.1002/sta4.687","DOIUrl":"https://doi.org/10.1002/sta4.687","url":null,"abstract":"Leading a data science or statistical consulting team in an academic environment can have many challenges, including institutional infrastructure, funding and technical expertise. Even in the most challenging environment, however, leading such a team with inclusive practices can be rewarding for the leader, the team members and collaborators. We describe nine leadership and management practices that are especially relevant to the dynamics of data science or statistics consulting teams and an academic environment: ensuring people get credit, making tacit knowledge explicit, establishing clear performance review processes, championing career development, empowering team members to work autonomously, learning from diverse experiences, supporting team members in navigating power dynamics, having difficult conversations and developing foundational management skills. Active engagement in these areas will help those who lead data science or statistics consulting groups – whether faculty or staff, regardless of title – create and support inclusive teams.","PeriodicalId":56159,"journal":{"name":"Stat","volume":"24 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140936203","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
High‐dimensional differential networks with sparsity and reduced‐rank
Pub Date: 2024-05-13 | DOI: 10.1002/sta4.690
Yao Wang, Cheng Wang, Binyan Jiang
Differential network analysis plays a crucial role in capturing nuanced changes in conditional correlations between two samples. In the high‐dimensional setting, the differential network, that is, the difference between the two precision matrices, is usually characterized by sparse signals together with some low‐rank latent factors. Recognizing the distinctions inherent in the precision matrices of such networks, we introduce a novel approach, termed ‘SR‐Network’, for the estimation of sparse and reduced‐rank differential networks. This method directly estimates the differential network by formulating a convex empirical loss function with ℓ1‐norm and nuclear norm penalties. The study establishes finite‐sample error bounds for parameter estimation and highlights the superior performance of the proposed method through extensive simulations and real data studies. This research significantly contributes to the advancement of methodologies for the accurate analysis of differential networks, particularly for structures characterized by sparsity and low‐rank features.
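As a rough illustration of sparse-plus-reduced-rank estimation, here is a proximal-gradient sketch, assuming the D-trace loss common in the differential-network literature (the paper's exact empirical loss may differ); `sr_network` and its arguments are illustrative names, with soft-thresholding serving as the ℓ1 prox and singular value thresholding as the nuclear-norm prox.

```python
import numpy as np

def soft(A, t):
    # Entrywise soft-thresholding: the proximal operator of the l1 norm.
    return np.sign(A) * np.maximum(np.abs(A) - t, 0.0)

def svt(A, t):
    # Singular value thresholding: the proximal operator of the nuclear norm.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U * np.maximum(s - t, 0.0)) @ Vt

def sr_network(S1, S2, lam=0.1, tau=0.5, n_iter=500):
    # Estimate Delta = Omega2 - Omega1 as sparse S plus low-rank L from the
    # sample covariances S1, S2, using the smooth D-trace loss
    #   0.5 * tr(S1 D S2 D) - tr(D (S1 - S2)),
    # whose unpenalized minimizer is exactly the precision-matrix difference.
    p = S1.shape[0]
    step = 1.0 / (2.0 * np.linalg.norm(S1, 2) * np.linalg.norm(S2, 2))
    S, L = np.zeros((p, p)), np.zeros((p, p))
    for _ in range(n_iter):
        D = S + L
        G = 0.5 * (S1 @ D @ S2 + S2 @ D @ S1) - (S1 - S2)  # gradient in D
        S = soft(S - step * G, step * lam)   # l1 prox on the sparse part
        L = svt(L - step * G, step * tau)    # nuclear prox on the low-rank part
    return S, L
```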
{"title":"High‐dimensional differential networks with sparsity and reduced‐rank","authors":"Yao Wang, Cheng Wang, Binyan Jiang","doi":"10.1002/sta4.690","DOIUrl":"https://doi.org/10.1002/sta4.690","url":null,"abstract":"Differential network analysis plays a crucial role in capturing nuanced changes in conditional correlations between two samples. Under the high‐dimensional setting, the differential network, that is, the difference between the two precision matrices are usually stylized with sparse signals and some low‐rank latent factors. Recognizing the distinctions inherent in the precision matrices of such networks, we introduce a novel approach, termed ‘SR‐Network’ for the estimation of sparse and reduced‐rank differential networks. This method directly assesses the differential network by formulating a convex empirical loss function with ‐norm and nuclear norm penalties. The study establishes finite‐sample error bounds for parameter estimation and highlights the superior performance of the proposed method through extensive simulations and real data studies. This research significantly contributes to the advancement of methodologies for accurate analysis of differential networks, particularly in the context of structures characterized by sparsity and low‐rank features.","PeriodicalId":56159,"journal":{"name":"Stat","volume":"218 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140941729","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Variational inference for the latent shrinkage position model
Pub Date: 2024-05-09 | DOI: 10.1002/sta4.685
Xian Yao Gwee, Isobel Claire Gormley, Michael Fop
The latent position model (LPM) is a popular method for network data analysis in which nodes are assumed to be positioned in a d‐dimensional latent space. The latent shrinkage position model (LSPM) extends the LPM by automatically determining the number of effective dimensions of the latent space via a Bayesian nonparametric shrinkage prior. However, the LSPM's reliance on Markov chain Monte Carlo for inference, while rigorous, is computationally expensive, making it challenging to scale to networks with large numbers of nodes. We introduce a variational inference approach for the LSPM, aiming to reduce computational demands while retaining the model's ability to intrinsically determine the number of effective latent dimensions. The performance of the variational LSPM is illustrated through simulation studies and its application to real‐world network data. To promote wider adoption and ease of implementation, we also provide open‐source code.
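As a toy illustration of how a shrinkage prior can reveal the number of effective dimensions, the snippet below simulates per-dimension variances under a multiplicative gamma process, one common Bayesian nonparametric shrinkage prior; the LSPM's actual prior and its variational updates are not reproduced here, and the threshold is an illustrative device.

```python
import numpy as np

rng = np.random.default_rng(42)
n, d_max, a = 100, 10, 2.5
delta = rng.gamma(a, 1.0, size=d_max)     # multiplicative increments
tau = np.cumprod(delta)                   # precision grows with dimension index
var = 1.0 / tau                           # so variance shrinks with dimension
Z = rng.normal(scale=np.sqrt(var), size=(n, d_max))  # latent positions

# Dimensions whose variance is driven toward zero are effectively "switched
# off"; the remainder gives the inferred number of effective dimensions.
effective = int(np.sum(var > 1e-2))
print(np.round(var, 3), "->", effective, "effective dimensions")
```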
{"title":"Variational inference for the latent shrinkage position model","authors":"Xian Yao Gwee, Isobel Claire Gormley, Michael Fop","doi":"10.1002/sta4.685","DOIUrl":"https://doi.org/10.1002/sta4.685","url":null,"abstract":"The latent position model (LPM) is a popular method used in network data analysis where nodes are assumed to be positioned in a ‐dimensional latent space. The latent shrinkage position model (LSPM) is an extension of the LPM which automatically determines the number of effective dimensions of the latent space via a Bayesian nonparametric shrinkage prior. However, the LSPM's reliance on Markov chain Monte Carlo for inference, while rigorous, is computationally expensive, making it challenging to scale to networks with large numbers of nodes. We introduce a variational inference approach for the LSPM, aiming to reduce computational demands while retaining the model's ability to intrinsically determine the number of effective latent dimensions. The performance of the variational LSPM is illustrated through simulation studies and its application to real‐world network data. To promote wider adoption and ease of implementation, we also provide open‐source code.","PeriodicalId":56159,"journal":{"name":"Stat","volume":"5 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140936205","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A guide to successful management of collaborative partnerships in quantitative research: An illustration of the science of team science
Pub Date: 2024-05-09 | DOI: 10.1002/sta4.674
Alyssa Platt, Tracy Truong, Mary Boulos, Nichole E. Carlson, Manisha Desai, Monica M. Elam, Emily Slade, Alexandra L. Hanlon, Jillian H. Hurst, Maren K. Olsen, Laila M. Poisson, Lacey Rende, Gina‐Maria Pomann
Data‐intensive research continues to expand with the goal of improving healthcare delivery, clinical decision‐making, and patient outcomes. Quantitative scientists, such as biostatisticians, epidemiologists, and informaticists, are tasked with turning data into health knowledge. In academic health centres, quantitative scientists are critical to the missions of biomedical discovery and improvement of health. Many academic health centres have developed centralized Quantitative Science Units that foster the dual goals of developing quantitative scientists professionally and producing high-quality, reproducible domain research. Such units then develop teams of quantitative scientists who can collaborate with researchers. However, the existing literature provides no guidance on how such teams are formed or how to manage and sustain them. Leaders of Quantitative Science Units across six institutions formed a working group to examine common practices and tools that can serve as best practices for Quantitative Science Units that wish to achieve these dual goals by building long‐term partnerships with researchers. The results of this working group are presented to provide tools and guidance for Quantitative Science Units charged with developing, managing, and evaluating Quantitative Science Teams. This guidance aims to help Quantitative Science Units effectively participate in and enhance the research that is conducted throughout the academic health centre—shaping their resources to fit evolving research needs.
{"title":"A guide to successful management of collaborative partnerships in quantitative research: An illustration of the science of team science","authors":"Alyssa Platt, Tracy Truong, Mary Boulos, Nichole E. Carlson, Manisha Desai, Monica M. Elam, Emily Slade, Alexandra L. Hanlon, Jillian H. Hurst, Maren K. Olsen, Laila M. Poisson, Lacey Rende, Gina‐Maria Pomann","doi":"10.1002/sta4.674","DOIUrl":"https://doi.org/10.1002/sta4.674","url":null,"abstract":"Data‐intensive research continues to expand with the goal of improving healthcare delivery, clinical decision‐making, and patient outcomes. Quantitative scientists, such as biostatisticians, epidemiologists, and informaticists, are tasked with turning data into health knowledge. In academic health centres, quantitative scientists are critical to the missions of biomedical discovery and improvement of health. Many academic health centres have developed centralized Quantitative Science Units which foster dual goals of professional development of quantitative scientists and producing high quality, reproducible domain research. Such units then develop teams of quantitative scientists who can collaborate with researchers. However, existing literature does not provide guidance on how such teams are formed or how to manage and sustain them. Leaders of Quantitative Science Units across six institutions formed a working group to examine common practices and tools that can serve as best practices for Quantitative Science Units that wish to achieve these dual goals through building long‐term partnerships with researchers. The results of this working group are presented to provide tools and guidance for Quantitative Science Units challenged with developing, managing, and evaluating Quantitative Science Teams. This guidance aims to help Quantitative Science Units effectively participate in and enhance the research that is conducted throughout the academic health centre—shaping their resources to fit evolving research needs.","PeriodicalId":56159,"journal":{"name":"Stat","volume":"24 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140936210","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An optimal exact interval for the risk ratio in the 2×2 table with structural zero
Pub Date: 2024-05-08 | DOI: 10.1002/sta4.681
Weizhen Wang, Xingyun Cao, Tianfa Xie
The 2×2 table with a structural zero represents a common scenario in clinical trials and epidemiology, characterized by a specific cell that is empty by design. In such cases, the risk ratio serves as a vital parameter for statistical inference. However, existing confidence intervals, such as those constructed through the score test and Bayesian methods, fail to achieve the prescribed nominal level. Our focus is on numerically constructing exact confidence intervals for the risk ratio. We achieve this by optimally combining the modified inferential model method and the ‐function method. The resulting interval is then compared with intervals generated by four existing methods: the score method, the exact score method, the Bayesian tail‐based method and the inferential model method. This comparison is conducted based on the infimum coverage probability, average interval length and non‐coverage probability criteria. Remarkably, our proposed interval outperforms the other exact intervals, being notably shorter. To illustrate the effectiveness of our approach, we discuss two examples in detail.
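For readers unfamiliar with exact intervals, here is a generic sketch of the test-inversion recipe behind them, assuming two independent binomial samples rather than the paper's structural-zero multinomial table, and a plain supremum over the nuisance parameter rather than the paper's optimal combination of methods; every name and grid choice below is illustrative.

```python
import numpy as np
from scipy.stats import binom

def exact_pvalue(x1, n1, x2, n2, r, grid=80):
    # Exact p-value for H0: p1 = r * p2, taking the supremum over the
    # nuisance baseline risk p2 so the test is valid for every p2.
    obs = abs(x1 / n1 - r * x2 / n2)
    k1, k2 = np.arange(n1 + 1), np.arange(n2 + 1)
    t = np.abs(k1[:, None] / n1 - r * k2[None, :] / n2)  # statistic table
    best = 0.0
    for p2 in np.linspace(1e-4, min(1.0, 1.0 / r) - 1e-4, grid):
        joint = np.outer(binom.pmf(k1, n1, r * p2), binom.pmf(k2, n2, p2))
        best = max(best, joint[t >= obs - 1e-12].sum())
    return best

def exact_ci(x1, n1, x2, n2, alpha=0.05):
    # Invert the exact test: keep every candidate ratio that is not rejected.
    rs = np.linspace(0.05, 10.0, 200)
    keep = [r for r in rs if exact_pvalue(x1, n1, x2, n2, r) > alpha]
    return (min(keep), max(keep)) if keep else None

print(exact_ci(7, 10, 3, 12))   # e.g., 7/10 events vs. 3/12 events
```

By construction the coverage of such an interval never falls below the nominal level (the infimum coverage criterion in the abstract); the price is conservatism, which is why the paper's shorter optimal interval is notable.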
{"title":"An optimal exact interval for the risk ratio in the 2×2$$ 2times 2 $$ table with structural zero","authors":"Weizhen Wang, Xingyun Cao, Tianfa Xie","doi":"10.1002/sta4.681","DOIUrl":"https://doi.org/10.1002/sta4.681","url":null,"abstract":"The table with a structural zero represents a common scenario in clinical trials and epidemiology, characterized by a specific empty cell. In such cases, the risk ratio serves as a vital parameter for statistical inference. However, existing confidence intervals, such as those constructed through the score test and Bayesian methods, fail to achieve the prescribed nominal level. Our focus is on numerically constructing exact confidence intervals for the risk ratio. We achieve this by optimally combining the modified inferential model method and the ‐function method. The resulting interval is then compared with intervals generated by four existing methods: the score method, the exact score method, the Bayesian tailed‐based method and the inferential model method. This comparison is conducted based on the infimum coverage probability, average interval length and non‐coverage probability criteria. Remarkably, our proposed interval outperforms other exact intervals, being notably shorter. To illustrate the effectiveness of our approach, we discuss two examples in detail.","PeriodicalId":56159,"journal":{"name":"Stat","volume":"9 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140936202","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}