Stat: Latest Articles

Multivariate differential association analysis
IF 0.7 Zone 4 (Mathematics) Q3 STATISTICS & PROBABILITY Pub Date: 2024-06-01 Epub Date: 2024-06-07 DOI: 10.1002/sta4.704
Hoseung Song, Michael C Wu

Identifying how dependence relationships vary across conditions plays a significant role in many scientific investigations. For example, when comparing biological systems, it is important to see whether relationships between genomic features differ between cases and controls. In this paper, we evaluate whether the relationships between two sets of variables differ across two conditions. Specifically, we ask: do two sets of high-dimensional variables have similar dependence relationships across two conditions? We propose a new kernel-based test to capture differential dependence; the test determines whether two measures of dependence are similar under the two conditions. We introduce the asymptotic permutation null distribution of the test statistic and show that it works well in finite samples, which makes the test computationally efficient and improves its usability for large datasets. We demonstrate through numerical studies that the proposed test has high power for detecting differential linear and non-linear relationships. The proposed method is implemented in the R package kerDAA.
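The idea of comparing a kernel dependence measure across two conditions can be illustrated with a small sketch. The code below is not the kerDAA statistic or its asymptotic null distribution; it is a hypothetical stand-in that assumes a Gaussian kernel with a median-heuristic bandwidth, uses the biased empirical HSIC as the dependence measure, and calibrates the difference by naively permuting condition labels. That permutation scheme is only exact when the two conditions share the same joint distribution under the null. All function names are illustrative.

```python
import numpy as np

def gaussian_gram(x, bandwidth=None):
    """Gaussian kernel Gram matrix; bandwidth defaults to the median heuristic."""
    sq = np.sum(x**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * x @ x.T, 0.0)
    if bandwidth is None:
        positive = d2[d2 > 0]
        med = np.median(positive) if positive.size else 1.0
        bandwidth = np.sqrt(0.5 * med)
    return np.exp(-d2 / (2.0 * bandwidth**2))

def hsic(x, y):
    """Biased empirical HSIC: trace(K H L H) / n^2, with H the centering matrix."""
    n = x.shape[0]
    h = np.eye(n) - np.ones((n, n)) / n
    return np.trace(gaussian_gram(x) @ h @ gaussian_gram(y) @ h) / n**2

def differential_hsic_test(x1, y1, x2, y2, n_perm=200, seed=0):
    """Permutation p-value for |HSIC(x1, y1) - HSIC(x2, y2)| (illustrative only)."""
    rng = np.random.default_rng(seed)
    x = np.vstack([x1, x2])
    y = np.vstack([y1, y2])
    n1 = x1.shape[0]
    observed = abs(hsic(x1, y1) - hsic(x2, y2))
    exceed = 0
    for _ in range(n_perm):
        # Shuffle which condition each (x, y) pair belongs to; pairs stay intact.
        idx = rng.permutation(x.shape[0])
        a, b = idx[:n1], idx[n1:]
        exceed += abs(hsic(x[a], y[a]) - hsic(x[b], y[b])) >= observed
    return observed, (exceed + 1) / (n_perm + 1)
```

Inputs are two-dimensional arrays with one row per sample. The repeated Gram-matrix computation inside the loop is the cost that an asymptotic null distribution, such as the one the abstract describes, avoids.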

Citations: 0
Model selection for generalized linear models with weak factors
IF 1.7 Zone 4 (Mathematics) Q3 STATISTICS & PROBABILITY Pub Date: 2024-05-29 DOI: 10.1002/sta4.697
Xin Zhou, Yan Dong, Qin Yu, Zemin Zheng
The literature has witnessed an upsurge of interest in model selection in diverse fields and optimization applications. Despite the substantial progress, model selection remains a significant challenge when covariates are highly correlated, particularly within economic and financial datasets that exhibit cross‐sectional and serial dependency. In this paper, we introduce a novel methodology named factor augmented regularized model selection with weak factors (WeakFARM) for generalized linear models in the presence of correlated covariates with weak latent factor structure. By identifying weak latent factors and idiosyncratic components and employing them as predictors, WeakFARM converts the challenge from model selection with highly correlated covariates to that with weakly correlated ones. Furthermore, we develop a variable screening method based on the proposed WeakFARM method. Comprehensive theoretical guarantees including estimation consistency, model selection consistency and sure screening property are also provided. We demonstrate the effectiveness of our approach by extensive simulation studies and a real data application in economic forecasting.
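The general factor-augmentation idea, splitting correlated covariates into latent factors and idiosyncratic components and then selecting among the latter, can be sketched as follows. This is not the WeakFARM procedure: it assumes a linear model rather than a generalized linear model, estimates the factors by plain PCA with a known number of factors, and fits the lasso with a basic proximal-gradient loop. The function names `split_factors` and `lasso_ista` are hypothetical.

```python
import numpy as np

def split_factors(x, n_factors):
    """Split centered covariates into PCA factor estimates and idiosyncratic parts."""
    n = x.shape[0]
    xc = x - x.mean(axis=0)
    u, s, vt = np.linalg.svd(xc, full_matrices=False)
    factors = np.sqrt(n) * u[:, :n_factors]              # normalized so F'F / n = I
    loadings = vt[:n_factors].T * s[:n_factors] / np.sqrt(n)
    return factors, xc - factors @ loadings.T            # (F_hat, U_hat)

def lasso_ista(z, y, lam, unpenalized=0, n_iter=500):
    """Minimize (1/2n)||y - z b||^2 + lam * sum_j |b_j| by proximal gradient.

    The first `unpenalized` columns (here, the factors) are not shrunk.
    """
    n, p = z.shape
    step = n / np.linalg.norm(z, 2) ** 2                 # 1 / Lipschitz constant
    thresh = np.full(p, lam * step)
    thresh[:unpenalized] = 0.0
    beta = np.zeros(p)
    for _ in range(n_iter):
        b = beta - step * (z.T @ (z @ beta - y)) / n     # gradient step
        beta = np.sign(b) * np.maximum(np.abs(b) - thresh, 0.0)  # soft threshold
    return beta
```

A typical call stacks the two parts, for example `lasso_ista(np.hstack([f_hat, u_hat]), y - y.mean(), lam=0.1, unpenalized=k)`, so that selection acts on the weakly correlated idiosyncratic columns while the factor columns absorb the shared variation.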
Citations: 0
Tell your story: Metrics of success for academic data science collaboration and consulting programs
IF 1.7 Zone 4 (Mathematics) Q3 STATISTICS & PROBABILITY Pub Date: 2024-05-29 DOI: 10.1002/sta4.686
Mara Rojeski Blake, Emily Griffith, Steven J. Pierce, Rachel Levy, Micaela Parker, Marianne Huebner
Measuring success plays a central role in justifying and advocating for a statistical or data science consulting or collaboration program (SDSP) within an academic institution. We present several specific metrics to report to targeted audiences to tell the story for success of a robust and sustainable program. While gathering such metrics includes challenges, we discuss potential data sources and possible practices for SDSPs to inform their own approaches. Emphasizing essential metrics for reporting, we also share the metric gathering and reporting practices of two programs in greater detail. New or existing SDSPs should evaluate their local environments and tailor their practice to gathering, analysing and reporting success metrics accordingly. This approach provides a strong foundation to use success metrics to tell compelling stories about the SDSP and enhance program sustainability. The area of success metrics provides ample opportunity for future research projects that leverage qualitative methods and consider mechanisms for adapting to the changing landscape of data science.
Citations: 0
Robust Bayesian nonparametric variable selection for linear regression
IF 1.7 Zone 4 (Mathematics) Q3 STATISTICS & PROBABILITY Pub Date: 2024-05-28 DOI: 10.1002/sta4.696
Alberto Cabezas, Marco Battiston, Christopher Nemeth
Spike‐and‐slab and horseshoe regressions are arguably the most popular Bayesian variable selection approaches for linear regression models. However, their performance can deteriorate if outliers and heteroskedasticity are present in the data, which are common features in many real‐world statistics and machine learning applications. This work proposes a Bayesian nonparametric approach to linear regression that performs variable selection while accounting for outliers and heteroskedasticity. Our proposed model is an instance of a Dirichlet process scale mixture model with the advantage that we can derive the full conditional distributions of all parameters in closed‐form, hence producing an efficient Gibbs sampler for posterior inference. Moreover, we present how to extend the model to account for heavy‐tailed response variables. The model's performance is tested against competing algorithms on synthetic and real‐world datasets.
Citations: 0
Working well with statisticians: Perceptions of practicing statisticians on their most successful collaborations
IF 1.7 Zone 4 (Mathematics) Q3 STATISTICS & PROBABILITY Pub Date: 2024-05-27 DOI: 10.1002/sta4.694
Ryan A. Peterson, Emily Slade, Gina‐Maria Pomann, Walter T. Ambrosius
Statistical collaboration requires statisticians to work and communicate effectively with nonstatisticians, which can be challenging for many reasons. To identify common themes and lessons for working smoothly with nonstatistician collaborators, two focus groups of primarily academic collaborative statisticians were held. We identified qualities of collaborations that tend to yield fruitful relationships and those that tend to yield nothing (or worse, with one or both parties being dissatisfied). The initial goal was to share helpful knowledge and individual experiences that can facilitate more successful collaborative relationships for statisticians who work within academic statistical collaboration units. These findings were used to design a follow‐up survey to collect perspectives from a wider set of practicing statisticians on important qualities to consider when assessing potential collaborations. In this survey of practicing statisticians, we found widespread agreement on many good and bad qualities to promote and discourage, respectively. Interestingly, some negative and positive collaboration qualities were less agreed upon, suggesting that in such cases, a mix‐and‐match approach of domain experts to statisticians could alleviate friction and statistician burnout in team science settings. The perceived importance of some collaboration characteristics differed between faculty and staff, while others depended on experience.
Citations: 0
Do good: Strategies for leading an inclusive data science or statistics consulting team
IF 1.7 Zone 4 (Mathematics) Q3 STATISTICS & PROBABILITY Pub Date: 2024-05-13 DOI: 10.1002/sta4.687
Christina Maimone, Julia L. Sharp, Ofira Schwartz‐Soicher, Jeffrey C. Oliver, Lencia Beltran
Leading a data science or statistical consulting team in an academic environment can have many challenges, including institutional infrastructure, funding and technical expertise. Even in the most challenging environment, however, leading such a team with inclusive practices can be rewarding for the leader, the team members and collaborators. We describe nine leadership and management practices that are especially relevant to the dynamics of data science or statistics consulting teams and an academic environment: ensuring people get credit, making tacit knowledge explicit, establishing clear performance review processes, championing career development, empowering team members to work autonomously, learning from diverse experiences, supporting team members in navigating power dynamics, having difficult conversations and developing foundational management skills. Active engagement in these areas will help those who lead data science or statistics consulting groups – whether faculty or staff, regardless of title – create and support inclusive teams.
Citations: 0
High‐dimensional differential networks with sparsity and reduced‐rank
IF 1.7 Zone 4 (Mathematics) Q3 STATISTICS & PROBABILITY Pub Date: 2024-05-13 DOI: 10.1002/sta4.690
Yao Wang, Cheng Wang, Binyan Jiang
Differential network analysis plays a crucial role in capturing nuanced changes in conditional correlations between two samples. Under the high‐dimensional setting, the differential network, that is, the difference between the two precision matrices, is usually stylized with sparse signals and some low‐rank latent factors. Recognizing the distinctions inherent in the precision matrices of such networks, we introduce a novel approach, termed ‘SR‐Network’, for the estimation of sparse and reduced‐rank differential networks. This method directly assesses the differential network by formulating a convex empirical loss function with ℓ1‐norm and nuclear norm penalties. The study establishes finite‐sample error bounds for parameter estimation and highlights the superior performance of the proposed method through extensive simulations and real data studies. This research significantly contributes to the advancement of methodologies for accurate analysis of differential networks, particularly in the context of structures characterized by sparsity and low‐rank features.
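To make the two penalties concrete, here is a small sketch of the proximal operators that optimizers for such penalized convex losses commonly rely on. It assumes the sparsity-inducing penalty is the ℓ1 norm, and it does not reproduce the SR‐Network loss function or algorithm; the function names are illustrative. Soft thresholding shrinks individual entries toward zero (sparsity), while singular value thresholding shrinks singular values toward zero (low rank).

```python
import numpy as np

def soft_threshold(a, tau):
    """Proximal operator of tau * ||.||_1: shrink each entry toward zero by tau."""
    return np.sign(a) * np.maximum(np.abs(a) - tau, 0.0)

def singular_value_threshold(a, tau):
    """Proximal operator of tau * nuclear norm: shrink each singular value by tau."""
    u, s, vt = np.linalg.svd(a, full_matrices=False)
    return (u * np.maximum(s - tau, 0.0)) @ vt
```

In a decomposition of the differential network into a sparse part plus a low-rank part, the first operator would typically be applied to the sparse component and the second to the low-rank component within each iteration.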
Citations: 0
Variational inference for the latent shrinkage position model
IF 1.7 Zone 4 (Mathematics) Q3 STATISTICS & PROBABILITY Pub Date: 2024-05-09 DOI: 10.1002/sta4.685
Xian Yao Gwee, Isobel Claire Gormley, Michael Fop
The latent position model (LPM) is a popular method used in network data analysis where nodes are assumed to be positioned in a ‐dimensional latent space. The latent shrinkage position model (LSPM) is an extension of the LPM which automatically determines the number of effective dimensions of the latent space via a Bayesian nonparametric shrinkage prior. However, the LSPM's reliance on Markov chain Monte Carlo for inference, while rigorous, is computationally expensive, making it challenging to scale to networks with large numbers of nodes. We introduce a variational inference approach for the LSPM, aiming to reduce computational demands while retaining the model's ability to intrinsically determine the number of effective latent dimensions. The performance of the variational LSPM is illustrated through simulation studies and its application to real‐world network data. To promote wider adoption and ease of implementation, we also provide open‐source code.
Citations: 0
A guide to successful management of collaborative partnerships in quantitative research: An illustration of the science of team science
IF 1.7 Zone 4 (Mathematics) Q3 STATISTICS & PROBABILITY Pub Date: 2024-05-09 DOI: 10.1002/sta4.674
Alyssa Platt, Tracy Truong, Mary Boulos, Nichole E. Carlson, Manisha Desai, Monica M. Elam, Emily Slade, Alexandra L. Hanlon, Jillian H. Hurst, Maren K. Olsen, Laila M. Poisson, Lacey Rende, Gina‐Maria Pomann
Data‐intensive research continues to expand with the goal of improving healthcare delivery, clinical decision‐making, and patient outcomes. Quantitative scientists, such as biostatisticians, epidemiologists, and informaticists, are tasked with turning data into health knowledge. In academic health centres, quantitative scientists are critical to the missions of biomedical discovery and improvement of health. Many academic health centres have developed centralized Quantitative Science Units which foster dual goals of professional development of quantitative scientists and producing high quality, reproducible domain research. Such units then develop teams of quantitative scientists who can collaborate with researchers. However, existing literature does not provide guidance on how such teams are formed or how to manage and sustain them. Leaders of Quantitative Science Units across six institutions formed a working group to examine common practices and tools that can serve as best practices for Quantitative Science Units that wish to achieve these dual goals through building long‐term partnerships with researchers. The results of this working group are presented to provide tools and guidance for Quantitative Science Units challenged with developing, managing, and evaluating Quantitative Science Teams. This guidance aims to help Quantitative Science Units effectively participate in and enhance the research that is conducted throughout the academic health centre—shaping their resources to fit evolving research needs.
Citations: 0
An optimal exact interval for the risk ratio in the 2×2 table with structural zero
IF 1.7 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2024-05-08 DOI: 10.1002/sta4.681
Weizhen Wang, Xingyun Cao, Tianfa Xie
The 2×2 table with a structural zero represents a common scenario in clinical trials and epidemiology, characterized by a specific empty cell. In such cases, the risk ratio serves as a vital parameter for statistical inference. However, existing confidence intervals, such as those constructed through the score test and Bayesian methods, fail to achieve the prescribed nominal level. Our focus is on numerically constructing exact confidence intervals for the risk ratio. We achieve this by optimally combining the modified inferential model method and the h‐function method. The resulting interval is then compared with intervals generated by four existing methods: the score method, the exact score method, the Bayesian tailed‐based method and the inferential model method. This comparison is conducted based on the infimum coverage probability, average interval length and non‐coverage probability criteria. Remarkably, our proposed interval outperforms other exact intervals, being notably shorter. To illustrate the effectiveness of our approach, we discuss two examples in detail.
Citations: 0