The variable sampling interval (VSI) exponentially weighted moving average (EWMA) chart, which varies the chart's sampling interval according to the value of the current plotting statistic, detects shifts faster than the standard EWMA chart. Joint monitoring schemes use a single combined statistic for the mean and variance in process monitoring. To simultaneously monitor the mean and variance of a normally distributed process, two VSI EWMA schemes with unknown process parameters, based on (i) Maximum (Max) and (ii) Distance (Dis) type combining functions, are proposed in this paper. Each of these schemes uses a single plotting statistic. The effects of parameter estimation on the performance of the proposed VSI Max EWMA and VSI Dis EWMA schemes are studied using Monte Carlo simulation, in terms of the average time to signal, the standard deviation of the time to signal, the expected average time to signal and the median time to signal. The results show that the proposed schemes identify process shifts more quickly than the existing Max/Dis Shewhart (SH), Max/Dis cumulative sum (CUSUM) and Max/Dis EWMA schemes. The implementation of the proposed schemes is demonstrated using a commercial dataset.
"Proposed variable sampling interval maximum EWMA and distance EWMA charts with unknown process parameters" by R. Parvin, M. Khoo, S. Saha and W. L. Teoh. Stat, 2023-08-16. https://doi.org/10.1002/sta4.605
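The mechanics of a VSI EWMA chart can be sketched as follows. This is a minimal single-chart sketch for monitoring the mean only, not the authors' Max/Dis combined statistic; the function name, parameter values and warning-region rule are illustrative assumptions.

```python
import math

def vsi_ewma(data, mu0=0.0, sigma0=1.0, lam=0.2, L=3.0,
             d_short=0.1, d_long=1.9, warn=0.5):
    """Run a VSI EWMA chart on a stream of observations.

    Returns (signal_index, total_time): the index of the first sample
    whose EWMA statistic exceeds the +/- L control limits (None if no
    signal), and the accumulated sampling time under the VSI rule.
    """
    z = mu0
    t = 0.0
    # steady-state control limit of the EWMA statistic
    limit = L * sigma0 * math.sqrt(lam / (2 - lam))
    for i, x in enumerate(data):
        z = lam * x + (1 - lam) * z
        if abs(z - mu0) > limit:
            return i, t                      # out-of-control signal
        # VSI rule: sample sooner when z falls in the warning region
        t += d_short if abs(z - mu0) > warn * limit else d_long
    return None, t
```

The variable interval is what speeds up detection: after a suspicious point the next sample is taken after `d_short` time units instead of `d_long`.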
Support vector machine (SVM) is one of the most prevalent classification techniques due to its excellent performance. The standard binary SVM has been well studied; however, many real-world multicategory classification problems equally merit attention. In this paper, focusing on the computationally efficient multicategory angle-based SVM model, we first study the statistical properties of model coefficient estimation. Noting the new challenges posed by the widespread presence of distributed data, this paper further develops a distributed smoothed estimation for the multicategory SVM and establishes its theoretical guarantees. The derived asymptotic properties show that our distributed smoothed estimation achieves the same statistical efficiency as the global estimation. Numerical studies demonstrate the highly competitive performance of the proposed distributed smoothed method.
"Statistical inference and distributed implementation for linear multicategory SVM" by Gaoming Sun, Xiaozhou Wang, Yibo Yan and Riquan Zhang. Stat, 2023-08-14. https://doi.org/10.1002/sta4.611
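The two ingredients of such an estimator, a smoothed (differentiable) SVM loss and a distributed combination of local fits, can be sketched for the binary linear case. This is a simplification assuming one-shot averaging and a quadratically smoothed hinge loss, not the authors' angle-based multicategory formulation or their exact estimator; all function names are hypothetical.

```python
import numpy as np

def local_fit(X, y, lam=0.1, h=0.5, lr=0.1, steps=500):
    """Fit a linear SVM on one machine's data by gradient descent on a
    quadratically smoothed hinge loss (differentiable everywhere)."""
    w = np.zeros(X.shape[1])
    n = len(y)
    for _ in range(steps):
        u = y * (X @ w)
        # derivative of the smoothed hinge surrogate of max(0, 1-u):
        # 0 for u >= 1, -1 for u <= 1-h, linear interpolation between
        g = np.where(u >= 1, 0.0,
                     np.where(u <= 1 - h, -1.0, -(1 - u) / h))
        w -= lr * ((X.T @ (g * y)) / n + lam * w)
    return w

def distributed_fit(splits, **kw):
    """One-shot averaging: each machine fits locally, estimates are averaged."""
    return np.mean([local_fit(X, y, **kw) for X, y in splits], axis=0)
```

Smoothing is what makes the distributed step well behaved: the non-differentiable hinge would make local gradients, and hence the averaged estimate, unstable near the margin.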
Motivated by a genome‐wide association study on the glomerular filtration rate, we develop a new robust test for longitudinal data to detect the effects of biomarkers in high‐dimensional quantile regression, in the presence of prespecified control variables. The test is based on the sum of score‐type statistics deduced from conditional quantile regression. The test statistic is constructed in a working‐independent manner, but the calibration reflects the intrinsic within‐subject correlation. Therefore, the test takes advantage of the feature of longitudinal data and provides more information than tests based on only one measurement per subject. Asymptotic properties of the proposed test statistic are established under both the null and local alternative hypotheses. Simulation studies show that the proposed test controls the family‐wise error rate well while providing competitive power. The proposed method is applied to the motivating glomerular filtration rate data to test the overall significance of a large number of candidate single‐nucleotide polymorphisms that are possibly associated with Type 1 diabetes, conditioning on the patients' demographics.
"Score‐based test in high‐dimensional quantile regression for longitudinal data with application to a glomerular filtration rate data" by Yinfeng Wang, H. Wang and Yanlin Tang. Stat, 2023-08-14. https://doi.org/10.1002/sta4.610
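The core of a sum-of-scores quantile test can be sketched in a few lines. This is a cross-sectional simplification with an intercept-only null fit; the paper's longitudinal calibration, within-subject correlation adjustment and high-dimensional control-variable fit are not reproduced, and the function name is hypothetical.

```python
import numpy as np

def quantile_score_stat(y, Z, tau=0.5):
    """Sum-of-squares score-type statistic for testing that the candidate
    biomarkers Z have no effect at quantile tau. The null model here is
    intercept-only (the sample quantile); a real analysis would first fit
    the prespecified control variables."""
    q = np.quantile(y, tau)
    psi = tau - (y < q)                    # quantile score of each residual
    scores = Z.T @ psi / np.sqrt(len(y))   # one score per biomarker
    return float(np.sum(scores ** 2))
```

Under the null each standardized score behaves like a mean-zero variable with variance near tau(1 - tau), so the statistic stays small; a biomarker that shifts the conditional quantile inflates its score.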
For sensitivity analysis with stochastic counterfactuals, we introduce a methodology to characterize uncertainty in causal inference from natural experiments. Our sensitivity parameters are standardized measures of variation in propensity and prognosis probabilities, and one minus their geometric mean is an intuitive measure of randomness in the data generating process. Within our latent propensity‐prognosis model, we show how to compute, from contingency table data, a threshold of sufficient randomness for causal inference. If the actual randomness of the data generating process is greater than this threshold, then causal inference is warranted. We demonstrate our methodology with two example applications.
"An asymptotic threshold of sufficient randomness for causal inference" by B. Knaeble, B. Osting and P. Tshiaba. Stat, 2023-08-01. https://doi.org/10.1002/sta4.609
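The "one minus the geometric mean" randomness measure and the threshold comparison can be expressed directly. The paper's computation of the threshold itself from contingency table data is not reproduced here; the function names and the assumption that both variation measures lie in [0, 1] are illustrative.

```python
import math

def randomness_measure(v_propensity, v_prognosis):
    """One minus the geometric mean of the two standardized variation
    measures (each assumed to lie in [0, 1]); values near 1 indicate a
    highly random data-generating process, values near 0 a nearly
    deterministic one."""
    return 1.0 - math.sqrt(v_propensity * v_prognosis)

def inference_warranted(v_propensity, v_prognosis, threshold):
    # causal inference is warranted when the actual randomness exceeds
    # the sufficiency threshold computed from the contingency table
    return randomness_measure(v_propensity, v_prognosis) > threshold
```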
Deep neural network (DNN) models have achieved state‐of‐the‐art predictive accuracy in a wide range of applications. However, it remains a challenging task to accurately quantify the uncertainty in DNN predictions, especially those of continuous outcomes. To this end, we propose the Bayesian deep noise neural network (B‐DeepNoise), which generalizes standard Bayesian DNNs by extending the random noise variable from the output layer to all hidden layers. Our model is capable of approximating highly complex predictive density functions and fully learning the possible random variation in the outcome variables. For posterior computation, we provide a closed‐form Gibbs sampling algorithm that circumvents tuning‐intensive Metropolis–Hastings methods. We establish a recursive representation of the predictive density and perform theoretical analysis on the predictive variance. Through extensive experiments, we demonstrate the superiority of B‐DeepNoise over existing methods in terms of density estimation and uncertainty quantification accuracy. A neuroimaging application is included to show our model's usefulness in scientific studies.
"Density regression and uncertainty quantification with Bayesian deep noise neural networks" by Daiwei Zhang, Tianci Liu and Jian Kang. Stat, 2023-08-01. https://doi.org/10.1002/sta4.604
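The key idea, injecting noise at every layer rather than only at the output, can be sketched with a stochastic forward pass. This is a generic illustration, not the authors' exact B-DeepNoise parameterization or their Gibbs sampler; the architecture, noise placement and scales are assumptions.

```python
import numpy as np

def deep_noise_forward(x, weights, biases, noise_scales, rng):
    """One stochastic forward pass with Gaussian noise injected after
    every layer, hidden and output alike, so that repeated passes draw
    from a flexible predictive density rather than a point prediction."""
    h = x
    for l, (W, b) in enumerate(zip(weights, biases)):
        h = W @ h + b
        if l < len(weights) - 1:
            h = np.maximum(h, 0.0)           # ReLU on hidden layers
        h = h + rng.normal(0.0, noise_scales[l], size=h.shape)
    return h

def predictive_samples(x, weights, biases, noise_scales, n=2000, seed=0):
    """Monte Carlo draws from the implied predictive density at input x."""
    rng = np.random.default_rng(seed)
    return np.array([deep_noise_forward(x, weights, biases, noise_scales, rng)
                     for _ in range(n)])
```

Because noise passes through subsequent nonlinear layers, the resulting predictive density need not be Gaussian, which is what lets the model capture complex outcome variation.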
Although there is a huge literature on feature selection for the Cox model, none of the existing approaches can control the false discovery rate (FDR) unless the sample size tends to infinity. In addition, there is no formal power analysis of the knockoffs framework for survival data in the literature. To address these issues, in this paper, we propose a novel controlled feature selection approach using knockoffs for the Cox model. We establish that the proposed method enjoys FDR control in finite samples regardless of the number of covariates. Moreover, under mild regularity conditions, we also show that the power of our method is asymptotically one as the sample size tends to infinity. To the best of our knowledge, this is the first formal theoretical result on the power of the knockoffs procedure in the survival setting. Simulation studies confirm that our method has appealing finite-sample performance with desired FDR control and high power. We further demonstrate the performance of our method through a real data example.
"CoxKnockoff: Controlled feature selection for the Cox model using knockoffs" by Daoji Li, Jinzhao Yu and Hui Zhao. Stat, 2023-07-31. https://doi.org/10.1002/sta4.607
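The finite-sample FDR guarantee comes from the selection step of the knockoff filter, which can be sketched given feature statistics W (large positive W favours a real feature over its knockoff). This shows only the generic knockoff+ thresholding rule; the construction of knockoff variables and the Cox-specific statistics are not shown, and the function names are illustrative.

```python
import numpy as np

def knockoff_threshold(W, q=0.1, offset=1):
    """Knockoff+ data-dependent threshold: the smallest t > 0 such that
    (offset + #{j: W_j <= -t}) / max(1, #{j: W_j >= t}) <= q."""
    for t in np.sort(np.abs(W[W != 0])):
        fdp_hat = (offset + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp_hat <= q:
            return t
    return np.inf

def knockoff_select(W, q=0.1):
    """Indices of features whose statistic clears the threshold."""
    return np.where(W >= knockoff_threshold(W, q))[0]
```

The ratio inside the threshold is an estimate of the false discovery proportion, using the symmetry of null W statistics around zero; no asymptotic argument is needed, which is why the FDR control holds in finite samples.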
"A trinomial difference autoregressive model and its applications" by Huaping Chen, Jiayue Zhang and Fukang Zhu. Stat, 2023-07-31. https://doi.org/10.1002/sta4.596
"Dirichlet process mixture models using matrix‐generalized half‐t distribution" by Sanghyun Lee and C. Kim. Stat, 2023-07-18. https://doi.org/10.1002/sta4.599
In block designs, the responses of plots are potentially influenced by the treatments of neighbouring plots and by the surrounding environment. Many researchers place two guard plots next to the edge plots and apply certain treatments to them to control these environmental effects. Thus, a design is presented as a collection of treatment sequences. For the estimation of total effects, existing results consider circular designs, whose constraints are unnecessary in common applications. In this paper, we construct optimal or highly efficient non-circular designs under interference models. It is observed that the optimal non-circular designs for the total effects outperform the optimal circular designs in many instances. In fact, a design containing a circular sequence cannot be optimal for