Statistical Methodology: Latest Articles

Change detection for uncertain autoregressive dynamic models through nonparametric estimation
Q Mathematics Pub Date : 2016-12-01 DOI: 10.1016/j.stamet.2016.08.003
Nadine Hilgert, Ghislain Verdier, Jean-Pierre Vila

A new statistical approach for on-line change detection in uncertain dynamic systems is proposed. In a change detection problem, the distribution of a sequence of observations can change at some unknown instant. The goal is to detect this change, for example a parameter change, as quickly as possible with a minimal risk of false detection. In this paper, the observations come from an uncertain system modeled by an autoregressive model containing an unknown functional component. The popular Page’s CUSUM rule is no longer applicable since it requires full knowledge of the model. A new CUSUM-like detection scheme is proposed, which is based on the nonparametric estimation of the unknown component from a learning sample. Moreover, the estimation procedure can be updated on-line, which ensures better detection, especially at the beginning of the monitoring procedure. Simulation trials were performed on a model describing a water treatment process and show the benefit of this new procedure over the classic CUSUM rule.
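For orientation, the classic Page's CUSUM rule that the abstract takes as its baseline can be sketched as follows. This is a minimal illustration, not the authors' nonparametric scheme: it assumes independent Gaussian observations whose mean shifts from a known `mu0` to a known `mu1` with known `sigma`, which is exactly the full model knowledge the abstract says is unavailable in the uncertain setting. The function name and parameters are hypothetical.

```python
def cusum_alarm(xs, mu0, mu1, sigma, threshold):
    """Return the index of the first CUSUM alarm, or None if no alarm is raised.

    Assumes i.i.d. Gaussian observations with known pre-change mean mu0,
    known post-change mean mu1 and known standard deviation sigma.
    """
    g = 0.0  # CUSUM statistic, reset to zero whenever it would go negative
    for t, x in enumerate(xs):
        # Log-likelihood ratio of N(mu1, sigma^2) against N(mu0, sigma^2) for x
        llr = (mu1 - mu0) / sigma ** 2 * (x - (mu0 + mu1) / 2.0)
        g = max(0.0, g + llr)
        if g > threshold:
            return t
    return None
```

Raising `threshold` lowers the false alarm risk at the cost of a longer detection delay, which is the trade-off the abstract describes.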

Citations: 4
A novel power-based approach to Gaussian kernel selection in the kernel-based association test
Q Mathematics Pub Date : 2016-12-01 DOI: 10.1016/j.stamet.2016.09.003
Xiang Zhan, Debashis Ghosh

The kernel-based association test (KAT) is a widely used tool in genetic association analysis. The performance of such a test depends on the choice of kernel. In this paper, we study the statistical power of a KAT using a Gaussian kernel. We explicitly develop a notion of analytical power function in this family of tests. We propose a novel approach to select the kernel so as to maximize the analytical power function of the test at a given test level (an upper bound on the probability of making a type I error). We assess some theoretical properties of our optimal estimator, and compare its performance with some similar existing alternatives using simulation studies. Neuroimaging data from an Alzheimer’s disease study are also used to illustrate the proposed kernel selection methodology.

Citations: 1
A generalized inverse trinomial distribution with application
Q Mathematics Pub Date : 2016-12-01 DOI: 10.1016/j.stamet.2016.10.001
Shin Zhu Sim, Seng Huat Ong

This paper considers a particular generalized inverse trinomial distribution which may be regarded as the convolution of binomial and negative binomial distributions for the statistical analysis of count data. This distribution has the flexibility to cater for under-, equi- and over-dispersion in the data. Some basic and probabilistic properties and tail approximation of the distribution have been derived. Conditions for the numerical stability of the two-term probability recurrence formula have also been examined to facilitate computation. For the purpose of statistical analysis, test of hypothesis for equi-dispersion by the score and likelihood ratio tests and simulation study of their power, parameter estimation by maximum likelihood and a probability generating function based method have been considered. The versatility of the distribution is illustrated by its application to real biological data sets which exhibit under- and over-dispersion. It is shown that the distribution fits better than the well-known generalized Poisson and COM-Poisson distributions.
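As a quick illustration of what under-, equi- and over-dispersion mean for count data, the variance-to-mean ratio can be computed as below. This is a generic diagnostic sketched under the usual convention that a ratio of 1 corresponds to the Poisson (equi-dispersed) case; it is not the score or likelihood ratio test studied in the paper, and the function name is hypothetical.

```python
import statistics


def dispersion_index(counts):
    """Return the sample variance divided by the sample mean of count data.

    Values below 1 suggest under-dispersion, values near 1 equi-dispersion
    (as for a Poisson distribution), and values above 1 over-dispersion.
    """
    mean = statistics.mean(counts)
    if mean == 0:
        raise ValueError("dispersion index is undefined when the mean is zero")
    # statistics.variance is the unbiased sample variance (divides by n - 1)
    return statistics.variance(counts) / mean
```

A point estimate like this only describes a sample; deciding whether the departure from 1 is significant requires a formal test such as those the abstract mentions.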

Citations: 6
Non-parametric Bayesian inference for continuous density hidden Markov mixture model
Q Mathematics Pub Date : 2016-12-01 DOI: 10.1016/j.stamet.2016.10.003
Najmeh Bathaee, Hamid Sheikhzadeh

In this paper, we present a non-parametric continuous density Hidden Markov mixture model (CDHMMix model) with an unknown number of mixtures for blind segmentation or clustering of sequences. In our presented model, the emission distributions of HMMs are chosen to be Gaussian with full, diagonal, or tridiagonal covariance matrices. We apply a Bayesian approach to train our presented model and carry out inference using the Markov chain Monte Carlo (MCMC) method. For the multivariate Gaussian emission, a method that maintains the tridiagonal structure of the covariance is introduced. Moreover, we present a new sampling method for hidden state sequences of HMMs based on the Viterbi algorithm that increases the mixing rate.

Citations: 2
Estimation and goodness-of-fit in latent trait models: A comparison among theoretical approaches
Q Mathematics Pub Date : 2016-12-01 DOI: 10.1016/j.stamet.2016.05.002
Juan Carlos Bustamante, Edixon Chacón

Two theoretical approaches are usually employed for the fitting of ordinal data: the underlying variables approach (UV) and the item response theory (IRT). In the UV approach, limited information methods [generalized least squares (GLS) and weighted least squares (WLS)] are employed. In the IRT approach, fitting is carried out with full information methods [Proportional Odds Model (POM), and the Normal Ogive (NOR)]. The four estimation methods (GLS, WLS, POM and NOR) are compared side by side in this article, using a simulation study and analyzing the goodness-of-fit indices obtained. The parameters used in the Monte Carlo simulation arise from the application of a political action scale whose two-factor structure is well known. The results show that the estimation method employed affects the goodness-of-fit to the model. In our case, the IRT approach shows a better fit than UV, especially with the POM method.

Citations: 1
Some new results on the Rényi quantile entropy Ordering
Q Mathematics Pub Date : 2016-12-01 DOI: 10.1016/j.stamet.2016.04.003
Lei Yan, Dian-tong Kang

Rényi (1961) proposed the Rényi entropy. Ebrahimi and Pellerey (1995) and Ebrahimi (1996) proposed the residual entropy. Recently, Nanda et al. (2014) obtained a quantile version of the Rényi residual entropy, the Rényi residual quantile entropy (RRQE). Based on the RRQE function, they defined a new stochastic order, the Rényi quantile entropy (RQE) order, and studied some properties of this order. In this paper, we focus on further properties of this new order. Some characterizations of the RQE order are investigated, closure and reversed closure properties are obtained, and some illustrative examples are shown. As an application of a main result, the preservation of the RQE order in several stochastic models is discussed.
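For readers unfamiliar with the starting point of this line of work, the Rényi entropy of order alpha of a discrete distribution, log(sum of p_i^alpha) / (1 - alpha), can be sketched as below. This is only the basic discrete definition, assumed here for illustration; the residual and quantile-based versions (RRQE, RQE order) discussed in the abstract are not implemented, and the function name is hypothetical.

```python
import math


def renyi_entropy(probs, alpha):
    """Return the Rényi entropy of order alpha (natural log) of a discrete distribution.

    probs is assumed to be a sequence of non-negative probabilities summing to 1.
    The order alpha must be positive and different from 1.
    """
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    # Zero-probability outcomes contribute nothing to the sum
    power_sum = sum(p ** alpha for p in probs if p > 0)
    return math.log(power_sum) / (1.0 - alpha)
```

For a uniform distribution the value equals the logarithm of the number of outcomes for every admissible alpha, which makes a convenient sanity check.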

Citations: 7
Forward selection and estimation in high dimensional single index models
Q Mathematics Pub Date : 2016-12-01 DOI: 10.1016/j.stamet.2016.09.002
Shikai Luo, Subhashis Ghosal

We propose a new variable selection and estimation technique for high dimensional single index models with an unknown monotone smooth link function. Among many predictors, typically, only a small fraction of them have a significant impact on prediction. In such a situation, more interpretable models with better prediction accuracy can be obtained by variable selection. In this article, we propose a new penalized forward selection technique which can reduce high dimensional optimization problems to several one dimensional optimization problems by choosing the best predictor and then iterating the selection steps until convergence. The advantage of optimizing in one dimension is that the location of the optimum solution can be obtained with an intelligent search by exploiting smoothness of the criterion function. Moreover, these one dimensional optimization problems can be solved in parallel to reduce computing time nearly to the level of the one-predictor problem. Numerical comparison with the LASSO and the shrinkage sliced inverse regression shows very promising performance of our proposed method.

Citations: 25
Symmetric directional false discovery rate control
Q Mathematics Pub Date : 2016-12-01 DOI: 10.1016/j.stamet.2016.08.002
Sarah E. Holte, Eva K. Lee, Yajun Mei

This research is motivated by the analysis of a real gene expression data set that aims to identify a subset of “interesting” or “significant” genes for further studies. When we blindly applied the standard false discovery rate (FDR) methods, our biology collaborators were suspicious or confused, as the selected list of significant genes was highly unbalanced: there were ten times more under-expressed genes than over-expressed genes. Their concerns led us to realize that the observed two-sample t-statistics were highly skewed and asymmetric, and thus the standard FDR methods might be inappropriate. To tackle this case, we propose a symmetric directional FDR control method that categorizes the genes into “over-expressed” and “under-expressed” genes, pairs “over-expressed” and “under-expressed” genes, defines the p-values for gene pairs via column permutations, and then applies the standard FDR method to select “significant” gene pairs instead of “significant” individual genes. We compare our proposed symmetric directional FDR method with the standard FDR method by applying them to simulated data and several well-known real data sets.
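The abstract does not say which procedure it means by the standard FDR method; assuming it is the Benjamini-Hochberg step-up procedure, a minimal sketch looks like this. It covers only that baseline applied to a list of p-values, not the gene pairing or the permutation-based p-values of the proposed symmetric directional method. The function name and the default level `q` are hypothetical.

```python
def benjamini_hochberg(pvalues, q=0.05):
    """Return the sorted indices of hypotheses rejected at FDR level q.

    Implements the Benjamini-Hochberg step-up rule: find the largest rank k
    such that the k-th smallest p-value is at most k * q / m, then reject
    the k hypotheses with the smallest p-values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p-value
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * q / m:
            k = rank  # keep the largest rank that satisfies the bound
    return sorted(order[:k])
```

In the paper's setting the same rule would be applied to p-values of gene pairs rather than individual genes.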

Citations: 1
Nonparametric M-estimation for right censored regression model with stationary ergodic data
Q Mathematics Pub Date : 2016-12-01 DOI: 10.1016/j.stamet.2016.10.002
Mohamed Chaouch, Naâmane Laïb, Elias Ould Saïd

The present paper deals with nonparametric M-estimation for a right censored regression model with stationary ergodic data. Defined as an implicit function, a kernel-type estimator of a family of robust regression is considered when the covariate takes its values in R^d (d ≥ 1) and the data are sampled from a stationary ergodic process. The strong consistency (with rate) and the asymptotic distribution of the estimator are established under mild assumptions. Moreover, a usable confidence interval is provided which does not depend on any unknown quantity. Our results hold without any mixing condition and do not require the existence of marginal densities. A comparison study based on simulated data is also provided.

Citations: 7
Discrete time software reliability modeling with periodic debugging schedule
Q Mathematics Pub Date : 2016-12-01 DOI: 10.1016/j.stamet.2016.08.006
Sudipta Das, Anup Dewanji, Debasis Sengupta

In many situations, multiple copies of a piece of software are tested in parallel with different test cases as input, and the detected errors from a particular round of testing are debugged together. In this article, we discuss a discrete time model of software reliability for such a scenario of periodic debugging. We propose likelihood based inference of the model parameters, including the initial number of errors, under the assumption that all errors are equally likely to be detected. The proposed method is used to estimate the reliability of the software. We establish asymptotic normality of the estimated model parameters. The performance of the proposed method is evaluated through a simulation study and its use is illustrated through the analysis of a dataset obtained from testing of real-time flight control software. We also consider a more general model, in which different errors have different probabilities of detection.

Citations: 2