Consistency of the maximum likelihood estimator of population tree in a coalescent framework
Pub Date: 2024-04-21 | DOI: 10.1016/j.jspi.2024.106172
Arindam RoyChoudhury
We present a proof of consistency of the maximum likelihood estimator (MLE) of the population tree in a previously proposed coalescent model. As the model involves the tree-topology as a parameter, the standard proof of consistency for continuous parameters does not directly apply. In addition to proving that a consistent sequence of MLEs exists, we prove that the overall MLE, computed by maximizing the likelihood over all tree-topologies, is also consistent. Thus, the MLE of the tree-topology is consistent as well. The last result is important because local maxima occur in the likelihood of population trees, especially when the likelihood is maximized separately for each tree-topology. Even though the MLE is known to be a dependable estimator under this model, our work establishes its effectiveness with mathematical certainty.
{"title":"Consistency of the maximum likelihood estimator of population tree in a coalescent framework","authors":"Arindam RoyChoudhury","doi":"10.1016/j.jspi.2024.106172","DOIUrl":"https://doi.org/10.1016/j.jspi.2024.106172","url":null,"abstract":"<div><p>We present a proof of consistency of the maximum likelihood estimator (MLE) of population tree in a previously proposed coalescent model. As the model involves tree-topology as a parameter, the standard proof of consistency for continuous parameters does not directly apply. In addition to proving that a consistent sequence of MLE exists, we also prove that the overall MLE, computed by maximizing the likelihood over all tree-topologies, is also consistent. Thus, the MLE of tree-topology is consistent as well. The last result is important because local maxima occur in the likelihood of population trees, especially while maximizing the likelihood separately for each tree-topology. Even though MLE is known to be a dependable estimator under this model, our work proves its effectiveness with mathematical certainty.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"233 ","pages":"Article 106172"},"PeriodicalIF":0.9,"publicationDate":"2024-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140632639","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Augmented projection Wasserstein distances: Multi-dimensional projection with neural surface
Pub Date: 2024-04-19 | DOI: 10.1016/j.jspi.2024.106185
Miyu Sugimoto , Ryo Okano , Masaaki Imaizumi
The Wasserstein distance is a fundamental tool for comparing probability distributions and has found broad applications in various fields, including image generation with generative adversarial networks. Despite its useful properties, the performance of the Wasserstein distance degrades when the data are high-dimensional, a phenomenon known as the curse of dimensionality. To mitigate this issue, extensions of the Wasserstein distance have been developed, such as the sliced Wasserstein distance based on one-dimensional projections. However, such extensions lose information about the original data because of the linear projection onto a one-dimensional space. In this paper, we propose novel distances named augmented projection Wasserstein distances (APWDs) to address these issues; they use a multi-dimensional projection combined with a nonlinear surface given by a neural network. The APWDs employ a two-step procedure: the data are first mapped onto a nonlinear surface by a neural network and then linearly projected into a multi-dimensional space. We also give an algorithm to select a subspace for the multi-dimensional projection. The APWDs are computationally efficient while preserving nonlinear information in the data. We theoretically confirm that the APWDs mitigate the curse of dimensionality. Our experiments demonstrate the APWDs’ outstanding performance and robustness to noise, particularly for nonlinear high-dimensional data.
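To make the two-step construction concrete, the following is a minimal sketch in the spirit of the abstract: both samples are lifted through the same nonlinear map and then linearly projected before an optimal-transport distance is computed. The fixed random ReLU layer standing in for the trained neural surface, the randomly drawn projection subspace, and the function names are illustrative assumptions, not the authors' algorithm (which in particular selects the subspace rather than drawing it at random).

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def apwd_sketch(x, y, k=3, width=64, seed=0):
    """Illustrative two-step distance: (1) lift both samples through the same
    fixed random one-hidden-layer ReLU map (a stand-in for a trained neural
    surface), (2) apply a shared random k-dimensional linear projection, then
    (3) compute the exact empirical 2-Wasserstein distance between the
    projected samples via optimal matching (requires equal sample sizes)."""
    rng = np.random.default_rng(seed)
    d = x.shape[1]
    W = rng.normal(size=(d, width)) / np.sqrt(d)      # shared random weights
    b = rng.normal(size=width)
    A = rng.normal(size=(width, k)) / np.sqrt(width)  # shared linear projection
    lift = lambda z: np.maximum(z @ W + b, 0.0)       # nonlinear "surface"
    px, py = lift(x) @ A, lift(y) @ A                 # multi-dimensional projection
    cost = ((px[:, None, :] - py[None, :, :]) ** 2).sum(-1)
    rows, cols = linear_sum_assignment(cost)          # exact optimal matching
    return np.sqrt(cost[rows, cols].mean())

x = np.random.default_rng(1).normal(size=(200, 10))
y = np.random.default_rng(2).normal(loc=0.5, size=(200, 10))
print(apwd_sketch(x, y))
```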
{"title":"Augmented projection Wasserstein distances: Multi-dimensional projection with neural surface","authors":"Miyu Sugimoto , Ryo Okano , Masaaki Imaizumi","doi":"10.1016/j.jspi.2024.106185","DOIUrl":"https://doi.org/10.1016/j.jspi.2024.106185","url":null,"abstract":"<div><p>The Wasserstein distance is a fundamental tool for comparing probability distributions and has found broad applications in various fields, including image generation using generative adversarial networks. Despite its useful properties, the performance of the Wasserstein distance decreases when data is high-dimensional, known as the curse of dimensionality. To mitigate this issue, an extension of the Wasserstein distance has been developed, such as the sliced Wasserstein distance using one-dimensional projection. However, such an extension loses information on the original data, due to the linear projection onto the one-dimensional space. In this paper, we propose novel distances named augmented projection Wasserstein distances (APWDs) to address these issues, which utilize multi-dimensional projection with a nonlinear surface by a neural network. The APWDs employ a two-step procedure; it first maps data onto a nonlinear surface by a neural network, then linearly projects the mapped data into a multidimensional space. We also give an algorithm to select a subspace for the multi-dimensional projection. The APWDs are computationally effective while preserving nonlinear information of data. We theoretically confirm that the APWDs mitigate the curse of dimensionality from data. Our experiments demonstrate the APWDs’ outstanding performance and robustness to noise, particularly in the context of nonlinear high-dimensional data.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"233 ","pages":"Article 106185"},"PeriodicalIF":0.9,"publicationDate":"2024-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0378375824000429/pdfft?md5=d9eef2f8ec0fb76099ca4281dc2a0b63&pid=1-s2.0-S0378375824000429-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140632638","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Distributed optimal subsampling for quantile regression with massive data
Pub Date: 2024-04-18 | DOI: 10.1016/j.jspi.2024.106186
Yue Chao, Xuejun Ma, Boya Zhu
Methods for reducing subsample sizes in distributed settings have become increasingly popular statistical problems in the big data era. Optimal subsample selection for massive linear and generalized linear models with distributed data sources has been thoroughly investigated and widely applied. Nevertheless, few studies have developed distributed optimal subsample selection procedures for quantile regression with massive data. In such settings, the distributed optimal subsampling probabilities and the subset-size selection criteria need to be established simultaneously. In this work, we propose a distributed subsampling technique for quantile regression models. The estimation approach is based on a two-step algorithm for the distributed subsampling procedure. Furthermore, theoretical results, such as consistency and asymptotic normality of the resulting estimators, are rigorously established under some regularity conditions. The empirical performance of the proposed subsampling method is evaluated through simulation experiments and real-data applications.
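As a rough illustration of the two-step flavor described above (on a single machine, omitting the distributed aggregation), the sketch below fits a pilot quantile regression on a small uniform subsample, forms subsampling probabilities from a residual-and-leverage heuristic, and refits with inverse-probability weights; the probability formula and helper names are stand-ins, not the paper's optimal subsampling criterion.

```python
import numpy as np
from scipy.optimize import linprog

def weighted_quantile_reg(X, y, tau, w):
    """Weighted quantile regression via its linear-programming form:
    minimize sum_i w_i * check_loss_tau(y_i - x_i' beta)."""
    n, p = X.shape
    c = np.concatenate([np.zeros(p), tau * w, (1 - tau) * w])
    A_eq = np.hstack([X, np.eye(n), -np.eye(n)])
    bounds = [(None, None)] * p + [(0, None)] * (2 * n)
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    return res.x[:p]

def two_step_subsample_qr(X, y, tau=0.5, r0=200, r=500, seed=0):
    """Illustrative two-step subsampling for quantile regression:
    (1) pilot estimate from a small uniform subsample,
    (2) resample with pilot-based probabilities and refit with
    inverse-probability weights. The probability formula is a heuristic
    stand-in, not the paper's optimal criterion."""
    rng = np.random.default_rng(seed)
    n = len(y)
    idx0 = rng.choice(n, size=r0, replace=False)
    beta0 = weighted_quantile_reg(X[idx0], y[idx0], tau, np.ones(r0))
    score = np.abs(tau - (y <= X @ beta0)) * np.linalg.norm(X, axis=1)
    prob = score / score.sum()
    idx = rng.choice(n, size=r, replace=True, p=prob)
    w = 1.0 / (n * prob[idx])                         # inverse-probability weights
    return weighted_quantile_reg(X[idx], y[idx], tau, w)

rng = np.random.default_rng(1)
X = np.hstack([np.ones((5000, 1)), rng.normal(size=(5000, 2))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.standard_t(df=3, size=5000)
print(two_step_subsample_qr(X, y, tau=0.5))
```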
{"title":"Distributed optimal subsampling for quantile regression with massive data","authors":"Yue Chao, Xuejun Ma, Boya Zhu","doi":"10.1016/j.jspi.2024.106186","DOIUrl":"https://doi.org/10.1016/j.jspi.2024.106186","url":null,"abstract":"<div><p>Methods for reducing distributed subsample sizes have increasingly become popular statistical problems in the big data era. Existing works of optimal subsample selection on the massive linear and generalized linear models with distributed data sources have been solidly investigated and widely applied. Nevertheless, few studies have developed distributed optimal subsample selection procedures for quantile regression in massive data. In such settings, the distributed optimal subsampling probabilities and subset sizes selection criteria need to be established simultaneously. In this work, we propose a distributed subsampling technique for the quantile regression models. The estimation approach is based on a two-step algorithm for the distributed subsampling procedures. Furthermore, the theoretical results, such as consistency and asymptotic normality of resultant estimators, are rigorously established under some regularity conditions. The empirical evaluation and performance of the proposed subsampling method are conducted in simulation experiments and real data applications.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"233 ","pages":"Article 106186"},"PeriodicalIF":0.9,"publicationDate":"2024-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140638708","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Entropic regularization of neural networks: Self-similar approximations
Pub Date: 2024-04-16 | DOI: 10.1016/j.jspi.2024.106181
Amir R. Asadi, Po-Ling Loh
This paper focuses on entropic regularization and its multiscale extension in neural network learning. We leverage established results that characterize the optimizer of entropic regularization methods and their connection with generalization bounds. To avoid the significant computational complexity involved in sampling from the optimal multiscale Gibbs distributions, we describe how to make measured concessions in optimality by using self-similar approximating distributions. We study such scale-invariant approximations for linear neural networks and further extend the approximations to neural networks with nonlinear activation functions. We then illustrate the application of our proposed approach through empirical simulation. By navigating the interplay between optimization and computational efficiency, our research contributes to entropic regularization theory, proposing a practical method that embraces symmetry across scales.
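The characterization alluded to above is the standard Gibbs variational principle: over distributions q on the parameters, the minimizer of E_q[loss] + (1/β)·KL(q‖prior) is the Gibbs distribution proportional to prior × exp(−β·loss). A tiny discrete illustration (grid, loss surface, and temperature chosen arbitrarily):

```python
import numpy as np

# Gibbs variational principle: over distributions q on a parameter grid,
# the minimizer of  E_q[L] + (1/beta) * KL(q || prior)  is
# q*(theta) proportional to prior(theta) * exp(-beta * L(theta)).
theta = np.linspace(-3, 3, 601)                # illustrative parameter grid
L = (theta - 1.0) ** 2                         # toy loss surface
prior = np.full_like(theta, 1.0 / theta.size)  # uniform prior
beta = 4.0                                     # inverse temperature

gibbs = prior * np.exp(-beta * L)
gibbs /= gibbs.sum()

def objective(q):
    nz = q > 0
    kl = np.sum(q[nz] * np.log(q[nz] / prior[nz]))
    return np.sum(q * L) + kl / beta

# A sharply concentrated alternative near the loss minimizer, for comparison.
alt = np.exp(-((theta - 1.0) ** 2) / 1e-3)
alt /= alt.sum()

# The Gibbs distribution attains the smallest regularized objective.
print(objective(gibbs), objective(prior), objective(alt))
```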
{"title":"Entropic regularization of neural networks: Self-similar approximations","authors":"Amir R. Asadi, Po-Ling Loh","doi":"10.1016/j.jspi.2024.106181","DOIUrl":"https://doi.org/10.1016/j.jspi.2024.106181","url":null,"abstract":"<div><p>This paper focuses on entropic regularization and its multiscale extension in neural network learning. We leverage established results that characterize the optimizer of entropic regularization methods and their connection with generalization bounds. To avoid the significant computational complexity involved in sampling from the optimal multiscale Gibbs distributions, we describe how to make measured concessions in optimality by using self-similar approximating distributions. We study such scale-invariant approximations for linear neural networks and further extend the approximations to neural networks with nonlinear activation functions. We then illustrate the application of our proposed approach through empirical simulation. By navigating the interplay between optimization and computational efficiency, our research contributes to entropic regularization theory, proposing a practical method that embraces symmetry across scales.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"233 ","pages":"Article 106181"},"PeriodicalIF":0.9,"publicationDate":"2024-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0378375824000387/pdfft?md5=fcc1f48fea9b9d957df56a1c168f3f74&pid=1-s2.0-S0378375824000387-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140643824","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multiplier subsample bootstrap for statistics of time series
Pub Date: 2024-04-15 | DOI: 10.1016/j.jspi.2024.106183
Ruru Ma, Shibin Zhang
Block-based bootstrap, block-based subsampling, and the multiplier bootstrap are three common nonparametric tools for statistical inference under dependent observations. Combining the ideas of all three, a novel resampling approach, the multiplier subsample bootstrap (MSB), is proposed. Instead of generating a resample from the observations, the MSB imitates the statistic by weighting the block-based subsample statistics with independent standard Gaussian random variables. Given the asymptotic normality of the statistic, bootstrap validity is established under some mild moment conditions. Building on the idea of the MSB, a second resampling approach, the hybrid multiplier subsampling periodogram bootstrap (HMP), is developed for mimicking frequency-domain spectral mean statistics. A simulation study demonstrates that both the MSB and the HMP achieve good performance.
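One plausible reading of the multiplier-subsample construction for an asymptotically normal statistic is sketched below: block subsample statistics are centred at the full-sample statistic, rescaled, and combined with independent standard Gaussian multipliers to imitate the sampling distribution. The exact centring and scaling used in the paper may differ, and the block length and statistic here are arbitrary choices.

```python
import numpy as np

def msb_sketch(x, stat=np.mean, block_len=25, n_boot=2000, seed=0):
    """One plausible multiplier-subsample construction: compute the statistic
    on overlapping blocks, centre at the full-sample statistic, rescale by
    sqrt(block length), and combine with i.i.d. standard normal multipliers
    to mimic the distribution of sqrt(n) * (T_n - theta)."""
    rng = np.random.default_rng(seed)
    n = len(x)
    t_full = stat(x)
    blocks = np.array([stat(x[i:i + block_len]) for i in range(n - block_len + 1)])
    dev = np.sqrt(block_len) * (blocks - t_full)     # block-level fluctuations
    g = rng.standard_normal((n_boot, len(blocks)))   # Gaussian multipliers
    return g @ dev / np.sqrt(len(blocks))            # bootstrap replicates

# AR(1)-style series: invert the replicates into a confidence interval for the mean.
rng = np.random.default_rng(1)
e = rng.normal(size=1000)
x = np.zeros(1000)
for t in range(1, 1000):
    x[t] = 0.5 * x[t - 1] + e[t]
reps = msb_sketch(x)
ci = np.mean(x) - np.percentile(reps, [97.5, 2.5]) / np.sqrt(len(x))
print("95% CI for the mean:", ci)
```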
{"title":"Multiplier subsample bootstrap for statistics of time series","authors":"Ruru Ma, Shibin Zhang","doi":"10.1016/j.jspi.2024.106183","DOIUrl":"https://doi.org/10.1016/j.jspi.2024.106183","url":null,"abstract":"<div><p>Block-based bootstrap, block-based subsampling and multiplier bootstrap are three common nonparametric tools for statistical inference under dependent observations. Combining the ideas of those three, a novel resampling approach, the multiplier subsample bootstrap (MSB), is proposed. Instead of generating a resample from the observations, the MSB imitates the statistic by weighting the block-based subsample statistics with independent standard Gaussian random variables. Given the asymptotic normality of the statistic, the bootstrap validity is established under some mild moment conditions. Involving the idea of MSB, the other resampling approach, the hybrid multiplier subsampling periodogram bootstrap (HMP), is developed for mimicking frequency-domain spectral mean statistics in the paper. A simulation study demonstrates that both the MSB and HMP achieve good performance.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"233 ","pages":"Article 106183"},"PeriodicalIF":0.9,"publicationDate":"2024-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140607310","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A unified Fourier slice method to derive ridgelet transform for a variety of depth-2 neural networks
Pub Date: 2024-04-15 | DOI: 10.1016/j.jspi.2024.106184
Sho Sonoda , Isao Ishikawa , Masahiro Ikeda
To investigate neural network parameters, it is easier to study the distribution of parameters than to study the parameters in each neuron. The ridgelet transform is a pseudo-inverse operator that maps a given function f to the parameter distribution γ so that a network NN[γ] reproduces f, i.e. NN[γ] = f. For depth-2 fully-connected networks on a Euclidean space, the ridgelet transform has been discovered up to the closed-form expression, thus we could describe how the parameters are distributed. However, for a variety of modern neural network architectures, the closed-form expression has not been known. In this paper, we explain a systematic method using Fourier expressions to derive ridgelet transforms for a variety of modern networks such as networks on finite fields F_p, group convolutional networks on abstract Hilbert space H, fully-connected networks on noncompact symmetric spaces G/K, and pooling layers, or the d-plane ridgelet transform.
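For reference, in the classical depth-2 Euclidean case that the paper generalizes, one standard formulation of the integral representation and its ridgelet transform (with activation σ and a suitable test function ρ satisfying an admissibility condition) is

$$
\mathrm{NN}[\gamma](x)=\int_{\mathbb{R}^{d}\times\mathbb{R}}\gamma(a,b)\,\sigma(a\cdot x-b)\,\mathrm{d}a\,\mathrm{d}b,
\qquad
R[f](a,b)=\int_{\mathbb{R}^{d}}f(x)\,\overline{\rho(a\cdot x-b)}\,\mathrm{d}x,
$$

with the reconstruction property $\mathrm{NN}[R[f]] = c\,f$ for a constant $c$ determined by σ and ρ; the closed-form expression referred to above is this $R$.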
Robust nonparametric regression based on deep ReLU neural networks
Pub Date: 2024-04-15 | DOI: 10.1016/j.jspi.2024.106182
Juntong Chen
In this paper, we consider robust nonparametric regression using deep neural networks with ReLU activation function. While several existing theoretically justified methods are geared towards robustness against identical heavy-tailed noise distributions, the rise of adversarial attacks has emphasized the importance of safeguarding estimation procedures against systematic contamination. We approach this statistical issue by shifting our focus towards estimating conditional distributions. To address it robustly, we introduce a novel estimation procedure based on ℓ-estimation. Under a mild model assumption, we establish general non-asymptotic risk bounds for the resulting estimators, showcasing their robustness against contamination, outliers, and model misspecification. We then delve into the application of our approach using deep ReLU neural networks. When the model is well-specified and the regression function belongs to an α-Hölder class, employing ℓ-type estimation on suitable networks enables the resulting estimators to achieve the minimax optimal rate of convergence. Additionally, we demonstrate that deep ℓ-type estimators can circumvent the curse of dimensionality by assuming the regression function closely resembles the composition of several Hölder functions. To attain this, new deep fully-connected ReLU neural networks have been designed to approximate this composition class. This approximation result can be of independent interest.
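The ℓ-estimation procedure itself is not spelled out in the abstract; as a loose stand-in for robust regression with a deep ReLU network, the sketch below simply fits a small network under the Huber loss, which down-weights large residuals. This is a generic robust baseline for comparison, not the paper's ℓ-type estimator.

```python
import math
import torch
from torch import nn

# Generic robust baseline, NOT the paper's ℓ-type estimator: a small ReLU
# network fit under the Huber loss, which is less sensitive to heavy-tailed
# noise and gross outliers than the squared loss.
torch.manual_seed(0)
x = torch.rand(500, 1)
y = torch.sin(2 * math.pi * x) + 0.1 * torch.randn(500, 1)
y[::25] += 5.0                                   # inject a few gross outliers

net = nn.Sequential(nn.Linear(1, 64), nn.ReLU(),
                    nn.Linear(64, 64), nn.ReLU(),
                    nn.Linear(64, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
loss_fn = nn.HuberLoss(delta=1.0)                # robust alternative to MSE

for _ in range(2000):
    opt.zero_grad()
    loss = loss_fn(net(x), y)
    loss.backward()
    opt.step()
print("final robust training loss:", float(loss))
```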
{"title":"Robust nonparametric regression based on deep ReLU neural networks","authors":"Juntong Chen","doi":"10.1016/j.jspi.2024.106182","DOIUrl":"https://doi.org/10.1016/j.jspi.2024.106182","url":null,"abstract":"<div><p>In this paper, we consider robust nonparametric regression using deep neural networks with ReLU activation function. While several existing theoretically justified methods are geared towards robustness against identical heavy-tailed noise distributions, the rise of adversarial attacks has emphasized the importance of safeguarding estimation procedures against systematic contamination. We approach this statistical issue by shifting our focus towards estimating conditional distributions. To address it robustly, we introduce a novel estimation procedure based on <span><math><mi>ℓ</mi></math></span>-estimation. Under a mild model assumption, we establish general non-asymptotic risk bounds for the resulting estimators, showcasing their robustness against contamination, outliers, and model misspecification. We then delve into the application of our approach using deep ReLU neural networks. When the model is well-specified and the regression function belongs to an <span><math><mi>α</mi></math></span>-Hölder class, employing <span><math><mi>ℓ</mi></math></span>-type estimation on suitable networks enables the resulting estimators to achieve the minimax optimal rate of convergence. Additionally, we demonstrate that deep <span><math><mi>ℓ</mi></math></span>-type estimators can circumvent the curse of dimensionality by assuming the regression function closely resembles the composition of several Hölder functions. To attain this, new deep fully-connected ReLU neural networks have been designed to approximate this composition class. This approximation result can be of independent interest.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"233 ","pages":"Article 106182"},"PeriodicalIF":0.9,"publicationDate":"2024-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0378375824000399/pdfft?md5=79a5bc36ebe3d6024d39b9f8adf1f910&pid=1-s2.0-S0378375824000399-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140649412","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Convergence guarantees for forward gradient descent in the linear regression model
Pub Date: 2024-04-06 | DOI: 10.1016/j.jspi.2024.106174
Thijs Bos , Johannes Schmidt-Hieber
Renewed interest in the relationship between artificial and biological neural networks motivates the study of gradient-free methods. Considering the linear regression model with random design, we theoretically analyze in this work the biologically motivated (weight-perturbed) forward gradient scheme that is based on a random linear combination of the gradient. If d denotes the number of parameters and k the number of samples, we prove that the mean squared error of this method converges for k ≳ d²log(d) with rate d²log(d)/k. Compared to the dimension dependence d for stochastic gradient descent, an additional factor d log(d) occurs.
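A small simulation of the weight-perturbed forward gradient scheme for random-design linear regression is sketched below, using the usual forward-gradient estimator (∇L·v)v with v a standard Gaussian direction; for the squared loss the directional derivative is available in closed form, so no backward pass is needed. The constant step size is an illustrative, conservative choice, not the schedule used in the paper's analysis.

```python
import numpy as np

def forward_gradient_linreg(X, y, lr=None, seed=0):
    """Weight-perturbed forward gradient descent for linear regression:
    at step i, draw v ~ N(0, I_d) and update with (grad L_i . v) v, which is
    an unbiased estimator of the gradient of the per-sample squared loss."""
    rng = np.random.default_rng(seed)
    k, d = X.shape
    theta = np.zeros(d)
    lr = lr if lr is not None else 0.5 / d**2      # conservative constant step size
    for i in range(k):
        v = rng.standard_normal(d)
        resid = y[i] - X[i] @ theta
        dir_deriv = -resid * (X[i] @ v)            # grad L_i . v, computed forward
        theta -= lr * dir_deriv * v
    return theta

rng = np.random.default_rng(1)
d, k = 10, 20000
theta_star = rng.normal(size=d)
X = rng.normal(size=(k, d))
y = X @ theta_star + 0.1 * rng.normal(size=k)
print(np.linalg.norm(forward_gradient_linreg(X, y) - theta_star))
```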
Toward improved inference for Krippendorff’s Alpha agreement coefficient
Pub Date: 2024-04-05 | DOI: 10.1016/j.jspi.2024.106170
John Hughes
In this article I recommend a better point estimator for Krippendorff’s Alpha agreement coefficient, and develop a jackknife variance estimator that leads to much better interval estimation than does the customary bootstrap procedure or an alternative bootstrap procedure. Having developed the new methodology, I analyze nominal data previously analyzed by Krippendorff, and two experimentally observed datasets: (1) ordinal data from an imaging study of congenital diaphragmatic hernia, and (2) United States Environmental Protection Agency air pollution data for the Philadelphia, Pennsylvania area. The latter two applications are novel. The proposed methodology is now supported in version 2.0 of my open source R package, krippendorffsalpha, which supports common and user-defined distance functions, and can accommodate any number of units, any number of coders, and missingness. Interval computation can be parallelized.
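The jackknife construction can be illustrated schematically as follows; `alpha_fn` is a placeholder for a Krippendorff's Alpha point estimator (such as the one in the krippendorffsalpha package), and the toy statistic supplied at the end is emphatically not Krippendorff's Alpha, only a stand-in so the code runs.

```python
import numpy as np
from scipy.stats import norm

def jackknife_ci(data, alpha_fn, level=0.95):
    """Leave-one-unit-out jackknife interval for an agreement coefficient.
    `data` is a (units x coders) array with np.nan marking missing scores;
    `alpha_fn` is a placeholder for any point estimator of the coefficient."""
    n = data.shape[0]
    point = alpha_fn(data)
    loo = np.array([alpha_fn(np.delete(data, i, axis=0)) for i in range(n)])
    var = (n - 1) / n * np.sum((loo - loo.mean()) ** 2)   # jackknife variance
    half = norm.ppf(0.5 + level / 2) * np.sqrt(var)
    return point, (point - half, point + half)

def toy_agreement(d):
    """Crude stand-in statistic (NOT Krippendorff's Alpha): the fraction of
    units on which all non-missing codes coincide."""
    return float(np.mean([len(set(row[~np.isnan(row)])) == 1 for row in d]))

ratings = np.array([[1, 1, np.nan], [2, 2, 2], [3, 3, 3], [1, 2, 1],
                    [4, 4, 4], [2, 2, 2], [1, 1, 1], [3, 3, 2]], dtype=float)
print(jackknife_ci(ratings, toy_agreement))
```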
Informed censoring: The parametric combination of data and expert information
Pub Date: 2024-04-05 | DOI: 10.1016/j.jspi.2024.106171
Hansjörg Albrecher , Martin Bladt
The statistical censoring setup is extended to the situation in which random measures can be assigned to the realization of datapoints, leading to a new way of incorporating expert information into the usual parametric estimation procedures. The asymptotic theory is provided for the resulting estimators, and some special cases of practical relevance are studied in more detail. Although the proposed framework mathematically generalizes censoring and coarsening at random, and borrows techniques from M-estimation theory, it provides a novel and transparent methodology that enjoys significant practical applicability in situations where expert information is present. The potential of the approach is illustrated by a concrete actuarial application of tail parameter estimation for a heavy-tailed MTPL dataset with limited available expert information.