
Latest Publications in Sankhya. Series B (2008)

Ratio-cum-product Type Estimators for Rare and Hidden Clustered Population.
Pub Date : 2023-01-01 DOI: 10.1007/s13571-022-00298-x
Rajesh Singh, Rohan Mishra

The use of multi-auxiliary variables helps in increasing the precision of the estimators, especially when the population is rare and hidden clustered. In this article, four ratio-cum-product type estimators have been proposed using two auxiliary variables under the adaptive cluster sampling (ACS) design. The expressions for the mean square error (MSE) of the proposed ratio-cum-product type estimators have been derived up to the first order of approximation and presented along with their efficiency conditions with respect to the existing estimators considered in this article. The efficiency of the proposed estimators over similar existing estimators has been assessed on four different populations, two of which concern the daily spread of COVID-19 cases. The proposed estimators performed better than the existing estimators considered in this article on all four populations, indicating their wide applicability and precision.
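For readers unfamiliar with this family of estimators, the sketch below computes the classical ratio-cum-product estimator of a population mean from two auxiliary variables, t = ȳ (X̄/x̄)(z̄/Z̄). It is a minimal illustration under simple random sampling with invented data, not the paper's ACS-based estimators, and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical finite population: y is the study variable, x is positively
# correlated with y (ratio component), z is negatively correlated (product component).
N = 1000
x = rng.gamma(shape=2.0, scale=5.0, size=N)
z = rng.gamma(shape=2.0, scale=5.0, size=N)
y = 3.0 * x - 0.5 * z + rng.normal(0, 2.0, size=N)

X_bar, Z_bar = x.mean(), z.mean()   # population means of the auxiliaries (assumed known)

# Simple random sample without replacement
n = 60
idx = rng.choice(N, size=n, replace=False)
y_bar, x_bar, z_bar = y[idx].mean(), x[idx].mean(), z[idx].mean()

# Classical ratio-cum-product estimator of the population mean of y
t_rp = y_bar * (X_bar / x_bar) * (z_bar / Z_bar)

print(f"sample mean           : {y_bar:8.3f}")
print(f"ratio-cum-product est.: {t_rp:8.3f}")
print(f"true population mean  : {y.mean():8.3f}")
```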

Citations: 1
Mortality Comparisons 'At a Glance': A Mortality Concentration Curve and Decomposition Analysis for India.
Pub Date : 2022-01-01 Epub Date: 2022-07-28 DOI: 10.1007/s13571-022-00293-2
John Creedy, S Subramanian

This paper uses the concept of the Mortality Concentration Curve (M-Curve), which plots the cumulative proportion of deaths against the corresponding cumulative proportion of the population (arranged in ascending order of age), and associated measures, to examine mortality experience in India. A feature of the M-curve is that it can be combined with an explicit value judgement (an aversion to early deaths) in order to make welfare-loss comparisons. Empirical comparisons over time, and between regions and genders, are made. Furthermore, in order to provide additional perspective, selective results for the UK and New Zealand are reported. It is also shown how the M-curve concept can be used to separate the contributions to overall mortality of changes over time (or differences between population groups) in the population age distribution and in age-specific mortality rates.
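As a concrete illustration, the sketch below builds an M-curve from hypothetical age-grouped counts of population and deaths: age groups are sorted in ascending order and the cumulative share of deaths is plotted against the cumulative share of population. It follows only the definition quoted in the abstract; the data are invented and the welfare-loss weighting and decomposition steps are omitted.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical age-grouped data (age group lower bounds, population, deaths)
ages       = np.array([0, 15, 30, 45, 60, 75])
population = np.array([320, 300, 280, 220, 150, 80])    # thousands
deaths     = np.array([2.1, 1.0, 1.8, 4.5, 9.0, 14.0])  # thousands

order = np.argsort(ages)                  # ascending order of age
cum_pop    = np.cumsum(population[order]) / population.sum()
cum_deaths = np.cumsum(deaths[order]) / deaths.sum()

# Prepend the origin so the curve starts at (0, 0)
cum_pop    = np.concatenate([[0.0], cum_pop])
cum_deaths = np.concatenate([[0.0], cum_deaths])

plt.plot(cum_pop, cum_deaths, marker="o", label="M-curve")
plt.plot([0, 1], [0, 1], linestyle="--", label="equality line")
plt.xlabel("cumulative share of population (by ascending age)")
plt.ylabel("cumulative share of deaths")
plt.legend()
plt.show()
```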

Citations: 2
COVID-19: Optimal Design of Serosurveys for Disease Burden Estimation.
Pub Date : 2022-01-01 Epub Date: 2021-10-19 DOI: 10.1007/s13571-021-00267-w
Siva Athreya, Giridhara R Babu, Aniruddha Iyer, Mohammed Minhaas B S, Nihesh Rathod, Sharad Shriram, Rajesh Sundaresan, Nidhin Koshy Vaidhiyan, Sarath Yasodharan

We provide a methodology by which an epidemiologist may arrive at an optimal design for a survey whose goal is to estimate the disease burden in a population. For serosurveys with a given budget of C rupees and a specified set of tests with costs, sensitivities, and specificities, we show the existence of optimal designs in four different contexts, including the well-known c-optimal design. Usefulness of the results is illustrated via numerical examples. Our results are applicable to a wide range of epidemiological surveys under the assumption that the estimate's Fisher-information matrix satisfies a uniform positive definite criterion.
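A very rough illustration of the trade-off involved: under a simple binomial model with misclassification, the per-sample Fisher information about the prevalence p from a test with sensitivity Se and specificity Sp is (Se + Sp - 1)^2 / (q(1 - q)), where q = Se·p + (1 - Sp)(1 - p) is the apparent-positive probability. The sketch below spends a hypothetical budget entirely on whichever test maximizes total information per rupee; it is a heuristic stand-in, not the paper's four optimal-design formulations, and the tests and budget are invented.

```python
import numpy as np

def fisher_info_per_sample(p, sensitivity, specificity):
    """Per-sample Fisher information about prevalence p under misclassification."""
    q = sensitivity * p + (1.0 - specificity) * (1.0 - p)  # P(test positive)
    return (sensitivity + specificity - 1.0) ** 2 / (q * (1.0 - q))

# Hypothetical tests: (name, cost in rupees, sensitivity, specificity)
tests = [
    ("ELISA-A", 300.0, 0.92, 0.98),
    ("ELISA-B", 150.0, 0.85, 0.95),
    ("rapid",    60.0, 0.70, 0.90),
]

budget = 1_000_000.0   # C rupees
p_guess = 0.10         # working value of prevalence used to evaluate the design

for name, cost, se, sp in tests:
    n = budget / cost                                   # samples affordable with this test
    total_info = n * fisher_info_per_sample(p_guess, se, sp)
    print(f"{name:8s}  samples={n:8.0f}  total Fisher info={total_info:10.1f}")
```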

Citations: 1
Poisson Counts, Square Root Transformation and Small Area Estimation: Square Root Transformation.
Pub Date : 2022-01-01 Epub Date: 2021-10-11 DOI: 10.1007/s13571-021-00269-8
Malay Ghosh, Tamal Ghosh, Masayo Y Hirose

The paper intends to serve two objectives. First, it revisits the celebrated Fay-Herriot model, but with homoscedastic known error variance. The motivation comes from an analysis of count data, in the present case COVID-19 fatalities for all counties in Florida. The Poisson model seems appropriate here, as is typical for rare events. An empirical Bayes (EB) approach is taken for estimation. However, unlike the conventional conjugate gamma or the log-normal prior for the Poisson mean, here we make a square root transformation of the original Poisson data, along with a square root transformation of the corresponding mean. Proper back transformation is used to infer about the original Poisson means. The square root transformation makes the normal approximation of the transformed data more justifiable with added homoscedasticity. We obtain exact analytical formulas for the bias and mean squared error of the proposed EB estimators. In addition to illustrating our method with the COVID-19 example, we also evaluate the performance of our procedure with simulated data.
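To make the transformation idea concrete: if y_i ~ Poisson(λ_i), then √y_i is approximately N(√λ_i, 1/4), so the transformed counts can be fed into a Fay-Herriot-type model with a known, common sampling variance of 1/4. The sketch below does a naive version of this: estimate the model variance by the method of moments, shrink toward an intercept-only mean, and back-transform by squaring. The paper derives exact bias and MSE formulas and a proper back transformation; squaring here is only a placeholder, and the simulated data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical small-area Poisson counts (e.g. fatalities in m areas)
m = 50
true_lambda = rng.gamma(shape=3.0, scale=4.0, size=m)
y = rng.poisson(true_lambda)

z = np.sqrt(y)            # variance-stabilizing transform: Var(z_i) ~ 1/4
D = 0.25                  # known sampling variance after the transform

# Intercept-only mean and a method-of-moments estimate of the model variance A
# in the Fay-Herriot-type model z_i = theta_i + e_i, theta_i ~ N(mu, A)
mu_hat = z.mean()
A_hat = max(np.var(z, ddof=1) - D, 0.0)

# EB (shrinkage) estimate of theta_i = sqrt(lambda_i), then naive back-transform
B = D / (D + A_hat)                      # shrinkage factor toward mu_hat
theta_hat = (1.0 - B) * z + B * mu_hat
lambda_hat = theta_hat ** 2              # placeholder back-transform

print("mean abs error, raw counts :", np.abs(y - true_lambda).mean().round(2))
print("mean abs error, EB estimate:", np.abs(lambda_hat - true_lambda).mean().round(2))
```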

Citations: 2
A shared spatial model for multivariate extreme-valued binary data with non-random missingness.
Pub Date : 2021-11-01 Epub Date: 2019-07-16 DOI: 10.1007/s13571-019-00198-7
Xiaoyue Zhao, Lin Zhang, Dipankar Bandyopadhyay

Clinical studies and trials on periodontal disease (PD) generate a large volume of data collected at various tooth locations of a subject. However, they present a number of statistical complexities. When our focus is on understanding the extent of extreme PD progression, standard analysis under a generalized linear mixed model framework with a symmetric (logit) link may be inappropriate, as the binary split (extreme disease versus not) may be highly skewed. In addition, PD progression is often hypothesized to be spatially referenced, i.e. proximal teeth may have a more similar PD status than those that are distally located. Furthermore, a non-ignorable quantity of missing data is observed, and the missingness is non-random, as it informs the periodontal health status of the subject. In this paper, we address all the above concerns through a shared (spatial) latent factor model, where the latent factor jointly models the extreme binary responses via a generalized extreme value regression, and the non-randomly missing teeth via a probit regression. Our approach is Bayesian, and the inferential framework is powered by within-Gibbs Hamiltonian Monte Carlo techniques. Through simulation studies and application to a real dataset on PD, we demonstrate the potential advantages of our model in terms of model fit, and in obtaining precise parameter estimates over alternatives that do not consider the aforementioned complexities.
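The key modelling ingredient for the skewed binary split is an asymmetric link. One common parameterization of the generalized extreme value (GEV) link sets P(Y = 1 | η) = 1 - exp(-(1 - ξη)_+^(-1/ξ)), which reduces to the complementary log-log link as ξ → 0; sign conventions for ξ and η vary across papers, so treat the sketch below as illustrative rather than the authors' exact specification.

```python
import numpy as np

def gev_link_prob(eta, xi):
    """P(Y = 1) under one common GEV-link parameterization.

    p = 1 - exp(-(1 - xi*eta)_+ ** (-1/xi)); as xi -> 0 this tends to the
    complementary log-log link p = 1 - exp(-exp(eta)).
    """
    eta = np.asarray(eta, dtype=float)
    if abs(xi) < 1e-8:                       # cloglog limit
        return 1.0 - np.exp(-np.exp(eta))
    base = np.maximum(1.0 - xi * eta, 0.0)   # (.)_+ truncation
    with np.errstate(divide="ignore"):
        return 1.0 - np.exp(-np.power(base, -1.0 / xi))

eta = np.linspace(-3, 3, 7)
for xi in (-0.3, 0.0, 0.3):
    # Negative xi shortens one tail, positive xi the other, unlike the logit link.
    print(f"xi={xi:+.1f} ->", np.round(gev_link_prob(eta, xi), 3))
```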

Citations: 1
Statistical Modeling of Longitudinal Data with Non-ignorable Non-monotone Missingness with Semiparametric Bayesian and Machine Learning Components.
Pub Date : 2021-05-01 Epub Date: 2020-03-09 DOI: 10.1007/s13571-019-00222-w
Yu Cao, Nitai D Mukhopadhyay

In longitudinal studies, outcomes are measured repeatedly over time and it is common that not all the patients will be measured throughout the study. For example, patients can be lost to follow-up (monotone missingness) or miss one or more visits (non-monotone missingness); hence there are missing outcomes. In the longitudinal setting, we often assume the missingness is related to the unobserved data, which is non-ignorable. Pattern-mixture models (PMM) analyze the joint distribution of outcome and patterns of missingness in longitudinal data with non-ignorable non-monotone missingness. Existing methods employ PMM and impute the unobserved outcomes using the distribution of observed outcomes, conditioned on missing patterns. We extend the existing methods using latent class analysis (LCA) and a shared-parameter PMM. The LCA groups patterns of missingness with similar features and the shared-parameter PMM allows a subset of parameters to be different between latent classes when fitting a model. We also propose a method for imputation using the distribution of observed data conditioning on latent class. Our model improves existing methods by accommodating data with small sample size. In a simulation study our estimator had smaller mean squared error than existing methods. Our methodology is applied to data from a phase II clinical trial that studies quality of life of patients with prostate cancer receiving radiation therapy.
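To illustrate the latent-class step in isolation: treating each patient's missingness pattern as a vector of binary indicators (1 = visit missing), latent class analysis amounts to fitting a mixture of independent Bernoulli distributions. The EM sketch below groups hypothetical patterns into K classes; it illustrates only the LCA grouping, not the shared-parameter pattern-mixture model or the imputation step, and all data are simulated for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical missingness indicators: n patients x T scheduled visits (1 = missing)
n, T, K = 200, 6, 3
true_profiles = np.array([[.05] * 6, [.05, .1, .2, .4, .7, .9], [.3] * 6])
z_true = rng.integers(0, K, size=n)
R = (rng.random((n, T)) < true_profiles[z_true]).astype(float)

# EM for a K-class mixture of independent Bernoullis (a basic LCA)
pi = np.full(K, 1.0 / K)                     # class weights
theta = rng.uniform(0.2, 0.8, size=(K, T))   # P(visit t missing | class k)

for _ in range(200):
    # E-step: class responsibilities for each missingness pattern
    log_post = (R @ np.log(theta).T) + ((1 - R) @ np.log(1 - theta).T) + np.log(pi)
    log_post -= log_post.max(axis=1, keepdims=True)
    resp = np.exp(log_post)
    resp /= resp.sum(axis=1, keepdims=True)
    # M-step: update weights and per-visit missingness probabilities
    Nk = resp.sum(axis=0)
    pi = Nk / n
    theta = np.clip((resp.T @ R) / Nk[:, None], 1e-4, 1 - 1e-4)

print("estimated class weights:", np.round(pi, 2))
print("estimated missingness profiles:\n", np.round(theta, 2))
```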

Citations: 0
A Two-sample Nonparametric Test for Circular Data- its Exact Distribution and Performance.
Pub Date : 2021-01-01 Epub Date: 2021-02-13 DOI: 10.1007/s13571-020-00244-9
S Rao Jammalamadaka, Stéphane Guerrier, Vasudevan Mangalam

A nonparametric test labelled the 'Rao Spacing-frequencies test' is explored and developed for testing whether two circular samples come from the same population. Its exact distribution and performance relative to comparable tests, such as the Wheeler-Watson test and the Dixon test, in small samples are discussed. Although this test statistic is shown to be asymptotically normal, as one would expect, this large-sample distribution does not provide satisfactory approximations for small to moderate samples. Exact critical values for small samples are obtained and tabulated here using combinatorial techniques, and asymptotic critical regions are assessed against these. For moderate sample sizes in between, i.e. when the samples are too large for combinatorial techniques to remain computationally feasible and yet asymptotic regions do not provide a good approximation, we provide a simple Monte Carlo procedure that gives very accurate critical values. As is well known, the large number of usual rank-based tests are not applicable in the context of circular data, since the values of such ranks depend on the arbitrary choice of origin and the sense of rotation used (clockwise or anti-clockwise). Tests that are invariant under the group of rotations depend on the data through the so-called 'spacing frequencies', the frequencies of one sample that fall in between the spacings (or gaps) made by the other. The Wheeler-Watson, Dixon, and the proposed Rao tests are of this form and are explicitly useful for circular data, but they also have the added advantage of being valid and useful for comparing any two samples on the real line. Our study and simulations establish the 'Rao spacing-frequencies test' as a desirable, and indeed preferable, test in a wide variety of contexts for comparing two circular samples, and as a viable competitor even for data on the real line. Computational help for implementing any of these tests is made available online in the "TwoCircles" R package and is part of this paper.
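To show what 'spacing frequencies' are computationally: given circular samples x (size n) and y (size m), sort x and count how many y-values fall into each of the n arcs between consecutive order statistics of x. The sketch below computes these counts and, as a stand-in statistic, uses the sum of absolute deviations |f_i - m/n|, with a Monte Carlo permutation null along the lines suggested in the abstract. The precise form of the Rao statistic and the exact tables are in the paper (and the "TwoCircles" R package), so this Python version is only illustrative.

```python
import numpy as np

def spacing_frequencies(x, y):
    """Count how many y's fall in each arc between consecutive sorted x's (circular)."""
    x = np.sort(np.mod(x, 2 * np.pi))
    y = np.mod(y, 2 * np.pi)
    # searchsorted assigns each y to an arc between consecutive x order statistics;
    # the two end bins together form the arc that wraps around past 2*pi.
    bins = np.searchsorted(x, y)
    counts = np.bincount(bins, minlength=len(x) + 1)
    counts[-1] += counts[0]              # merge the wrap-around arc
    return counts[1:]

def rao_type_statistic(x, y):
    f = spacing_frequencies(x, y)
    return np.abs(f - len(y) / len(x)).sum()   # stand-in statistic, see lead-in

def monte_carlo_pvalue(x, y, n_rep=2000, seed=0):
    rng = np.random.default_rng(seed)
    obs = rao_type_statistic(x, y)
    pooled = np.concatenate([x, y])
    null = np.empty(n_rep)
    for r in range(n_rep):
        rng.shuffle(pooled)              # permutation null: exchange labels
        null[r] = rao_type_statistic(pooled[: len(x)], pooled[len(x):])
    return (1 + np.sum(null >= obs)) / (n_rep + 1)

rng = np.random.default_rng(3)
x = rng.vonmises(mu=0.0, kappa=2.0, size=30)   # two hypothetical circular samples
y = rng.vonmises(mu=1.0, kappa=2.0, size=40)
print("Monte Carlo p-value:", monte_carlo_pvalue(x, y))
```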

Citations: 2
Clustering Patterns Connecting COVID-19 Dynamics and Human Mobility Using Optimal Transport.
Pub Date : 2021-01-01 Epub Date: 2021-03-16 DOI: 10.1007/s13571-021-00255-0
Frank Nielsen, Gautier Marti, Sumanta Ray, Saumyadipta Pyne

Social distancing and stay-at-home are among the few measures that are known to be effective in checking the spread of a pandemic such as COVID-19 in a given population. The patterns of dependency between such measures and their effects on disease incidence may vary dynamically and across different populations. We described a new computational framework to measure and compare the temporal relationships between human mobility and new cases of COVID-19 across more than 150 cities of the United States with relatively high incidence of the disease. We used a novel application of Optimal Transport for computing the distance between the normalized patterns induced by bivariate time series for each pair of cities. Thus, we identified 10 clusters of cities with similar temporal dependencies, and computed the Wasserstein barycenter to describe the overall dynamic pattern for each cluster. Finally, we used city-specific socioeconomic covariates to analyze the composition of each cluster.
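A minimal sketch of the distance computation described above, using the POT (Python Optimal Transport) package: each city is represented by a normalized point cloud of (mobility, new-cases) pairs over time, pairwise OT costs are computed with ot.emd2, and the resulting distance matrix is fed to hierarchical clustering. The normalization, cost matrix, cluster count, and simulated city series are assumptions, not the authors' exact pipeline, and the Wasserstein barycenter step is omitted.

```python
import numpy as np
import ot                                   # POT: Python Optimal Transport
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(4)

# Hypothetical data: for each city, a (T, 2) series of (mobility, new cases)
n_cities, T = 12, 90
cities = [rng.normal(loc=rng.normal(0, 1, 2), scale=1.0, size=(T, 2))
          for _ in range(n_cities)]

def normalize(series):
    """Standardize each coordinate so cities are compared on pattern, not scale."""
    return (series - series.mean(axis=0)) / series.std(axis=0)

def ot_distance(a, b):
    """Exact OT cost between two equally weighted point clouds."""
    M = ot.dist(normalize(a), normalize(b))          # squared Euclidean cost matrix
    w_a = np.full(len(a), 1.0 / len(a))
    w_b = np.full(len(b), 1.0 / len(b))
    return ot.emd2(w_a, w_b, M)

# Pairwise distance matrix, condensed for scipy's linkage
D = np.zeros((n_cities, n_cities))
for i in range(n_cities):
    for j in range(i + 1, n_cities):
        D[i, j] = D[j, i] = ot_distance(cities[i], cities[j])

condensed = D[np.triu_indices(n_cities, k=1)]
labels = fcluster(linkage(condensed, method="average"), t=4, criterion="maxclust")
print("cluster labels:", labels)
```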

Citations: 0
On the Expectation-Maximization Algorithm for Rice-Rayleigh Mixtures With Application to Noise Parameter Estimation in Magnitude MR Datasets.
Pub Date : 2013-11-01 Epub Date: 2013-01-22 DOI: 10.1007/s13571-012-0055-y
Ranjan Maitra

Magnitude magnetic resonance (MR) images are noise-contaminated measurements of the true signal, and it is important to assess the noise in many applications. A recently introduced approach models the magnitude MR datum at each voxel in terms of a mixture of up to one Rayleigh and an a priori unspecified number of Rice components, all with a common noise parameter. The Expectation-Maximization (EM) algorithm was developed for parameter estimation, with the mixing component membership of each voxel as the missing observation. This paper revisits the EM algorithm by introducing more missing observations into the estimation problem such that the complete (observed and missing parts) dataset can be modeled in terms of a regular exponential family. Both the EM algorithm and variance estimation are then fairly straightforward, without any need for potentially unstable numerical optimization methods. Compared to local neighborhood- and wavelet-based noise-parameter estimation methods, the new EM-based approach is seen to perform well not only on simulation datasets but also on physical phantom and clinical imaging data.
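For intuition about the mixture setup: a Rayleigh component (no signal) and Rice components (signal ν_k > 0) share a common noise scale σ. The E-step sketch below computes, for each voxel intensity, the posterior probability of membership in each component using scipy's rice and rayleigh densities. The paper's full EM augments additional missing data so that the complete-data model sits in a regular exponential family with closed-form updates; the component parameters and simulated intensities here are hypothetical, and only the E-step is shown.

```python
import numpy as np
from scipy.stats import rice, rayleigh

rng = np.random.default_rng(5)

# Common noise parameter and hypothetical component signals (nu = 0 is Rayleigh)
sigma = 2.0
nus = np.array([0.0, 5.0, 12.0])          # one Rayleigh + two Rice components
weights = np.array([0.3, 0.4, 0.3])

# Simulate magnitude data from the mixture
comp = rng.choice(len(nus), size=500, p=weights)
x = rice.rvs(b=np.maximum(nus[comp], 1e-12) / sigma, scale=sigma, random_state=rng)

def e_step(x, weights, nus, sigma):
    """Posterior component-membership probabilities (responsibilities)."""
    dens = np.empty((len(x), len(nus)))
    for k, nu in enumerate(nus):
        if nu == 0.0:
            dens[:, k] = rayleigh.pdf(x, scale=sigma)       # Rayleigh: no signal
        else:
            dens[:, k] = rice.pdf(x, b=nu / sigma, scale=sigma)
    num = weights * dens
    return num / num.sum(axis=1, keepdims=True)

resp = e_step(x, weights, nus, sigma)
print("average responsibilities:", np.round(resp.mean(axis=0), 3))
```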

Citations: 12