Pub Date: 2023-12-12; DOI: 10.1007/s42952-023-00246-z
Edward Kanuti Ngailo, Saralees Nadarajah
This paper introduces a novel approach for approximating misclassification probabilities in the Euclidean distance classifier when the group means exhibit a bilinear structure, as in the growth curve model first proposed by Potthoff and Roy (Biometrika 51:313–326, 1964). First, by leveraging certain statistical relationships, we establish two general results for the improved Euclidean discriminant function under both weighted and unweighted growth curve mean structures. We derive these approximations for the expected misclassification probabilities with respect to the distribution of the improved Euclidean discriminant function. We also compare the misclassification probabilities of the improved Euclidean discriminant function, the standard Euclidean discriminant function, and the linear discriminant function. Notably, when the mean structure is weighted, a larger number of repeated measurements yields better classification results with the improved and the standard Euclidean discriminant functions, because more information is acquired, whereas the linear discriminant function performs well with a smaller number of repeated measurements. Finally, we evaluate the accuracy of the suggested approximations by Monte Carlo simulations.
Title: Classification of repeated measurements using bias corrected Euclidean distance discriminant function
Journal: Journal of the Korean Statistical Society
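For readers unfamiliar with the baseline, the following is a minimal sketch of the standard two-group Euclidean distance discriminant rule only. It does not implement the bias correction or the growth curve mean structure studied in the paper, whose details are not given in the abstract. The function names and the simulated data are hypothetical.

```python
import numpy as np

def euclidean_discriminant(x, mean1, mean2):
    """Return W(x) = (x - (mean1 + mean2) / 2)^T (mean1 - mean2).

    W(x) > 0 exactly when x is closer (in Euclidean distance) to mean1.
    """
    x, mean1, mean2 = (np.asarray(v, dtype=float) for v in (x, mean1, mean2))
    return float((x - 0.5 * (mean1 + mean2)) @ (mean1 - mean2))

def classify(x, mean1, mean2):
    """Assign x to group 1 when W(x) > 0, otherwise to group 2."""
    return 1 if euclidean_discriminant(x, mean1, mean2) > 0 else 2

# Hypothetical data: 4 repeated measurements per subject, two groups.
rng = np.random.default_rng(0)
group1 = rng.normal(loc=[1.0, 2.0, 3.0, 4.0], scale=1.0, size=(30, 4))
group2 = rng.normal(loc=[1.0, 1.5, 2.0, 2.5], scale=1.0, size=(30, 4))
m1, m2 = group1.mean(axis=0), group2.mean(axis=0)  # sample group means

new_obs = np.array([1.1, 2.1, 2.9, 4.2])
print("assigned group:", classify(new_obs, m1, m2))
```

The rule ignores the covariance matrix entirely, which is what distinguishes it from the linear discriminant function mentioned in the abstract.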
Pub Date: 2023-12-03; DOI: 10.1007/s42952-023-00242-3
Young Joo Lee, Yongho Jeon
In this paper, we propose a calibrated ConCave-Convex Procedure (CCCP) for variable selection in high-dimensional functional linear models. The calibrated CCCP approach for the Smoothly Clipped Absolute Deviation (SCAD) penalty is known to produce a consistent solution path with probability converging to one in linear models. We incorporate the SCAD penalty into function-on-scalar regression models and phrase them as a type of group-penalized estimation using a basis expansion approach. We then implement the calibrated CCCP method to solve the nonconvex group-penalized problem. For the tuning procedure, we use the Extended Bayesian Information Criterion (EBIC) to ensure consistency in high-dimensional settings. In simulation studies, we compare the performance of the proposed method with two existing convex-penalized estimators in terms of variable selection consistency and prediction accuracy. Lastly, we apply the method to the gene expression dataset for sparsely estimating the time-varying effects of transcription factors on the regulation of yeast cell cycle genes.
Title: Sparse functional linear models via calibrated concave-convex procedure
Journal: Journal of the Korean Statistical Society
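As background for the penalty named in the abstract, here is a minimal sketch of the SCAD penalty function and of a group version that applies it to the Euclidean norm of each coefficient group. It is an illustration under stated assumptions, not the authors' estimator: the calibrated CCCP solver and the EBIC tuning are not shown, the helper names are hypothetical, and `a = 3.7` is the conventional default rather than a value taken from the paper.

```python
import numpy as np

def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty evaluated at |theta| (elementwise), for lam > 0 and a > 2."""
    t = np.abs(np.asarray(theta, dtype=float))
    linear = lam * t                                   # region |theta| <= lam
    quadratic = (2 * a * lam * t - t**2 - lam**2) / (2 * (a - 1))  # lam < |theta| <= a*lam
    constant = lam**2 * (a + 1) / 2                    # region |theta| > a*lam
    return np.where(t <= lam, linear, np.where(t <= a * lam, quadratic, constant))

def group_scad(coef_groups, lam, a=3.7):
    """Sum of SCAD penalties applied to the L2 norm of each coefficient group.

    In a basis-expansion setting, each group would hold the basis
    coefficients of one functional effect (an assumption of this sketch).
    """
    norms = np.array([np.linalg.norm(g) for g in coef_groups])
    return float(scad_penalty(norms, lam, a).sum())

# Hypothetical coefficient groups: one large, one small, one exactly zero.
groups = [np.array([3.0, 4.0]), np.array([0.3, 0.4]), np.zeros(2)]
print(group_scad(groups, lam=1.0))
```

Unlike a lasso-type penalty, the SCAD penalty is flat beyond `a * lam`, so large coefficient groups are not shrunk further; this is the source of the nonconvexity that a concave-convex procedure is designed to handle.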
Pub Date: 2023-12-03; DOI: 10.1007/s42952-023-00241-4
Meisam Moghimbeygi, Mousa Golalizadeh
Shape, as an intrinsic concept, can be considered a source of information in some statistical analysis contexts. For instance, one of the important topics in morphology is the study of shape changes over time. From a topological viewpoint, shape data are points on a particular manifold, so constructing a longitudinal model for shape variation is not as trivial as it might seem. Rather than using common parametric models for this task, we invoke Procrustes analysis within a nonparametric framework and propose a simple yet useful model to deal with shape changes. After recasting the problem as a nonparametric regression model, we use the weighted least squares method to estimate the related parameters. We also illustrate the implementation of the new model in simulation studies and in the analysis of two biological data sets. The proposed model outperforms the counterpart models with which it is compared.
Title: Nonparametric longitudinal regression model to analyze shape data using the Procrustes rotation
Journal: Journal of the Korean Statistical Society
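The Procrustes rotation named in the title can be illustrated with a minimal sketch of the classical orthogonal Procrustes step, which finds the orthogonal matrix that best aligns one centred landmark configuration with another. This is only the standard building block, not the authors' longitudinal regression model; the function name and the landmark coordinates are hypothetical.

```python
import numpy as np

def procrustes_rotation(X, Y):
    """Orthogonal matrix R minimising ||Y - X R||_F.

    X and Y are (landmarks x dimensions) arrays, assumed already centred.
    With X^T Y = U S V^T (singular value decomposition), the minimiser is
    R = U V^T. The result may include a reflection.
    """
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

# Hypothetical 2-D configuration with four landmarks, centred at the origin.
X = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 2.0]])
X = X - X.mean(axis=0)

# Rotate it by a known angle and recover the rotation.
angle = 0.3
R_true = np.array([[np.cos(angle), -np.sin(angle)],
                   [np.sin(angle),  np.cos(angle)]])
Y = X @ R_true

R_hat = procrustes_rotation(X, Y)
print(np.allclose(R_hat, R_true))
```

Full Procrustes analysis also removes location and scale before rotating; only the rotation step is shown here.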
Pub Date: 2023-11-19; DOI: 10.1007/s42952-023-00238-z
Tianqing Liu, Xiaohui Yuan, Liuquan Sun
The regularization approach to variable selection has been well developed for semiparametric accelerated failure time (AFT) models, where the response variable is right censored. In the presence of missing data, this approach needs to be tailored to different missing data mechanisms. In this paper, we propose a flexible and generally applicable missing data mechanism for AFT models, which covers both ignorable and nonignorable missing data assumptions. We propose weighted rank (WR) estimators and corresponding penalized estimators of the regression parameters under this mechanism. An advantage of the WR estimators and the corresponding penalized estimators is that they do not require specifying a missing data model for the proposed mechanism. The theoretical properties of the WR and corresponding penalized estimators are established. Comprehensive simulation studies and a real data application further demonstrate the merits of our approach.
Title: Variable selection for semiparametric accelerated failure time models with nonignorable missing data
Journal: Journal of the Korean Statistical Society
Pub Date: 2023-11-18; DOI: 10.1007/s42952-023-00240-5
Deru Kong, Wei Shen, Shengli Zhao, WenWu Wang
In real applications, correlated data are commonly encountered, and many techniques have been proposed to model them. However, the emphasis of these techniques has been on estimating the mean function under correlated errors, with scant attention paid to derivative estimation. In this paper, we propose locally weighted least squares regression based on different difference quotients to estimate derivatives of different orders under correlated errors. For the proposed estimators, we derive the asymptotic bias and variance under errors with different covariance structures; the estimators dramatically reduce the estimation variance compared with traditional methods. Furthermore, we establish their asymptotic normality for constructing confidence intervals. Based on the asymptotic mean integrated squared error, we provide a data-driven criterion for selecting the tuning parameters. Simulation studies show that the proposed method is more robust and efficient than four other popular methods. Finally, we illustrate the usefulness of the proposed method with a real data example.
Title: Robust and efficient derivative estimation under correlated errors
Journal: Journal of the Korean Statistical Society
Pub Date: 2023-11-14; DOI: 10.1007/s42952-023-00239-y
Semin Choi, Gunwoong Park
Title: Asymptotic bias of the $\ell_2$-regularized error variance estimator
Journal: Journal of the Korean Statistical Society
Pub Date: 2023-11-13; DOI: 10.1007/s42952-023-00235-2
Rohan D. Koshti, Kirtee K. Kamalja
Title: A review on concomitants of order statistics and its application in parameter estimation under ranked set sampling
Journal: Journal of the Korean Statistical Society
Pub Date: 2023-11-03; DOI: 10.1007/s42952-023-00232-5
A. M. Elsawah
Title: A novel doubling-tripling-threshold accepting hybrid algorithm for constructing asymmetric space-filling designs
Journal: Journal of the Korean Statistical Society
Pub Date: 2023-10-25; DOI: 10.1007/s42952-023-00234-3
Xiaohui Yuan, Yue Wang, Yiming Wang, Tianqing Liu
Title: Variable selection for single-index models based on martingale difference divergence
Journal: Journal of the Korean Statistical Society
Pub Date: 2023-10-23; DOI: 10.1007/s42952-023-00237-0
Xiaohui Yuan, Xinran Zhang, Yue Wang, Chunjie Wang
Title: Distributed smoothed rank regression with heterogeneous errors for massive data
Journal: Journal of the Korean Statistical Society