Journal of the Korean Statistical Society: Latest Articles

Asymmetric kernel density estimation for biased data
IF 0.6 | Zone 4, Mathematics | Q4 STATISTICS & PROBABILITY | Pub Date: 2024-09-05 | DOI: 10.1007/s42952-024-00280-5
Yoshihide Kakizawa

Nonparametric density estimation for nonnegative data is considered in a situation where a random sample is not directly available and the data are instead observed under length-biased sampling. Because the location-scale kernel suffers from the so-called boundary bias problem, the approach in this paper applies asymmetric kernels. Several nonparametric density estimators are proposed. The mean integrated squared error, strong consistency, and asymptotic normality of the estimators are investigated. Simulation studies and a real data analysis illustrate the estimators.
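
As a rough illustration of the idea in this abstract, the sketch below combines a gamma kernel (one common asymmetric kernel on the nonnegative half-line) with inverse-length weights 1/X_i to undo the length bias. It is an assumed, textbook-style construction and not necessarily any of the estimators proposed in the paper. The function names, the bandwidth `b`, and the simulated Gamma(3, 1) sample are hypothetical choices made only for this example.

```python
import math
import random

def gamma_kernel(t, x, b):
    """Density of Gamma(shape = x/b + 1, scale = b) evaluated at t > 0."""
    shape = x / b + 1.0
    log_pdf = ((shape - 1.0) * math.log(t) - t / b
               - shape * math.log(b) - math.lgamma(shape))
    return math.exp(log_pdf)

def length_biased_kde(x, sample, b):
    """Estimate the target density f at x >= 0 from a length-biased sample.

    Length-biased observations have density g(t) = t * f(t) / mu, so each
    kernel term is reweighted by 1 / X_i and rescaled by an estimate of mu.
    """
    n = len(sample)
    mu_hat = n / sum(1.0 / xi for xi in sample)  # harmonic mean estimates mu
    return mu_hat * sum(gamma_kernel(xi, x, b) / xi for xi in sample) / n

# Hypothetical check: the target f is Gamma(2, 1), so the length-biased
# observations follow Gamma(3, 1) and the true value is f(2) = 2 * exp(-2).
random.seed(0)
sample = [random.gammavariate(3.0, 1.0) for _ in range(4000)]
estimate = length_biased_kde(2.0, sample, b=0.05)
true_value = 2.0 * math.exp(-2.0)
```

Because the gamma kernel has support on the positive half-line, it places no mass below zero, which is the motivation for asymmetric kernels given in the abstract.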

Citations: 0
Community detection for networks based on Monte Carlo type algorithms
IF 0.6 | Zone 4, Mathematics | Q4 STATISTICS & PROBABILITY | Pub Date: 2024-08-26 | DOI: 10.1007/s42952-024-00287-y
Wei Yu

Community detection is a significant problem in network data analysis. In this paper, we implement community detection by minimizing an objective function based on the difference between the adjacency matrix and its expected value, and explain the rationale for this objective function. To solve the optimization problem, we propose a new algorithm that draws on the ideas of Markov chain Monte Carlo and low-discrepancy sequences from the field of random simulation. We introduce a new indicator to compare the performance of the methods by measuring the similarity between the true community and the estimated community. Synthetic networks and real networks are analyzed to investigate the effectiveness of the new method. The results show that the performance of the proposed method is stable in all simulated scenarios and that, in most cases, it outperforms existing methods.
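
The abstract does not spell out the objective function, so the sketch below only illustrates the general shape of such an approach under stated assumptions: the expected adjacency matrix is taken to be block-constant and estimated by the edge density of each pair of communities, and the search is a plain random local search. It is not the paper's Markov chain Monte Carlo or low-discrepancy algorithm, and the names `objective` and `random_search` are hypothetical.

```python
import random

def objective(adj, labels):
    """Squared distance between A and a block-constant estimate of E[A]."""
    n = len(adj)
    counts, edges = {}, {}
    # Count node pairs and edges for each pair of communities (i < j only).
    for i in range(n):
        for j in range(i + 1, n):
            key = tuple(sorted((labels[i], labels[j])))
            counts[key] = counts.get(key, 0) + 1
            edges[key] = edges.get(key, 0) + adj[i][j]
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            key = tuple(sorted((labels[i], labels[j])))
            p_hat = edges[key] / counts[key]  # estimated edge probability
            total += (adj[i][j] - p_hat) ** 2
    return total

def random_search(adj, k, steps, rng):
    """Relabel one random node at a time; keep moves that do not worsen the fit."""
    n = len(adj)
    labels = [rng.randrange(k) for _ in range(n)]
    best = objective(adj, labels)
    for _ in range(steps):
        i = rng.randrange(n)
        old = labels[i]
        labels[i] = rng.randrange(k)
        value = objective(adj, labels)
        if value <= best:
            best = value
        else:
            labels[i] = old  # reject the move
    return labels, best

# Toy network: two disjoint triangles, nodes 0-2 and nodes 3-5.
adj = [[0] * 6 for _ in range(6)]
for a, b in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    adj[a][b] = adj[b][a] = 1

true_labels = [0, 0, 0, 1, 1, 1]
mixed_labels = [0, 1, 0, 1, 0, 1]
found_labels, found_value = random_search(adj, k=2, steps=200, rng=random.Random(0))
```

On this toy network the planted labelling fits the adjacency matrix exactly, while the mixed labelling does not, which is the kind of contrast an objective of this type is meant to detect.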

Citations: 0
Integrated volatility estimation: the case of observed noise variables
IF 0.6 | Zone 4, Mathematics | Q4 STATISTICS & PROBABILITY | Pub Date: 2024-08-21 | DOI: 10.1007/s42952-024-00286-z
Erindi Allaj

We propose a new estimator of the integrated volatility in the presence of observed noise variables, measured, for example, by the trading volume or the bid-ask spread. We find that, under specific conditions, the proposed estimator is consistent, and that the error between the proposed estimator and the integrated volatility, adjusted for the noise effects, has the same asymptotic distribution as the realized volatility estimator under no noise effects. Finally, our results are validated by a simulation and an empirical study.
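
For orientation, the no-noise benchmark mentioned in the abstract, the realized volatility estimator, is the sum of squared log-price increments. The sketch below simulates a log-price with constant volatility and compares the realized volatility with the integrated variance. It does not implement the paper's noise-adjusted estimator, and the parameter values (`sigma`, `horizon`, `n`) are hypothetical.

```python
import math
import random

def realized_volatility(log_prices):
    """Sum of squared log-price increments over the observation grid."""
    return sum((b - a) ** 2 for a, b in zip(log_prices, log_prices[1:]))

random.seed(1)
sigma, horizon, n = 0.2, 1.0, 10_000
dt = horizon / n

# Simulate a noise-free log-price path with constant volatility sigma.
log_prices = [0.0]
for _ in range(n):
    step = sigma * math.sqrt(dt) * random.gauss(0.0, 1.0)
    log_prices.append(log_prices[-1] + step)

rv = realized_volatility(log_prices)
integrated_variance = sigma ** 2 * horizon  # sigma^2 * T for constant sigma
```

With a fine observation grid and no noise, `rv` should be close to `integrated_variance`; observation noise breaks this agreement, which is the setting the paper addresses.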

Citations: 0
Using statistical models for optimal packaging in semiconductor manufacturing processes
IF 0.6 | Zone 4, Mathematics | Q4 STATISTICS & PROBABILITY | Pub Date: 2024-08-07 | DOI: 10.1007/s42952-024-00284-1
Dongguen Kim, Heejin Kim, Yejin Kim, Minwoo Chae, Young Myoung Ko, Young-Mok Bae, Hyungsub Sim, Young Chan Oh, Keum Hwan Noh

The importance of the back-end process in semiconductor manufacturing has recently received significant attention from global manufacturers. The analysis of manufacturing data often provides crucial insights into problems inherent in the manufacturing processes. An important goal of the back-end process is to improve the yield of final products, called packages. A simple way to achieve this goal is to characterize low-quality wafers based on the analysis of manufacturing data and discard them before proceeding to the packaging step. Alternatively, this paper proposes a novel packaging method that significantly improves the package yield using statistical models scoring the quality of dies. We prove that the proposed packaging method is optimal and conduct thorough numerical experiments, showing its superiority.

Citations: 0
Generalized parametric help in Hilbertian additive regression
IF 0.6 | Zone 4, Mathematics | Q4 STATISTICS & PROBABILITY | Pub Date: 2024-07-30 | DOI: 10.1007/s42952-024-00283-2
Seung Hyun Moon, Young Kyung Lee, Byeong U. Park

This paper introduces a powerful bias reduction technique applied to local linear additive regression. The main idea is to make use of a parametric family. Existing techniques based on this idea use a parametric model that is linear in the parameter. In this paper we generalize these approaches by allowing nonlinear parametric families. We develop the methodology and theory for response variables taking values in a general separable Hilbert space. Under mild conditions, our proposed approach not only offers flexibility but also achieves bias reduction while maintaining the variance of the local linear additive regression estimators. We also provide numerical evidence that supports our approach.

Citations: 0
Objective Bayesian multiple testing for k normal populations
IF 0.6 | Zone 4, Mathematics | Q4 STATISTICS & PROBABILITY | Pub Date: 2024-07-29 | DOI: 10.1007/s42952-024-00281-4
Sang Gil Kang, Yongku Kim

This article proposes objective Bayesian multiple testing procedures for a normal model. The challenging task of considering all the configurations of true and false null hypotheses is addressed here by ordering the null hypotheses based on their Bayes factors. This approach reduces the size of the compared models for posterior search from $2^k$ to $k+1$, for $k$ null hypotheses. Furthermore, the consistency of the proposed multiple testing procedures is established and their behavior is analyzed with simulated and real examples. In addition, the proposed procedures are compared with classical and Bayesian multiple testing procedures in all the possible configurations of true and false ordered null hypotheses.
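
The reduction from 2^k to k+1 models can be pictured with a small sketch. It assumes that, once the hypotheses are ordered by their Bayes factors, only nested configurations are compared (reject none, reject the one with the weakest support, reject the two with the weakest support, and so on). The Bayes factor values and the function name are hypothetical, and the paper's actual Bayes factors and posterior search are not reproduced here.

```python
def candidate_configurations(bayes_factors):
    """Return k + 1 nested configurations of rejected null hypotheses.

    bayes_factors[i] is assumed to measure evidence in favour of null i,
    so hypotheses with smaller values are rejected first.
    """
    order = sorted(range(len(bayes_factors)), key=lambda i: bayes_factors[i])
    return [frozenset(order[:j]) for j in range(len(order) + 1)]

bayes_factors = [3.2, 0.01, 0.8, 0.05]  # hypothetical values for k = 4 nulls
candidates = candidate_configurations(bayes_factors)
k = len(bayes_factors)
all_configurations = 2 ** k  # every subset of nulls could be false
```

Here `candidates` holds 5 configurations, compared with 16 possible subsets when every combination of true and false nulls is considered.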

Citations: 0
Kernel machine in semiparametric regression with nonignorable missing responses
IF 0.6 | Zone 4, Mathematics | Q4 STATISTICS & PROBABILITY | Pub Date: 2024-07-26 | DOI: 10.1007/s42952-024-00279-y
Zhenzhen Fu, Ke Yang, Yaohua Rong, Yu Shu

Missing data is prevalent in many fields. Among all missing-data mechanisms, nonignorable missingness is the most challenging for model identification. In this paper, we propose an estimation method for semiparametric regression models with nonignorable missing responses. Specifically, we first construct a parametric model for the propensity score and apply the generalized method of moments to obtain the estimated propensity score. For nonignorable missing responses, based on the inverse probability weighting approach, we propose the penalized garrotized kernel machine method to flexibly depict the complex nonlinear relationships between the response and the predictors, allow for interactions between the predictors, and eliminate redundant variables automatically. A cyclical coordinate descent algorithm is provided to solve the corresponding optimization problems. Numerical results and a real data analysis indicate that our proposed method achieves better prediction performance than competing methods.
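
The inverse probability weighting step named in the abstract can be illustrated in its simplest form: estimating a mean when the chance of observing a response depends on the response itself. The sketch below uses a known, hypothetical propensity function. The paper instead estimates a parametric propensity model by the generalized method of moments and feeds the weights into a penalized kernel machine, neither of which is reproduced here.

```python
import math
import random

def propensity(y):
    """Hypothetical response probability that depends on the outcome itself."""
    return 1.0 / (1.0 + math.exp(-(0.5 + y)))

random.seed(2)
outcomes = [random.gauss(0.0, 1.0) for _ in range(20_000)]  # true mean is 0

# Nonignorable missingness: larger outcomes are more likely to be observed.
observed = [y for y in outcomes if random.random() < propensity(y)]

# The complete-case mean ignores the missingness mechanism.
naive_mean = sum(observed) / len(observed)

# Weight each observed outcome by the inverse of its response probability.
weights = [1.0 / propensity(y) for y in observed]
ipw_mean = sum(w * y for w, y in zip(weights, observed)) / sum(weights)
```

In this simulation the complete-case mean is pulled upward because large responses are over-represented, while the weighted mean stays near the true value of zero.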

Citations: 0
Spatial regression with multiplicative errors, and its application with LiDAR measurements
IF 0.6 | Zone 4, Mathematics | Q4 STATISTICS & PROBABILITY | Pub Date: 2024-07-23 | DOI: 10.1007/s42952-024-00282-3
Hojun You, Wei-Ying Wu, Chae Young Lim, Kyubaek Yoon, Jongeun Choi

Multiplicative errors in addition to spatially referenced observations often arise in geodetic applications, particularly with light detection and ranging (LiDAR) measurements. However, regression involving multiplicative errors remains relatively unexplored in such applications. In this regard, we present a penalized modified least squares estimator to handle the complexities of a multiplicative error structure while identifying significant variables in spatially dependent observations. The proposed estimator can also be applied to classical additive error spatial regression. By establishing asymptotic properties of the proposed estimator under increasing domain asymptotics with stochastic sampling design, we provide a rigorous foundation for its effectiveness. A comprehensive simulation study confirms the superior performance of our proposed estimator in accurately estimating and selecting parameters, outperforming existing approaches. To demonstrate its real-world applicability, we employ our proposed method, along with other alternative techniques, to estimate a rotational landslide surface using LiDAR measurements. The results highlight the efficacy and potential of our approach in tackling complex spatial regression problems involving multiplicative errors.

Citations: 0
Online debiased lasso estimation and inference for heterogenous updating regressions
IF 0.6 | Zone 4, Mathematics | Q4 STATISTICS & PROBABILITY | Pub Date: 2024-07-19 | DOI: 10.1007/s42952-024-00278-z
Yajie Mi, Lei Wang

In the era of big data, online updating problems have attracted extensive attention. In practice, the covariate set of the models may change according to the conditions of the data streams. In this paper, we propose a two-stage online debiased lasso estimation and inference method for high-dimensional heterogenous linear regression models with new variables added midway. In the first stage, a homogenization strategy is used to represent the heterogenous models by defining pseudo covariates and responses. In the second stage, we conduct the online debiased lasso estimation procedure to obtain the final estimator. Theoretically, the asymptotic normality of the heterogenous online debiased lasso estimator (HODL) is established. The finite-sample performance of the proposed estimators is studied through simulation studies and a real data example.

Citations: 0
Scale invariant and efficient estimation for groupwise scaled envelope model
IF 0.6 | Zone 4, Mathematics | Q4 STATISTICS & PROBABILITY | Pub Date: 2024-07-14 | DOI: 10.1007/s42952-024-00277-0
Jing Zhang, Zhensheng Huang

Motivated by the fact that different groups carry different group information under a heteroscedastic error structure, we propose the groupwise scaled envelope model, which is invariant to scale changes and permits distinct regression coefficients and heteroscedastic error structures across groups. It retains the ability of the scaled envelope methods to remain scale invariant and allows for both different regression coefficients and different error structures for diverse groups. Further, we present the maximum likelihood estimators and their theoretical properties, including parameter identifiability, asymptotic distribution and consistency of the groupwise scaled envelope estimator. Lastly, simulation studies and a real-data example demonstrate the advantages of the groupwise scaled envelope estimators, including a comparison with the standard model estimators, groupwise envelope estimators, scaled envelope estimators and separate scaled envelope estimators.

Citations: 0