High-dimensional, low-sample-size (HDLSS) data arise frequently in modern scientific fields such as genetic microarray analysis, medical imaging, and finance, where the number of variables can greatly exceed the number of observations. In such settings, reliable estimation of the cross-covariance structure is essential for understanding relationships between sets of variables. However, classical estimators often suffer from severe noise accumulation. To address this issue, we propose a novel thresholding estimator of the cross-covariance matrix for HDLSS settings. We first investigate the asymptotic properties of the sample cross-covariance matrix and show that it contains a large amount of noise in the high-dimensional setting, which renders it inconsistent. We then develop a new thresholding estimator based on the automatic sparse estimation methodology and show that it is consistent under mild assumptions. We evaluate the performance of the proposed estimator through numerical simulations and real-data analysis. The simulations demonstrate that the method attains consistency without requiring the stringent high-dimensional conditions assumed by existing approaches, and the real-data analysis illustrates its applicability to high-dimensional regression problems, in which improved parameter estimation enhances prediction accuracy. In conclusion, the proposed estimator provides a theoretically sound tool for cross-covariance estimation in HDLSS contexts, with potential implications for a wide range of high-dimensional data analyses.
