Performance Evaluation of Canonical Correlation Analysis and Generalized Canonical Correlation Analysis with Some Continuous Distributed Data
Authors: C.N. Okoli, C.T Eze-Golden
Journal: International journal of mathematics and statistics studies
Published: 2023-02-15
DOI: 10.37745/ijmss.13/vol11n22234 (https://doi.org/10.37745/ijmss.13/vol11n22234)
Citations: 0
Abstract
This study evaluated the performance of canonical correlation analysis (CCA) and generalized canonical correlation analysis (GCCA) on data drawn from four continuous distributions (Gamma, Gaussian, Exponential and Beta). The objectives were to: determine whether the anthropometric indicators of patients were correlated; determine whether there is a relationship between the vital signs and the anthropometric dimensions of patients; obtain the relative efficiency of the CCA and GCCA techniques on simulated data from the four continuous distributions; and assess the model performance adequacy of the CCA and GCCA techniques. A real-life medical data set was used, consisting of three response variables (respiration rate, heart rate, temperature), termed the vital signs, and three predictor variables (hip circumference, weight, height), termed the anthropometric dimensions. The real-life data set was used to simulate data of sample sizes 15, 30, 45, 60, 100, 120, 140, 160, 400, 600, 800 and 1000 for each of the four continuous distributions. R code was written in RStudio to carry out the computations. The eigenvalue/condition index technique showed that the anthropometric dimensions, the independent variables, were not correlated, indicating no multicollinearity. In addition, there was a significant relationship between the vital signs and the anthropometric dimensions of patients according to the Wilks’ Lambda, Hotelling-Lawley Trace, Pillai-Bartlett Trace and Roy’s Largest Root multivariate statistics. The adequacy of CCA and GCCA was evaluated using the Wilcoxon rank sum test. The results showed that GCCA was more efficient than CCA for the Gamma and Beta distributed data, while for the Gaussian and Exponential distributed data the relative efficiency of CCA and GCCA was the same.
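As a rough illustration of two steps named in the abstract, the sketch below draws samples of the listed sizes from the four distributions and applies a Wilcoxon rank sum test to two samples. It is written in Python with the standard library only, not the authors' R code. The distribution parameters, the helper names (`simulate`, `midranks`, `rank_sum_test`) and the use of a normal approximation without tie or continuity correction are all assumptions made for illustration; the abstract does not state the fitted parameters or which quantities were passed to the test.

```python
import math
import random

# Hypothetical parameters: the abstract does not report the values used in the study.
SAMPLERS = {
    "gamma": lambda rng: rng.gammavariate(2.0, 1.0),
    "gaussian": lambda rng: rng.gauss(0.0, 1.0),
    "exponential": lambda rng: rng.expovariate(1.0),
    "beta": lambda rng: rng.betavariate(2.0, 5.0),
}

# Sample sizes as listed in the abstract.
SAMPLE_SIZES = [15, 30, 45, 60, 100, 120, 140, 160, 400, 600, 800, 1000]


def simulate(dist, n, rng):
    """Draw n independent values from the named distribution."""
    sampler = SAMPLERS[dist]
    return [sampler(rng) for _ in range(n)]


def midranks(values):
    """Rank values from 1 to N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum test via the normal approximation.

    Returns (W, z, p), where W is the sum of the ranks of x in the pooled
    sample. No tie or continuity correction is applied (a simplification).
    """
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    w = sum(ranks[:n1])
    mean_w = n1 * (n1 + n2 + 1) / 2.0
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return w, z, p


if __name__ == "__main__":
    rng = random.Random(2023)  # fixed seed so the run is repeatable
    for n in SAMPLE_SIZES:
        a = simulate("gamma", n, rng)
        b = simulate("beta", n, rng)
        w, z, p = rank_sum_test(a, b)
        print(f"n={n:4d}  W={w:10.1f}  z={z:7.3f}  p={p:.4f}")
```

In an analysis like the one described, the two inputs to `rank_sum_test` would be whatever performance measures were collected for CCA and for GCCA; here they are raw simulated draws purely to show the mechanics.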