Authors: C.N. Okoli, C.T. Eze-Golden
Journal: International Journal of Mathematics and Statistics Studies
Published: 2023-02-15
DOI: 10.37745/ijmss.13/vol11n22234
URL: https://doi.org/10.37745/ijmss.13/vol11n22234
Citations: 0
Performance Evaluation of Canonical Correlation Analysis and Generalized Canonical Correlation Analysis with Some Continuous Distributed Data
This study evaluated the performance of canonical correlation analysis (CCA) and generalized canonical correlation analysis (GCCA) on data from four continuous distributions (Gamma, Gaussian, Exponential and Beta). The objectives were to: ascertain whether the anthropometric indicators of patients were correlated; ascertain whether there is any relationship between the vital signs and anthropometric dimensions of patients; obtain the relative efficiency of the CCA and GCCA techniques on simulated data from the four continuous distributions; and determine the model performance adequacy of the CCA and GCCA techniques. A real-life medical data set was used, consisting of three response variables (respiration rate, heart rate, temperature), termed the vital signs, and three predictor variables (hip circumference, weight, height), termed the anthropometric dimensions. The real-life data set was used to simulate data of sample sizes 15, 30, 45, 60, 100, 120, 140, 160, 400, 600, 800 and 1000 for each of the four distributions. The computations were carried out with code written in R using RStudio. The results showed that the anthropometric dimensions, as the independent variables, were not correlated: the eigenvalue/condition index technique showed no symptom of multicollinearity. In addition, there was a significant relationship between the vital signs and anthropometric dimensions of patients according to the Wilks' Lambda, Hotelling-Lawley Trace, Pillai-Bartlett Trace and Roy's Largest Root multivariate statistics. The adequacy of CCA and GCCA was evaluated using the Wilcoxon rank sum test; the results showed that GCCA was more efficient than CCA for the Gamma and Beta distributed data, while for the Gaussian and Exponential distributed data the relative efficiency of CCA and GCCA was the same.
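The four multivariate statistics named in the abstract can all be written as functions of the squared canonical correlations. The following is a minimal sketch in Python, not the authors' R code, and it assumes the canonical correlations have already been estimated. The function name `multivariate_statistics` and the input values are hypothetical illustrations, not results from the study. Roy's Largest Root is computed here as the largest eigenvalue r²/(1 − r²); some software reports the largest squared canonical correlation instead.

```python
import math


def multivariate_statistics(canonical_correlations):
    """Compute four multivariate test statistics from canonical correlations.

    Each canonical correlation r must satisfy 0 <= r < 1, so that the
    ratio r^2 / (1 - r^2) is defined.
    """
    if not canonical_correlations:
        raise ValueError("at least one canonical correlation is required")
    if any(r < 0 or r >= 1 for r in canonical_correlations):
        raise ValueError("canonical correlations must lie in [0, 1)")

    squared = [r * r for r in canonical_correlations]
    largest = max(squared)
    return {
        # Wilks' Lambda: product of (1 - r_i^2)
        "wilks_lambda": math.prod(1 - s for s in squared),
        # Pillai-Bartlett Trace: sum of r_i^2
        "pillai_bartlett_trace": sum(squared),
        # Hotelling-Lawley Trace: sum of r_i^2 / (1 - r_i^2)
        "hotelling_lawley_trace": sum(s / (1 - s) for s in squared),
        # Roy's Largest Root, eigenvalue convention (an assumption here)
        "roy_largest_root": largest / (1 - largest),
    }


if __name__ == "__main__":
    # Hypothetical canonical correlations for three pairs of variates,
    # matching the three vital signs and three anthropometric dimensions.
    example = [0.6, 0.3, 0.1]
    for name, value in multivariate_statistics(example).items():
        print(f"{name}: {value:.4f}")
```

Turning these statistics into p-values requires an F approximation that depends on the sample size and the number of variables in each set, which this sketch deliberately leaves out.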