Title: Trichotomization with two cutoff values using Kruskal-Wallis test by minimum P-value approach
Authors: T. Ogura, C. Shiraishi
Journal: Journal of Applied Mathematics Statistics and Informatics, vol. 18, no. 1, pp. 19-32
Publication date: 2022-12-01
Publication type: Journal Article
DOI: 10.2478/jamsi-2022-0010 (https://doi.org/10.2478/jamsi-2022-0010)
Journal metrics: impact factor 0.3; JCR Q4 (Mathematics, Applied)
Citations: 0
Abstract
In clinical trials, age is often converted to binary data using a cutoff value. However, in a scatter plot restricted to patients whose age is greater than or equal to the cutoff value, age and outcome may appear unrelated. If that group is further divided into two, the older of the two subgroups may even appear to be at lower risk. In such cases, it may be necessary to split the patients at or above the cutoff value into two groups, giving three groups overall. This study provides a method for determining whether a split into two or three groups is better. Two methods are used to divide the data. The existing method, the Wilcoxon-Mann-Whitney test by minimum P-value approach, divides the data into two groups with one cutoff value. A new method, the Kruskal-Wallis test by minimum P-value approach, divides the data into three groups with two cutoff values. Of the two tests, the one with the smaller P-value is used. Because this is a new decision procedure, it was tested using Monte Carlo simulations (MCSs) before being applied to the available COVID-19 data. The MCS results showed that the method performs well. For the COVID-19 data, the optimal split was into three groups with cutoff values of 60 and 70 years. Examining the COVID-19 data separated into these three groups confirmed that each group had different features. R code that can be used to replicate the results of this manuscript is provided. Other practical examples can be run by replacing x and y with appropriate data.
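
The decision rule described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' R code, and several choices here are assumptions made for illustration: the function names (`average_ranks`, `kruskal_wallis_p`, `min_p_split`, `choose_split`), the minimum group size `min_size`, the use of every distinct value of x as a candidate cutoff, and the use of asymptotic chi-square P-values. The one-cutoff Wilcoxon-Mann-Whitney P-value is obtained through the test's equivalence to a two-group Kruskal-Wallis test (normal approximation, no continuity correction), so exact P-values from a statistics package may differ slightly. The toy data at the end are synthetic and are not the COVID-19 data.

```python
import math
from itertools import combinations


def average_ranks(values):
    """1-based ranks of `values`; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # mean of positions i+1..j+1
        i = j + 1
    return ranks


def kruskal_wallis_p(groups):
    """Asymptotic (chi-square) Kruskal-Wallis P-value for 2 or 3 groups."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    h, start = 0.0, 0
    for g in groups:
        h += sum(ranks[start:start + len(g)]) ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * h - 3 * (n + 1)
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie = 1 - sum(c ** 3 - c for c in counts.values()) / (n ** 3 - n)
    if tie == 0:
        return 1.0  # all outcomes identical: no evidence of a difference
    h = max(h / tie, 0.0)
    if len(groups) == 2:
        # Chi-square with 1 df; equivalent to the two-sided
        # Wilcoxon-Mann-Whitney normal approximation (assumption: no
        # continuity correction).
        return math.erfc(math.sqrt(h / 2))
    if len(groups) == 3:
        return math.exp(-h / 2)  # chi-square survival function with 2 df
    raise ValueError("only 2 or 3 groups are supported in this sketch")


def min_p_split(x, y, n_cuts, min_size=5):
    """Search all cutoff combinations; return (smallest P-value, cutoffs)."""
    candidates = sorted(set(x))[1:]  # a cutoff c separates x < c from x >= c
    best_p, best_cuts = None, None
    for cuts in combinations(candidates, n_cuts):
        groups = [[] for _ in range(n_cuts + 1)]
        for xi, yi in zip(x, y):
            groups[sum(xi >= c for c in cuts)].append(yi)
        if min(len(g) for g in groups) < min_size:
            continue  # hypothetical guard against very small groups
        p = kruskal_wallis_p(groups)
        if best_p is None or p < best_p:
            best_p, best_cuts = p, cuts
    return best_p, best_cuts


def choose_split(x, y, min_size=5):
    """Return (number of groups, cutoffs, P-value) for the smaller P-value."""
    p1, cuts1 = min_p_split(x, y, 1, min_size)
    p2, cuts2 = min_p_split(x, y, 2, min_size)
    if p2 is not None and (p1 is None or p2 < p1):
        return 3, cuts2, p2
    return 2, cuts1, p1


# Synthetic toy data (not from the paper): the outcome level steps up at
# age 60 and drops part of the way back at age 70.
ages = list(range(40, 90))
levels = [0.0 if a < 60 else 2.0 if a < 70 else 1.0 for a in ages]
outcome = [lvl + 0.1 * (i % 3) for i, lvl in enumerate(levels)]
n_groups, cutoffs, p_value = choose_split(ages, outcome)
print(n_groups, cutoffs, p_value)
```

To try other data, replace `ages` and `outcome` with the covariate and outcome of interest, in the same spirit as replacing x and y in the R code mentioned above. In this sketch the minimized P-values serve only as a selection criterion between the two-group and three-group splits.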