B. Senthil Kumar, Harvey Vanlalpeka, J. Zohmingthanga, N. S. Kumar, L. Hmingliana, Lalrempuia Sailo
Silpakorn University Science and Technology Journal. Published 2021-07-01. DOI: 10.22232/stj.2021.09.02.19 (https://doi.org/10.22232/stj.2021.09.02.19)
Logistic Regression for Gastric Cancer Classification using epidemiological risk factors in Cases and Controls
The main purpose of this study is to design a machine learning classifier that accurately distinguishes gastric cancer patients (cases) from healthy individuals (controls) using epidemiological and environmental factors. Missing values in the dataset are replaced by median imputation. The model is trained by applying gradient descent to minimize the logistic regression cost function and so locate its global minimum. The proposed logistic regression model takes 29 features as input and achieves an accuracy of 98.51%. This accuracy is obtained with a learning rate of 0.000915 and 150,000 iterations, the values chosen for training the model.
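The pipeline described in the abstract (median imputation followed by logistic regression trained with gradient descent) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the use of batch gradient descent with a mean cross-entropy cost, the zero initialization, and the synthetic data are all hypothetical. Only the 29-feature input size and the default learning rate and iteration count are taken from the abstract, which does not say whether the features were scaled before training.

```python
import numpy as np


def impute_median(X):
    """Replace NaN entries with the median of their column (NaNs ignored)."""
    X = np.asarray(X, dtype=float).copy()
    medians = np.nanmedian(X, axis=0)
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = medians[cols]
    return X


def sigmoid(z):
    # Clip the argument so exp() cannot overflow for large |z|.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -500, 500)))


def cost(X, y, w, b, eps=1e-12):
    """Mean binary cross-entropy, the usual logistic regression cost."""
    p = sigmoid(X @ w + b)
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))


def fit_logistic_gd(X, y, learning_rate=0.000915, n_iterations=150_000):
    """Batch gradient descent on the cross-entropy cost; returns (w, b).

    The defaults mirror the values quoted in the abstract; everything
    else about this training loop is an assumption.
    """
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(n_iterations):
        error = sigmoid(X @ w + b) - y  # predicted probability minus label
        w -= learning_rate * (X.T @ error) / n_samples
        b -= learning_rate * error.mean()
    return w, b


def predict(X, w, b, threshold=0.5):
    """Label a row 1 (case) when its predicted probability reaches threshold."""
    return (sigmoid(X @ w + b) >= threshold).astype(int)


if __name__ == "__main__":
    # Synthetic stand-in data: 200 hypothetical subjects, 29 features,
    # with roughly 5% of the entries knocked out to simulate missing values.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 29))
    y = (X @ rng.normal(size=29) > 0).astype(float)  # 1 = case, 0 = control
    X[rng.random(X.shape) < 0.05] = np.nan
    X = impute_median(X)
    # Larger step and fewer iterations than the abstract, only to keep
    # the demonstration fast.
    w, b = fit_logistic_gd(X, y, learning_rate=0.1, n_iterations=2000)
    print("training cost:", cost(X, y, w, b))
    print("training accuracy:", (predict(X, w, b) == y).mean())
```

Because the cross-entropy cost of logistic regression is convex, gradient descent with a sufficiently small step size moves toward the global minimum rather than a local one, which is the property the abstract relies on. The accuracy printed here is measured on synthetic training data and says nothing about the 98.51% figure reported in the study.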