Construction of an automated screening system to predict breast cancer diagnosis and prognosis
Sou-Young Jin, Jae-Kyung Won, Hojin Lee, Ho-Jin Choi
Basic and applied pathology, Volume 5, Issue 1, pp. 15-18. Published 2012-03-15. Journal Article.
DOI: 10.1111/j.1755-9294.2012.01124.x
https://onlinelibrary.wiley.com/doi/10.1111/j.1755-9294.2012.01124.x
Citations: 3
Abstract
Background and aim: Machine learning methods can support clinical decision processes such as pathological diagnosis when datasets of microscopic features are available. Using the Breast Cancer Wisconsin dataset, the present study compared several classification algorithms to devise a classifier that predicts both diagnosis (benign vs malignant) and prognosis (recurrence vs non-recurrence).

Methods: The performance of a two-step algorithm, which decides diagnosis first and then prognosis, was compared with that of a one-step multi-class classifier, which assigns all classes simultaneously.

Results: In the two-step classifier, the functional trees (FT) algorithm performed best for the first step of classification, and Naïve Bayes performed best for the second step. The one-step classifier, by contrast, showed better overall accuracy and better prediction of benign and non-recurring cases than the two-step classifier, but it was less accurate at predicting recurring cases, which lowered its sensitivity.

Conclusions: We conclude that the two-step classifier with FT and Naïve Bayes is better than the one-step classifier. This work should help in setting up an automated screening system in real clinics, and it highlights clues for improving accuracy by refining the data and the algorithm selection in data mining or machine learning processes.
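The two designs compared in the abstract can be sketched in a few lines of Python. This is only an illustration of the control flow and of the two metrics mentioned (accuracy and sensitivity on recurring cases), not the authors' implementation. The label strings, the two-feature cases, and the threshold rules standing in for the trained models are all hypothetical. The toy cases are made up to mirror the reported trade-off and are not results from the paper.

```python
from typing import Callable, List, Sequence, Tuple

Features = Tuple[float, float]  # hypothetical two-feature case, for illustration only

# Hypothetical label encoding; the abstract names the classes, not their encoding.
BENIGN, NON_RECUR, RECUR = "benign", "non-recur", "recur"


def two_step_predict(
    x: Features,
    is_malignant: Callable[[Features], bool],
    will_recur: Callable[[Features], bool],
) -> str:
    """Step 1 decides the diagnosis; step 2 (prognosis) runs only on malignant cases."""
    if not is_malignant(x):
        return BENIGN
    return RECUR if will_recur(x) else NON_RECUR


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of cases whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true: Sequence[str], y_pred: Sequence[str], label: str) -> float:
    """Sensitivity for one class: correct `label` predictions / all true `label` cases."""
    predicted_for_label = [p for t, p in zip(y_true, y_pred) if t == label]
    if not predicted_for_label:
        raise ValueError(f"no true cases with label {label!r}")
    return sum(p == label for p in predicted_for_label) / len(predicted_for_label)


# Toy threshold rules standing in for trained models (NOT functional trees or Naive Bayes).
def toy_is_malignant(x: Features) -> bool:
    return x[0] > 0.5


def toy_will_recur(x: Features) -> bool:
    return x[1] > 0.5


def toy_one_step(x: Features) -> str:
    """Single three-class rule that only rarely predicts the 'recur' class."""
    if x[0] <= 0.5:
        return BENIGN
    return RECUR if x[1] > 0.9 else NON_RECUR


# Made-up cases (features, true label); not taken from the Breast Cancer Wisconsin dataset.
cases: List[Tuple[Features, str]] = [
    ((0.1, 0.2), BENIGN),
    ((0.3, 0.7), BENIGN),
    ((0.8, 0.2), NON_RECUR),
    ((0.9, 0.6), NON_RECUR),
    ((0.6, 0.7), NON_RECUR),
    ((0.7, 0.8), RECUR),
    ((0.9, 0.95), RECUR),
]

y_true = [label for _, label in cases]
two_step_preds = [two_step_predict(x, toy_is_malignant, toy_will_recur) for x, _ in cases]
one_step_preds = [toy_one_step(x) for x, _ in cases]

for name, preds in (("two-step", two_step_preds), ("one-step", one_step_preds)):
    print(name, "accuracy:", round(accuracy(y_true, preds), 3),
          "recall on recurring cases:", recall(y_true, preds, RECUR))
```

On these made-up cases the one-step rule scores higher on accuracy while the two-step cascade finds more of the recurring cases. That is the kind of trade-off the abstract describes when it prefers the two-step design despite its lower overall accuracy.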