Machine Learning Based Breast Cancer Visualization and Classification
P. S. Shekar Varma, Sushil Kumar, K. Sri Vasuki Reddy
2021 International Conference on Innovative Trends in Information Technology (ICITIIT), published 2021-02-11
DOI: 10.1109/ICITIIT51526.2021.9399603 (https://doi.org/10.1109/ICITIIT51526.2021.9399603)
Citations: 2
Abstract
In recent years, breast cancer classification has become an active topic in healthcare informatics because of the large number of deaths the disease causes among women worldwide. Growing interest in image processing and machine learning (ML), and the variety of approaches they offer, has motivated efforts to build reliable pattern recognition models that raise the standard of diagnosis. Many studies have attempted to predict the likelihood of breast cancer using established data mining algorithms. This paper presents a model that uses the support vector machine (SVM) algorithm to classify histology images of breast cancer samples as benign or malignant in order to predict the diagnosis. The data comprise 30 features describing the cell nuclei visible in digitized images of a fine needle aspirate (FNA) of a breast mass. Ten measurements are taken for each cell nucleus; the mean, the standard deviation, and the worst (largest) value of each measurement are then computed, giving 30 features in total. These features are visualized and examined to gain insight for future diagnosis. Principal component analysis (PCA), a dimensionality reduction technique that solves an eigenvector problem, is applied to retain as much of the variance in the attributes as possible. The final results are presented using a confusion matrix and the receiver operating characteristic (ROC) curve. The SVM-based model achieves 97% accuracy on the dataset used.
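The feature construction and the confusion-matrix evaluation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The function names, the toy input, and the label strings are hypothetical. Treating "worst" as the single largest value and using the population standard deviation are assumptions, since the abstract does not give exact definitions. The PCA and SVM steps are left out to keep the sketch dependency-free.

```python
from statistics import mean, pstdev


def aggregate_features(nuclei):
    """Collapse per-nucleus measurements into one feature vector per image.

    nuclei: list of rows, one row per nucleus, each row holding the same
    number of measurements (ten in the abstract).
    Returns the means, then the standard deviations, then the worst values,
    so ten measurements yield 30 features.
    """
    columns = list(zip(*nuclei))           # one tuple per measurement
    means = [mean(c) for c in columns]
    stds = [pstdev(c) for c in columns]    # assumption: population std. dev.
    worst = [max(c) for c in columns]      # assumption: "worst" = largest value
    return means + stds + worst


def confusion_matrix(y_true, y_pred, positive="malignant"):
    """Count true/false positives and negatives for a two-class problem."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts


def accuracy(cm):
    """Fraction of samples that were classified correctly."""
    return (cm["tp"] + cm["tn"]) / sum(cm.values())


if __name__ == "__main__":
    # Toy image with three nuclei and ten made-up measurements each.
    toy_nuclei = [[float(i + j) for j in range(10)] for i in range(3)]
    features = aggregate_features(toy_nuclei)
    print(len(features), "features per image")

    # Made-up labels, only to exercise the evaluation helpers.
    truth = ["malignant", "benign", "benign", "malignant"]
    preds = ["malignant", "benign", "malignant", "benign"]
    cm = confusion_matrix(truth, preds)
    print(cm, "accuracy:", accuracy(cm))
```

In a full pipeline, the 30-element vectors would typically be standardized, projected with PCA, and passed to an SVM classifier. The confusion-matrix counts computed here also give the true-positive and false-positive rates, which form one point on an ROC curve.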