Galib Muhammad Shahriar Himel, Md Masudul Islam, Kh Abdullah Al-Aff, Shams Ibne Karim, Md Kabir Uddin Sikder
{"title":"基于皮肤镜的无创数字系统中使用视觉变换器自动分析的皮肤癌分割和分类。","authors":"Galib Muhammad Shahriar Himel, Md Masudul Islam, Kh Abdullah Al-Aff, Shams Ibne Karim, Md Kabir Uddin Sikder","doi":"10.1155/2024/3022192","DOIUrl":null,"url":null,"abstract":"<p><p>Skin cancer is a significant health concern worldwide, and early and accurate diagnosis plays a crucial role in improving patient outcomes. In recent years, deep learning models have shown remarkable success in various computer vision tasks, including image classification. In this research study, we introduce an approach for skin cancer classification using vision transformer, a state-of-the-art deep learning architecture that has demonstrated exceptional performance in diverse image analysis tasks. The study utilizes the HAM10000 dataset; a publicly available dataset comprising 10,015 skin lesion images classified into two categories: benign (6705 images) and malignant (3310 images). This dataset consists of high-resolution images captured using dermatoscopes and carefully annotated by expert dermatologists. Preprocessing techniques, such as normalization and augmentation, are applied to enhance the robustness and generalization of the model. The vision transformer architecture is adapted to the skin cancer classification task. The model leverages the self-attention mechanism to capture intricate spatial dependencies and long-range dependencies within the images, enabling it to effectively learn relevant features for accurate classification. Segment Anything Model (SAM) is employed to segment the cancerous areas from the images; achieving an IOU of 96.01% and Dice coefficient of 98.14% and then various pretrained models are used for classification using vision transformer architecture. Extensive experiments and evaluations are conducted to assess the performance of our approach. The results demonstrate the superiority of the vision transformer model over traditional deep learning architectures in skin cancer classification in general with some exceptions. Upon experimenting on six different models, ViT-Google, ViT-MAE, ViT-ResNet50, ViT-VAN, ViT-BEiT, and ViT-DiT, we found out that the ML approach achieves 96.15% accuracy using Google's ViT patch-32 model with a low false negative ratio on the test dataset, showcasing its potential as an effective tool for aiding dermatologists in the diagnosis of skin cancer.</p>","PeriodicalId":47063,"journal":{"name":"International Journal of Biomedical Imaging","volume":"2024 ","pages":"3022192"},"PeriodicalIF":3.3000,"publicationDate":"2024-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10858797/pdf/","citationCount":"0","resultStr":"{\"title\":\"Skin Cancer Segmentation and Classification Using Vision Transformer for Automatic Analysis in Dermatoscopy-Based Noninvasive Digital System.\",\"authors\":\"Galib Muhammad Shahriar Himel, Md Masudul Islam, Kh Abdullah Al-Aff, Shams Ibne Karim, Md Kabir Uddin Sikder\",\"doi\":\"10.1155/2024/3022192\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Skin cancer is a significant health concern worldwide, and early and accurate diagnosis plays a crucial role in improving patient outcomes. In recent years, deep learning models have shown remarkable success in various computer vision tasks, including image classification. In this research study, we introduce an approach for skin cancer classification using vision transformer, a state-of-the-art deep learning architecture that has demonstrated exceptional performance in diverse image analysis tasks. The study utilizes the HAM10000 dataset; a publicly available dataset comprising 10,015 skin lesion images classified into two categories: benign (6705 images) and malignant (3310 images). This dataset consists of high-resolution images captured using dermatoscopes and carefully annotated by expert dermatologists. Preprocessing techniques, such as normalization and augmentation, are applied to enhance the robustness and generalization of the model. The vision transformer architecture is adapted to the skin cancer classification task. The model leverages the self-attention mechanism to capture intricate spatial dependencies and long-range dependencies within the images, enabling it to effectively learn relevant features for accurate classification. Segment Anything Model (SAM) is employed to segment the cancerous areas from the images; achieving an IOU of 96.01% and Dice coefficient of 98.14% and then various pretrained models are used for classification using vision transformer architecture. Extensive experiments and evaluations are conducted to assess the performance of our approach. The results demonstrate the superiority of the vision transformer model over traditional deep learning architectures in skin cancer classification in general with some exceptions. Upon experimenting on six different models, ViT-Google, ViT-MAE, ViT-ResNet50, ViT-VAN, ViT-BEiT, and ViT-DiT, we found out that the ML approach achieves 96.15% accuracy using Google's ViT patch-32 model with a low false negative ratio on the test dataset, showcasing its potential as an effective tool for aiding dermatologists in the diagnosis of skin cancer.</p>\",\"PeriodicalId\":47063,\"journal\":{\"name\":\"International Journal of Biomedical Imaging\",\"volume\":\"2024 \",\"pages\":\"3022192\"},\"PeriodicalIF\":3.3000,\"publicationDate\":\"2024-02-03\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10858797/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Journal of Biomedical Imaging\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1155/2024/3022192\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2024/1/1 0:00:00\",\"PubModel\":\"eCollection\",\"JCR\":\"Q2\",\"JCRName\":\"ENGINEERING, BIOMEDICAL\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Biomedical Imaging","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1155/2024/3022192","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/1/1 0:00:00","PubModel":"eCollection","JCR":"Q2","JCRName":"ENGINEERING, BIOMEDICAL","Score":null,"Total":0}
Skin Cancer Segmentation and Classification Using Vision Transformer for Automatic Analysis in Dermatoscopy-Based Noninvasive Digital System.
Skin cancer is a significant health concern worldwide, and early and accurate diagnosis plays a crucial role in improving patient outcomes. In recent years, deep learning models have shown remarkable success in various computer vision tasks, including image classification. In this research study, we introduce an approach for skin cancer classification using vision transformer, a state-of-the-art deep learning architecture that has demonstrated exceptional performance in diverse image analysis tasks. The study utilizes the HAM10000 dataset; a publicly available dataset comprising 10,015 skin lesion images classified into two categories: benign (6705 images) and malignant (3310 images). This dataset consists of high-resolution images captured using dermatoscopes and carefully annotated by expert dermatologists. Preprocessing techniques, such as normalization and augmentation, are applied to enhance the robustness and generalization of the model. The vision transformer architecture is adapted to the skin cancer classification task. The model leverages the self-attention mechanism to capture intricate spatial dependencies and long-range dependencies within the images, enabling it to effectively learn relevant features for accurate classification. Segment Anything Model (SAM) is employed to segment the cancerous areas from the images; achieving an IOU of 96.01% and Dice coefficient of 98.14% and then various pretrained models are used for classification using vision transformer architecture. Extensive experiments and evaluations are conducted to assess the performance of our approach. The results demonstrate the superiority of the vision transformer model over traditional deep learning architectures in skin cancer classification in general with some exceptions. Upon experimenting on six different models, ViT-Google, ViT-MAE, ViT-ResNet50, ViT-VAN, ViT-BEiT, and ViT-DiT, we found out that the ML approach achieves 96.15% accuracy using Google's ViT patch-32 model with a low false negative ratio on the test dataset, showcasing its potential as an effective tool for aiding dermatologists in the diagnosis of skin cancer.
期刊介绍:
The International Journal of Biomedical Imaging is managed by a board of editors comprising internationally renowned active researchers. The journal is freely accessible online and also offered for purchase in print format. It employs a web-based review system to ensure swift turnaround times while maintaining high standards. In addition to regular issues, special issues are organized by guest editors. The subject areas covered include (but are not limited to):
Digital radiography and tomosynthesis
X-ray computed tomography (CT)
Magnetic resonance imaging (MRI)
Single photon emission computed tomography (SPECT)
Positron emission tomography (PET)
Ultrasound imaging
Diffuse optical tomography, coherence, fluorescence, bioluminescence tomography, impedance tomography
Neutron imaging for biomedical applications
Magnetic and optical spectroscopy, and optical biopsy
Optical, electron, scanning tunneling/atomic force microscopy
Small animal imaging
Functional, cellular, and molecular imaging
Imaging assays for screening and molecular analysis
Microarray image analysis and bioinformatics
Emerging biomedical imaging techniques
Imaging modality fusion
Biomedical imaging instrumentation
Biomedical image processing, pattern recognition, and analysis
Biomedical image visualization, compression, transmission, and storage
Imaging and modeling related to systems biology and systems biomedicine
Applied mathematics, applied physics, and chemistry related to biomedical imaging
Grid-enabling technology for biomedical imaging and informatics