A weakly supervised end-to-end framework for semantic segmentation of cancerous area in whole slide image
Authors: Yanbo Feng, Adel Hafiane, Hélène Laurent
Journal: Pattern Analysis and Applications (impact factor 3.7; JCR Q2, Computer Science, Artificial Intelligence)
Published: 2024-04-02
DOI: 10.1007/s10044-024-01251-6 (https://doi.org/10.1007/s10044-024-01251-6)
Citations: 0
Abstract
Segmentation of pathological images is indispensable to cancer diagnosis and grading: it gives doctors the location of pathologically altered tissue and supports its quantitative analysis. However, a pathological whole slide image (WSI) is generally gigapixel-sized, and the targets to be segmented are large, region-level objects. Extracting patches from the WSI works around computer memory limits, but it breaks up the integrity of the target. Moreover, supervised learning methods require manually annotated labels for training, and producing them is laborious and time-consuming. We therefore study a novel end-to-end framework, based on weakly supervised learning (WSL), for semantic segmentation of cancerous areas in WSIs. The framework builds on block-level segmentation by a convolutional neural network (CNN); the CNN must incorporate a global average pooling layer and a single fully connected layer, and the resulting network is referred to as WSL-CNN. Class activation maps and a dense conditional random field (DenseCRF) are adapted to produce pixel-level segmentation of the cancerous area within each patch, and this step is incorporated into the classification process of the WSL-CNN. Applying DenseCRF twice, at two hierarchical levels, effectively improves the precision of the semantic segmentation. A region-based annotation method and a flexible method of constructing the training dataset are proposed to reduce the annotation workload. Experiments show that block-level segmentation with CNNs outperforms pixel-level segmentation with fully convolutional networks; ResNet50 performs best, achieving an F1 score of 0.87426, a Jaccard score of 0.78079, a Recall of 0.94251 and a Precision of 0.82182. The proposed framework effectively refines block-level predictions into semantic segmentation without pixel-level labels. The precision of every tested CNN improves in the experiments, with WSL-ResNet50 achieving an F1 score of 0.90630, a Jaccard score of 0.83230, a Recall of 0.92051 and a Precision of 0.89789.
We propose a complete end-to-end framework covering the specific structure of the neural network, the construction of the training dataset, the network-based prediction method, and the post-processing. CNN-like architectures can readily be transplanted into this framework to perform semantic segmentation, which alleviates, to a certain extent, the shortage of labels for large-scale medical images.
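
The abstract states that the CNN must end in a global average pooling layer followed by a single fully connected layer, and that class activation maps are derived from it. The following is a minimal sketch of the standard class activation map computation that such a head allows; it is an assumption that the paper's adaptation starts from this formulation, and the DenseCRF refinement is not shown. All function names, the toy feature maps, and the weights are hypothetical and chosen only for illustration.

```python
from typing import List

FeatureMaps = List[List[List[float]]]  # indexed as [channel][row][col]


def global_average_pool(features: FeatureMaps) -> List[float]:
    """Reduce each channel's feature map to its spatial mean."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in features
    ]


def class_score(features: FeatureMaps, fc_weights: List[List[float]],
                fc_bias: List[float], cls: int) -> float:
    """Logit of class `cls`: pooled features passed through one FC layer."""
    pooled = global_average_pool(features)
    return sum(w * p for w, p in zip(fc_weights[cls], pooled)) + fc_bias[cls]


def class_activation_map(features: FeatureMaps, fc_weights: List[List[float]],
                         cls: int) -> List[List[float]]:
    """CAM_c(i, j) = sum over channels k of w[c][k] * F[k][i][j]."""
    height, width = len(features[0]), len(features[0][0])
    return [
        [
            sum(fc_weights[cls][k] * features[k][i][j]
                for k in range(len(features)))
            for j in range(width)
        ]
        for i in range(height)
    ]


def normalize(cam: List[List[float]]) -> List[List[float]]:
    """Rescale a map to [0, 1]; a constant map becomes all zeros."""
    lo = min(min(row) for row in cam)
    hi = max(max(row) for row in cam)
    span = hi - lo
    if span == 0:
        return [[0.0 for _ in row] for row in cam]
    return [[(v - lo) / span for v in row] for row in cam]


# Toy example: 2 channels of 2x2 features, 2 classes (hypothetical values).
features = [
    [[1.0, 0.0], [0.0, 0.0]],  # channel 0 fires in the top-left corner
    [[0.0, 0.0], [0.0, 2.0]],  # channel 1 fires in the bottom-right corner
]
fc_weights = [[1.0, -1.0], [-1.0, 1.0]]  # row c holds the weights of class c
fc_bias = [0.0, 0.0]

cam = class_activation_map(features, fc_weights, cls=1)
heatmap = normalize(cam)
```

Because pooling and the fully connected layer are both linear, the spatial mean of the map plus the bias equals the class logit, which is why the map can be read as a per-location contribution to the patch-level decision. In practice the low-resolution map would be upsampled to the patch size before any thresholding or CRF refinement.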
About the journal:
The journal publishes high-quality articles in areas of fundamental research in intelligent pattern analysis and applications in computer science and engineering. It aims to provide a forum for original research that describes novel pattern analysis techniques and industrial applications of the current technology. The journal also publishes articles on pattern analysis applications in medical imaging. It solicits articles that detail new technology and methods for pattern recognition and analysis in applied domains including, but not limited to, computer vision and image processing, speech analysis, robotics, multimedia, document analysis, character recognition, knowledge engineering for pattern recognition, fractal analysis, and intelligent control. The journal publishes articles on the use of advanced pattern recognition and analysis methods, including statistical techniques, neural networks, genetic algorithms, fuzzy pattern recognition, machine learning, and hardware implementations, which are either relevant to the development of pattern analysis as a research area or detail novel pattern analysis applications. Papers proposing new classifier systems or their development, pattern analysis systems for real-time applications, fuzzy and temporal pattern recognition, and uncertainty management in applied pattern recognition are particularly solicited.