分割方法集成的实证研究

Signals Pub Date : 2022-06-01 DOI:10.3390/signals3020022

L. Nanni, A. Lumini, Andrea Loreggia, A. Formaggio, Daniela Cuza

{"title":"分割方法集成的实证研究","authors":"L. Nanni, A. Lumini, Andrea Loreggia, A. Formaggio, Daniela Cuza","doi":"10.3390/signals3020022","DOIUrl":null,"url":null,"abstract":"Recognizing objects in images requires complex skills that involve knowledge about the context and the ability to identify the borders of the objects. In computer vision, this task is called semantic segmentation and it pertains to the classification of each pixel in an image. The task is of main importance in many real-life scenarios: in autonomous vehicles, it allows the identification of objects surrounding the vehicle; in medical diagnosis, it improves the ability of early detecting of dangerous pathologies and thus mitigates the risk of serious consequences. In this work, we propose a new ensemble method able to solve the semantic segmentation task. The model is based on convolutional neural networks (CNNs) and transformers. An ensemble uses many different models whose predictions are aggregated to form the output of the ensemble system. The performance and quality of the ensemble prediction are strongly connected with some factors; one of the most important is the diversity among individual models. In our approach, this is enforced by adopting different loss functions and testing different data augmentations. We developed the proposed method by combining DeepLabV3+, HarDNet-MSEG, and Pyramid Vision Transformers. The developed solution was then assessed through an extensive empirical evaluation in five different scenarios: polyp detection, skin detection, leukocytes recognition, environmental microorganism detection, and butterfly recognition. The model provides state-of-the-art results.","PeriodicalId":93815,"journal":{"name":"Signals","volume":" ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2022-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":"{\"title\":\"An Empirical Study on Ensemble of Segmentation Approaches\",\"authors\":\"L. Nanni, A. Lumini, Andrea Loreggia, A. Formaggio, Daniela Cuza\",\"doi\":\"10.3390/signals3020022\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Recognizing objects in images requires complex skills that involve knowledge about the context and the ability to identify the borders of the objects. In computer vision, this task is called semantic segmentation and it pertains to the classification of each pixel in an image. The task is of main importance in many real-life scenarios: in autonomous vehicles, it allows the identification of objects surrounding the vehicle; in medical diagnosis, it improves the ability of early detecting of dangerous pathologies and thus mitigates the risk of serious consequences. In this work, we propose a new ensemble method able to solve the semantic segmentation task. The model is based on convolutional neural networks (CNNs) and transformers. An ensemble uses many different models whose predictions are aggregated to form the output of the ensemble system. The performance and quality of the ensemble prediction are strongly connected with some factors; one of the most important is the diversity among individual models. In our approach, this is enforced by adopting different loss functions and testing different data augmentations. We developed the proposed method by combining DeepLabV3+, HarDNet-MSEG, and Pyramid Vision Transformers. The developed solution was then assessed through an extensive empirical evaluation in five different scenarios: polyp detection, skin detection, leukocytes recognition, environmental microorganism detection, and butterfly recognition. The model provides state-of-the-art results.\",\"PeriodicalId\":93815,\"journal\":{\"name\":\"Signals\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-06-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"8\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Signals\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.3390/signals3020022\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Signals","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3390/signals3020022","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 8

摘要

识别图像中的物体需要复杂的技能，包括上下文知识和识别物体边界的能力。在计算机视觉中，这项任务被称为语义分割，它涉及图像中每个像素的分类。这项任务在许多现实场景中都非常重要：在自动驾驶汽车中，它可以识别车辆周围的物体；在医学诊断中，它提高了早期发现危险病理的能力，从而降低了严重后果的风险。在这项工作中，我们提出了一种新的集成方法，能够解决语义分割任务。该模型基于卷积神经网络和变换器。系综使用许多不同的模型，这些模型的预测被聚合以形成系综系统的输出。集成预测的性能和质量与一些因素密切相关；其中最重要的是各个模型之间的多样性。在我们的方法中，这是通过采用不同的损失函数和测试不同的数据增强来实现的。我们结合DeepLabV3+、HarDNet MSEG和Pyramid Vision Transformers开发了所提出的方法。然后，通过在五种不同场景下进行广泛的经验评估来评估开发的解决方案：息肉检测、皮肤检测、白细胞识别、环境微生物检测和蝴蝶识别。该模型提供了最先进的结果。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

An Empirical Study on Ensemble of Segmentation Approaches

Recognizing objects in images requires complex skills that involve knowledge about the context and the ability to identify the borders of the objects. In computer vision, this task is called semantic segmentation and it pertains to the classification of each pixel in an image. The task is of main importance in many real-life scenarios: in autonomous vehicles, it allows the identification of objects surrounding the vehicle; in medical diagnosis, it improves the ability of early detecting of dangerous pathologies and thus mitigates the risk of serious consequences. In this work, we propose a new ensemble method able to solve the semantic segmentation task. The model is based on convolutional neural networks (CNNs) and transformers. An ensemble uses many different models whose predictions are aggregated to form the output of the ensemble system. The performance and quality of the ensemble prediction are strongly connected with some factors; one of the most important is the diversity among individual models. In our approach, this is enforced by adopting different loss functions and testing different data augmentations. We developed the proposed method by combining DeepLabV3+, HarDNet-MSEG, and Pyramid Vision Transformers. The developed solution was then assessed through an extensive empirical evaluation in five different scenarios: polyp detection, skin detection, leukocytes recognition, environmental microorganism detection, and butterfly recognition. The model provides state-of-the-art results.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Signals

CiteScore

3.20

自引率

0.00%

发文量

审稿时长

11 weeks