基于遗传规划的语义分割网络叠加

IF 1.7 3区计算机科学 Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Genetic Programming and Evolvable Machines Pub Date : 2023-10-26 DOI:10.1007/s10710-023-09464-0

Illya Bakurov, Marco Buzzelli, Raimondo Schettini, Mauro Castelli, Leonardo Vanneschi

{"title":"基于遗传规划的语义分割网络叠加","authors":"Illya Bakurov, Marco Buzzelli, Raimondo Schettini, Mauro Castelli, Leonardo Vanneschi","doi":"10.1007/s10710-023-09464-0","DOIUrl":null,"url":null,"abstract":"Abstract Semantic segmentation consists of classifying each pixel of an image and constitutes an essential step towards scene recognition and understanding. Deep convolutional encoder–decoder neural networks now constitute state-of-the-art methods in the field of semantic segmentation. The problem of street scenes’ segmentation for automotive applications constitutes an important application field of such networks and introduces a set of imperative exigencies. Since the models need to be executed on self-driving vehicles to make fast decisions in response to a constantly changing environment, they are not only expected to operate reliably but also to process the input images rapidly. In this paper, we explore genetic programming (GP) as a meta-model that combines four different efficiency-oriented networks for the analysis of urban scenes. Notably, we present and examine two approaches. In the first approach, we represent solutions as GP trees that combine networks’ outputs such that each output class’s prediction is obtained through the same meta-model. In the second approach, we propose representing solutions as lists of GP trees, each designed to provide a unique meta-model for a given target class. The main objective is to develop efficient and accurate combination models that could be easily interpreted, therefore allowing gathering some hints on how to improve the existing networks. The experiments performed on the Cityscapes dataset of urban scene images with semantic pixel-wise annotations confirm the effectiveness of the proposed approach. Specifically, our best-performing models improve systems’ generalization ability by approximately 5% compared to traditional ensembles, 30% for the less performing state-of-the-art CNN and show competitive results with respect to state-of-the-art ensembles. Additionally, they are small in size, allow interpretability, and use fewer features due to GP’s automatic feature selection.","PeriodicalId":50424,"journal":{"name":"Genetic Programming and Evolvable Machines","volume":"32 9","pages":"0"},"PeriodicalIF":1.7000,"publicationDate":"2023-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Semantic segmentation network stacking with genetic programming\",\"authors\":\"Illya Bakurov, Marco Buzzelli, Raimondo Schettini, Mauro Castelli, Leonardo Vanneschi\",\"doi\":\"10.1007/s10710-023-09464-0\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Abstract Semantic segmentation consists of classifying each pixel of an image and constitutes an essential step towards scene recognition and understanding. Deep convolutional encoder–decoder neural networks now constitute state-of-the-art methods in the field of semantic segmentation. The problem of street scenes’ segmentation for automotive applications constitutes an important application field of such networks and introduces a set of imperative exigencies. Since the models need to be executed on self-driving vehicles to make fast decisions in response to a constantly changing environment, they are not only expected to operate reliably but also to process the input images rapidly. In this paper, we explore genetic programming (GP) as a meta-model that combines four different efficiency-oriented networks for the analysis of urban scenes. Notably, we present and examine two approaches. In the first approach, we represent solutions as GP trees that combine networks’ outputs such that each output class’s prediction is obtained through the same meta-model. In the second approach, we propose representing solutions as lists of GP trees, each designed to provide a unique meta-model for a given target class. The main objective is to develop efficient and accurate combination models that could be easily interpreted, therefore allowing gathering some hints on how to improve the existing networks. The experiments performed on the Cityscapes dataset of urban scene images with semantic pixel-wise annotations confirm the effectiveness of the proposed approach. Specifically, our best-performing models improve systems’ generalization ability by approximately 5% compared to traditional ensembles, 30% for the less performing state-of-the-art CNN and show competitive results with respect to state-of-the-art ensembles. Additionally, they are small in size, allow interpretability, and use fewer features due to GP’s automatic feature selection.\",\"PeriodicalId\":50424,\"journal\":{\"name\":\"Genetic Programming and Evolvable Machines\",\"volume\":\"32 9\",\"pages\":\"0\"},\"PeriodicalIF\":1.7000,\"publicationDate\":\"2023-10-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Genetic Programming and Evolvable Machines\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1007/s10710-023-09464-0\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Genetic Programming and Evolvable Machines","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1007/s10710-023-09464-0","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

摘要

语义分割包括对图像的每个像素进行分类，是实现场景识别和理解的重要步骤。深度卷积编码器-解码器神经网络是目前语义分割领域最先进的方法。面向汽车应用的街景分割问题是街景网络的一个重要应用领域，同时也带来了一系列迫切需要解决的问题。由于这些模型需要在自动驾驶汽车上执行，以便对不断变化的环境做出快速决策，因此它们不仅要可靠地运行，还要快速处理输入图像。在本文中，我们将遗传规划(GP)作为一种元模型进行探索，该模型结合了四种不同的以效率为导向的网络，用于城市场景的分析。值得注意的是，我们提出并研究了两种方法。在第一种方法中，我们将解决方案表示为GP树，该树结合了网络的输出，以便通过相同的元模型获得每个输出类的预测。在第二种方法中，我们建议将解决方案表示为GP树列表，每个树都旨在为给定的目标类提供唯一的元模型。主要目标是开发有效和准确的组合模型，这些模型可以很容易地解释，因此可以收集一些关于如何改进现有网络的提示。在城市场景图像的cityscape数据集上进行了带有语义像素化注释的实验，验证了该方法的有效性。具体来说，与传统集成相比，我们表现最好的模型将系统的泛化能力提高了约5%，对于表现较差的最先进的CNN提高了30%，并且相对于最先进的集成显示出具有竞争力的结果。此外，它们体积小，具有可解释性，并且由于GP的自动特征选择而使用更少的特征。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

摘要图片

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Semantic segmentation network stacking with genetic programming

Abstract Semantic segmentation consists of classifying each pixel of an image and constitutes an essential step towards scene recognition and understanding. Deep convolutional encoder–decoder neural networks now constitute state-of-the-art methods in the field of semantic segmentation. The problem of street scenes’ segmentation for automotive applications constitutes an important application field of such networks and introduces a set of imperative exigencies. Since the models need to be executed on self-driving vehicles to make fast decisions in response to a constantly changing environment, they are not only expected to operate reliably but also to process the input images rapidly. In this paper, we explore genetic programming (GP) as a meta-model that combines four different efficiency-oriented networks for the analysis of urban scenes. Notably, we present and examine two approaches. In the first approach, we represent solutions as GP trees that combine networks’ outputs such that each output class’s prediction is obtained through the same meta-model. In the second approach, we propose representing solutions as lists of GP trees, each designed to provide a unique meta-model for a given target class. The main objective is to develop efficient and accurate combination models that could be easily interpreted, therefore allowing gathering some hints on how to improve the existing networks. The experiments performed on the Cityscapes dataset of urban scene images with semantic pixel-wise annotations confirm the effectiveness of the proposed approach. Specifically, our best-performing models improve systems’ generalization ability by approximately 5% compared to traditional ensembles, 30% for the less performing state-of-the-art CNN and show competitive results with respect to state-of-the-art ensembles. Additionally, they are small in size, allow interpretability, and use fewer features due to GP’s automatic feature selection.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Genetic Programming and Evolvable Machines 工程技术-计算机：理论方法

CiteScore

5.90

自引率

3.80%

发文量

审稿时长

6 months

期刊介绍： A unique source reporting on methods for artificial evolution of programs and machines... Reports innovative and significant progress in automatic evolution of software and hardware. Features both theoretical and application papers. Covers hardware implementations, artificial life, molecular computing and emergent computation techniques. Examines such related topics as evolutionary algorithms with variable-size genomes, alternate methods of program induction, approaches to engineering systems development based on embryology, morphogenesis or other techniques inspired by adaptive natural systems.