数据质量对deeplabv3 +算法水体分割的影响

Anirudh Edpuganti, Pillalamarri Akshaya, Jangala Gouthami, Sajith Variyar, Sowmya, R. Sivanpillai
{"title":"数据质量对deeplabv3 +算法水体分割的影响","authors":"Anirudh Edpuganti, Pillalamarri Akshaya, Jangala Gouthami, Sajith Variyar, Sowmya, R. Sivanpillai","doi":"10.5194/isprs-archives-xlviii-m-3-2023-81-2023","DOIUrl":null,"url":null,"abstract":"Abstract. Training Deep Learning (DL) algorithms for segmenting features require hundreds to thousands of input data and corresponding labels. Generating thousands of input images and labels requires considerable resources and time. Hence, it is common practice to use opensource imagery data and labels available online. Most of these open-source data have little or no metadata describing their quality or suitability making it problematic for training or evaluating DL models. This study evaluated the effect of data quality on training DeepLabV3+, using Sentinel 2 A/B RGB images and labels obtained from Kaggle. We generated subsets of 256 × 256 pixels, and 10% of these images (802) were set aside for testing. First, we trained and validated the DeepLabV3+ model with the remaining images. Second, we removed images with incorrect labels and trained another DeepLabV3+ network. Finally, we trained the third DeepLabV3+ network after removing images with turbid water or with floating vegetation. All three trained models were evaluated with test images and then we calculated accuracy metrics. As the quality of the input images improved, accuracy of the predicted masks generated from the first model increased from 92.8% to 94.3% in the second model. The third model’s accuracy was 96.4%, demonstrating the network’s ability to better learn and predict water bodies when the input data had fewer class variations. Based on the results we recommend assessing the quality of open-source data for incorrect labels and variations in the target class prior to training DeepLabV3+ or any other DL network.\n","PeriodicalId":30634,"journal":{"name":"The International Archives of the Photogrammetry Remote Sensing and Spatial Information Sciences","volume":" ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2023-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"EFFECT OF DATA QUALITY ON WATER BODY SEGMENTATION WITH DEEPLABV3+ ALGORITHM\",\"authors\":\"Anirudh Edpuganti, Pillalamarri Akshaya, Jangala Gouthami, Sajith Variyar, Sowmya, R. Sivanpillai\",\"doi\":\"10.5194/isprs-archives-xlviii-m-3-2023-81-2023\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Abstract. Training Deep Learning (DL) algorithms for segmenting features require hundreds to thousands of input data and corresponding labels. Generating thousands of input images and labels requires considerable resources and time. Hence, it is common practice to use opensource imagery data and labels available online. Most of these open-source data have little or no metadata describing their quality or suitability making it problematic for training or evaluating DL models. This study evaluated the effect of data quality on training DeepLabV3+, using Sentinel 2 A/B RGB images and labels obtained from Kaggle. We generated subsets of 256 × 256 pixels, and 10% of these images (802) were set aside for testing. First, we trained and validated the DeepLabV3+ model with the remaining images. Second, we removed images with incorrect labels and trained another DeepLabV3+ network. Finally, we trained the third DeepLabV3+ network after removing images with turbid water or with floating vegetation. All three trained models were evaluated with test images and then we calculated accuracy metrics. As the quality of the input images improved, accuracy of the predicted masks generated from the first model increased from 92.8% to 94.3% in the second model. The third model’s accuracy was 96.4%, demonstrating the network’s ability to better learn and predict water bodies when the input data had fewer class variations. Based on the results we recommend assessing the quality of open-source data for incorrect labels and variations in the target class prior to training DeepLabV3+ or any other DL network.\\n\",\"PeriodicalId\":30634,\"journal\":{\"name\":\"The International Archives of the Photogrammetry Remote Sensing and Spatial Information Sciences\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-09-05\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"The International Archives of the Photogrammetry Remote Sensing and Spatial Information Sciences\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.5194/isprs-archives-xlviii-m-3-2023-81-2023\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"Social Sciences\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"The International Archives of the Photogrammetry Remote Sensing and Spatial Information Sciences","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.5194/isprs-archives-xlviii-m-3-2023-81-2023","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"Social Sciences","Score":null,"Total":0}
引用次数: 1

摘要

摘要用于分割特征的训练深度学习(DL)算法需要数百到数千个输入数据和相应的标签。生成数千个输入图像和标签需要大量的资源和时间。因此,使用在线可用的开源图像数据和标签是一种常见的做法。这些开源数据中的大多数几乎没有或根本没有描述其质量或适用性的元数据,这使得训练或评估DL模型成为问题。本研究使用从Kaggle获得的Sentinel 2 A/B RGB图像和标签,评估了数据质量对训练DeepLabV3+的影响。我们生成了256个子集 × 256个像素并且这些图像(802)的10%被留出用于测试。首先,我们用剩下的图像对DeepLabV3+模型进行了训练和验证。其次,我们删除了带有错误标签的图像,并训练了另一个DeepLabV3+网络。最后,我们在去除含有浑浊水或漂浮植被的图像后,训练了第三个DeepLabV3+网络。所有三个训练的模型都用测试图像进行了评估,然后我们计算了准确性指标。随着输入图像质量的提高,从第一模型生成的预测掩模的精度从92.8%提高到第二模型中的94.3%。第三个模型的准确率为96.4%,表明当输入数据的类别变化较小时,该网络能够更好地学习和预测水体。根据结果,我们建议在训练DeepLabV3+或任何其他DL网络之前,评估目标类中不正确标签和变化的开源数据的质量。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
EFFECT OF DATA QUALITY ON WATER BODY SEGMENTATION WITH DEEPLABV3+ ALGORITHM
Abstract. Training Deep Learning (DL) algorithms for segmenting features require hundreds to thousands of input data and corresponding labels. Generating thousands of input images and labels requires considerable resources and time. Hence, it is common practice to use opensource imagery data and labels available online. Most of these open-source data have little or no metadata describing their quality or suitability making it problematic for training or evaluating DL models. This study evaluated the effect of data quality on training DeepLabV3+, using Sentinel 2 A/B RGB images and labels obtained from Kaggle. We generated subsets of 256 × 256 pixels, and 10% of these images (802) were set aside for testing. First, we trained and validated the DeepLabV3+ model with the remaining images. Second, we removed images with incorrect labels and trained another DeepLabV3+ network. Finally, we trained the third DeepLabV3+ network after removing images with turbid water or with floating vegetation. All three trained models were evaluated with test images and then we calculated accuracy metrics. As the quality of the input images improved, accuracy of the predicted masks generated from the first model increased from 92.8% to 94.3% in the second model. The third model’s accuracy was 96.4%, demonstrating the network’s ability to better learn and predict water bodies when the input data had fewer class variations. Based on the results we recommend assessing the quality of open-source data for incorrect labels and variations in the target class prior to training DeepLabV3+ or any other DL network.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
CiteScore
1.70
自引率
0.00%
发文量
949
审稿时长
16 weeks
期刊最新文献
EVALUATION OF CONSUMER-GRADE AND SURVEY-GRADE UAV-LIDAR EVALUATING GEOMETRY OF AN INDOOR SCENARIO WITH OCCLUSIONS BASED ON TOTAL STATION MEASUREMENTS OF WALL ELEMENTS INVESTIGATION ON THE USE OF NeRF FOR HERITAGE 3D DENSE RECONSTRUCTION FOR INTERIOR SPACES TERRESTRIAL 3D MAPPING OF FORESTS: GEOREFERENCING CHALLENGES AND SENSORS COMPARISONS SPECTRAL ANALYSIS OF IMAGES OF PLANTS UNDER STRESS USING A CLOSE-RANGE CAMERA
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1