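Data augmentation using conditional generative adversarial network (cGAN): applications for sewer condition classification and testing using different machine learning techniques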

Journal of Hydroinformatics (IF 2.2; Zone 3, Engineering & Technology; JCR Q3, COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS) Pub Date: 2024-06-13 DOI: 10.2166/hydro.2024.135
Haile Woldesellasse, Solomon Tesfamariam
Citations: 0

Data augmentation using conditional generative adversarial network (cGAN): applications for sewer condition classification and testing using different machine learning techniques
The increasing availability of condition assessment data highlights the challenge of managing data imbalance in the asset management of aging infrastructure. Aging sewer pipes pose significant threats to health and the environment, underscoring the importance of proactive management practices to enhance asset maintenance and mitigate associated risks. While machine learning (ML) models are widely employed to model the complex deterioration process of sewer pipes, they face performance limitations when trained on imbalanced condition grade data. This paper addresses this issue by proposing a novel approach using a conditional generative adversarial network (cGAN) for data augmentation. By generating synthetic data for minority classes, the skewed distribution of the sewer dataset is balanced, facilitating more robust and accurate predictive models. The utility of the proposed method is evaluated by training different ML classifiers, including a neural network (NN), decision tree, quadratic discriminant analysis, Naïve Bayes, support vector machine (SVM), and K-nearest neighbor. The quadratic discriminant, Naïve Bayes, NN, and SVM classifiers demonstrated improvement. The cGAN-based data augmentation method also outperformed two other techniques for handling data imbalance: random under-sampling and a cost-sensitive NN. Consequently, data generated by cGAN can effectively aid asset management by developing proactive classifiers that accurately predict pipes at a high risk of failure.
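To make the imbalance-handling ideas in the abstract concrete, here is a minimal Python sketch. It is not the authors' implementation and contains no cGAN. It only shows (a) how many synthetic samples per class a generator would need to produce to balance a dataset, (b) the random under-sampling baseline, and (c) one common way to derive class weights for cost-sensitive training. The function names, the five-grade label scheme, the class counts, and the inverse-frequency weighting formula are all assumptions made for illustration; the abstract does not specify any of them.

```python
import random
from collections import Counter


def synthetic_budget(labels):
    """Synthetic samples each class needs to match the largest class."""
    counts = Counter(labels)
    target = max(counts.values())
    return {cls: target - n for cls, n in counts.items()}


def random_undersample(samples, labels, seed=0):
    """Randomly drop samples so every class shrinks to the smallest class size."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = min(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        for x in rng.sample(xs, target):
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


def inverse_frequency_weights(labels):
    """Class weights n_samples / (n_classes * class_count).

    Assumption: a common heuristic for cost-sensitive training,
    not necessarily the weighting used in the paper.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}


# Hypothetical condition grades (1 = best, 5 = worst); counts are invented.
labels = [1] * 60 + [2] * 25 + [3] * 10 + [4] * 4 + [5] * 1
samples = list(range(len(labels)))  # stand-ins for pipe feature vectors

budget = synthetic_budget(labels)
under_x, under_y = random_undersample(samples, labels)
weights = inverse_frequency_weights(labels)

print("synthetic samples needed per grade:", budget)
print("samples kept after under-sampling:", len(under_x))
print("cost-sensitive class weights:", weights)
```

With counts this skewed, under-sampling keeps only one sample per grade, which illustrates why discarding majority-class data can be costly and why generating minority-class samples is an attractive alternative.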
Source journal
Journal of Hydroinformatics (Engineering & Technology: Engineering, Civil)
CiteScore: 4.80
Self-citation rate: 3.70%
Articles published: 59
Review time: 3 months
About the journal: Journal of Hydroinformatics is a peer-reviewed journal devoted to the application of information technology in the widest sense to problems of the aquatic environment. It promotes Hydroinformatics as a cross-disciplinary field of study, combining technological, human-sociological and more general environmental interests, including an ethical perspective.
Latest articles in this journal
Sensitivity of model-based leakage localisation in water distribution networks to water demand sampling rates and spatio-temporal data gaps
Efficient functioning of a sewer system: application of novel hybrid machine learning methods for the prediction of particle Froude number
Quantile mapping technique for enhancing satellite-derived precipitation data in hydrological modelling: a case study of the Lam River Basin, Vietnam
Development and application of a hybrid artificial neural network model for simulating future stream flows in catchments with limited in situ observed data
Formation of meandering streams in a young floodplain within the Yarlung Tsangpo Grand Canyon in the Tibetan Plateau