Automatic test suite generation for key-points detection DNNs using many-objective search (experience paper)

Fitash Ul Haq, Donghwan Shin, L. Briand, Thomas Stifter, Jun Wang
{"title":"Automatic test suite generation for key-points detection DNNs using many-objective search (experience paper)","authors":"Fitash Ul Haq, Donghwan Shin, L. Briand, Thomas Stifter, Jun Wang","doi":"10.1145/3460319.3464802","DOIUrl":null,"url":null,"abstract":"Automatically detecting the positions of key-points (e.g., facial key-points or finger key-points) in an image is an essential problem in many applications, such as driver's gaze detection and drowsiness detection in automated driving systems. With the recent advances of Deep Neural Networks (DNNs), Key-Points detection DNNs (KP-DNNs) have been increasingly employed for that purpose. Nevertheless, KP-DNN testing and validation have remained a challenging problem because KP-DNNs predict many independent key-points at the same time---where each individual key-point may be critical in the targeted application---and images can vary a great deal according to many factors. In this paper, we present an approach to automatically generate test data for KP-DNNs using many-objective search. In our experiments, focused on facial key-points detection DNNs developed for an industrial automotive application, we show that our approach can generate test suites to severely mispredict, on average, more than 93% of all key-points. In comparison, random search-based test data generation can only severely mispredict 41% of them. Many of these mispredictions, however, are not avoidable and should not therefore be considered failures. We also empirically compare state-of-the-art, many-objective search algorithms and their variants, tailored for test suite generation. Furthermore, we investigate and demonstrate how to learn specific conditions, based on image characteristics (e.g., head posture and skin color), that lead to severe mispredictions. Such conditions serve as a basis for risk analysis or DNN retraining.","PeriodicalId":188008,"journal":{"name":"Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"17 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"18","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3460319.3464802","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 18

Abstract

Automatically detecting the positions of key-points (e.g., facial key-points or finger key-points) in an image is an essential problem in many applications, such as driver's gaze detection and drowsiness detection in automated driving systems. With the recent advances of Deep Neural Networks (DNNs), Key-Points detection DNNs (KP-DNNs) have been increasingly employed for that purpose. Nevertheless, KP-DNN testing and validation have remained a challenging problem because KP-DNNs predict many independent key-points at the same time---where each individual key-point may be critical in the targeted application---and images can vary a great deal according to many factors. In this paper, we present an approach to automatically generate test data for KP-DNNs using many-objective search. In our experiments, focused on facial key-points detection DNNs developed for an industrial automotive application, we show that our approach can generate test suites to severely mispredict, on average, more than 93% of all key-points. In comparison, random search-based test data generation can only severely mispredict 41% of them. Many of these mispredictions, however, are not avoidable and should not therefore be considered failures. We also empirically compare state-of-the-art, many-objective search algorithms and their variants, tailored for test suite generation. Furthermore, we investigate and demonstrate how to learn specific conditions, based on image characteristics (e.g., head posture and skin color), that lead to severe mispredictions. Such conditions serve as a basis for risk analysis or DNN retraining.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
使用多目标搜索自动生成关键点检测dnn的测试套件(经验论文)
自动检测图像中关键点的位置(例如面部关键点或手指关键点)是许多应用中的关键问题,例如自动驾驶系统中的驾驶员凝视检测和睡意检测。随着深度神经网络(dnn)的最新进展,关键点检测dnn (kp - dnn)已越来越多地用于这一目的。然而,KP-DNN测试和验证仍然是一个具有挑战性的问题,因为KP-DNN同时预测许多独立的关键点——其中每个单独的关键点在目标应用中可能是关键的——并且图像可能根据许多因素而变化很大。在本文中,我们提出了一种使用多目标搜索自动生成kp - dnn测试数据的方法。在我们的实验中,专注于为工业汽车应用开发的面部关键点检测dnn,我们表明我们的方法可以生成严重错误预测的测试套件,平均超过93%的所有关键点。相比之下,基于随机搜索的测试数据生成只能严重错误地预测其中的41%。然而,许多这样的错误预测是不可避免的,因此不应被视为失败。我们还经验地比较了最先进的、多目标的搜索算法及其变体,为测试套件的生成量身定制。此外,我们研究并演示了如何根据图像特征(例如,头部姿势和肤色)学习导致严重错误预测的特定条件。这些条件可作为风险分析或DNN再培训的基础。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Semantic table structure identification in spreadsheets Parema: an unpacking framework for demystifying VM-based Android packers TERA: optimizing stochastic regression tests in machine learning projects Empirically evaluating readily available information for regression test optimization in continuous integration RESTest: automated black-box testing of RESTful web APIs
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1