Uncertainty Quantification for Trusted Machine Learning in Space System Cyber Security
Douglas Woodward, M. Hobbs, James Andrew Gilbertson, N. Cohen
2021 IEEE 8th International Conference on Space Mission Challenges for Information Technology (SMC-IT), July 2021
DOI: 10.1109/SMC-IT51442.2021.00012
Citations: 2
Abstract
In recent years, The Aerospace Corporation has been developing machine learning systems to detect cyber anomalies in space system command and telemetry streams. However, to enable the use of deep learning in such high-consequence environments, the models must be trustworthy. One aspect of trust is a model’s ability to accurately quantify the uncertainty of its predictions. Although many deep learning models output what appear to be confidence scores, academic research has repeatedly shown that models often report high confidence even when they are very wrong and are unable to diagnose and respond appropriately to out-of-distribution inputs. This can result in catastrophic overconfidence when models are faced with adversarial inputs or concept drift. Even on routine inputs, human-machine teaming is difficult without reliable uncertainty quantification, because humans cannot trust the model’s reported confidence score. In short, all models are wrong sometimes, but models that know when they are wrong are considerably more useful. To this end, The Aerospace Corporation conducted a literature review and implemented current state-of-the-art methods, including deep ensembles and temperature scaling for confidence calibration, to accurately quantify the uncertainty of deep learning model predictions. We further incorporated and tested these techniques within the existing cyber defense model framework to build more trustworthy cyber anomaly detection models. We show that these techniques are not only successful but also easy to implement, extensible to many applications and machine learning model variants, and able to provide interpretable results for a wide audience. Based on these results, Aerospace recommends further adoption of such techniques in high-consequence environments.
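
To illustrate the two techniques named in the abstract, the sketch below shows temperature scaling fit on held-out validation logits and a deep ensemble whose calibrated softmax outputs are averaged, with predictive entropy used as an uncertainty score. This is not the authors' implementation; the PyTorch framing, the shared single temperature across ensemble members, and the `val_logits`/`val_labels` tensors are illustrative assumptions.

```python
# Minimal sketch of temperature scaling + deep ensembles (assumed PyTorch setup,
# not the paper's code). In practice each ensemble member may be calibrated
# separately; a single shared temperature is used here for brevity.
import torch
import torch.nn.functional as F


def fit_temperature(val_logits: torch.Tensor, val_labels: torch.Tensor) -> torch.Tensor:
    """Learn a scalar temperature T > 0 minimizing NLL on held-out data."""
    log_t = torch.zeros(1, requires_grad=True)  # optimize log(T) so T stays positive
    optimizer = torch.optim.LBFGS([log_t], lr=0.1, max_iter=50)

    def closure():
        optimizer.zero_grad()
        loss = F.cross_entropy(val_logits / log_t.exp(), val_labels)
        loss.backward()
        return loss

    optimizer.step(closure)
    return log_t.exp().detach()


def ensemble_predict(models, x: torch.Tensor, temperature: torch.Tensor):
    """Average calibrated softmax outputs over the ensemble and return the
    mean class probabilities plus predictive entropy as an uncertainty score."""
    with torch.no_grad():
        probs = torch.stack(
            [F.softmax(m(x) / temperature, dim=-1) for m in models]
        ).mean(dim=0)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)
    return probs, entropy
```

In an anomaly detection pipeline of the kind described, the predictive entropy could be thresholded so that high-uncertainty inputs (e.g., out-of-distribution telemetry) are flagged for human review rather than acted on automatically; the specific threshold and integration point are deployment choices, not details given in the abstract.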