Test and Evaluation Harnesses for Learning Systems

2022 IEEE AUTOTESTCON. Pub Date: 2022-08-29. DOI: 10.1109/AUTOTESTCON47462.2022.9984783

Tyler Cody, P. Beling, Laura Freeman

Abstract

Demand for operational uses of machine learning (ML) is increasing; however, a lack of best practices for test and evaluation (T&E) of learning systems hinders supply. This manuscript proposes a new framework for best practices, described as T&E harnesses, that corresponds principally to the task of engineering a learning system, in contrast to the status quo task of solving a learning problem. The primary difference is one of scope: this manuscript places T&E for ML into the broader scope of systems engineering processes. Importantly, two challenge problems, acquisition and operations, are used to motivate the use of T&E harnesses for learning systems. This manuscript draws from recent findings in experimental design for ML, combinatorial interaction testing of ML solutions, and the general systems modeling of ML. The concept of T&E harnesses is closely tied to existing models of systems engineering processes. We conclude that existing best practices for T&E form a subset of what is needed to rigorously test for system-level satisfaction of stakeholder needs.
