Computer vision models can make systematic errors, performing well on average but substantially worse on particular subsets (or slices) of data. In this work, we introduce Visual Concept Reviewer (VCR), a human-in-the-loop slice discovery framework that enables practitioners to interactively discover and understand systematic errors in object-detection models through a novel use of visual concepts: semantically meaningful, frequently recurring image segments that represent objects, parts, or abstract properties.
Leveraging recent advances in vision foundation models, VCR automatically generates segment-level visual concepts that serve as interpretable primitives for diagnosing issues in object-detection models, while also supporting lightweight human supervision when needed. VCR combines visual concepts with metadata in a tabular format and adapts frequent itemset mining techniques to identify, at interactive speeds, common patterns of concept presence and absence associated with poor model performance. VCR also keeps humans in the loop for interpretation and refinement at each step of the slice discovery process. We demonstrate VCR’s effectiveness and scalability through a new evaluation benchmark with 1,713 slice discovery settings across three datasets. A user study with six expert industry machine learning scientists and engineers provides qualitative evidence of VCR’s utility in real-world workflows.
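To make the tabular mining step concrete, the following is a minimal sketch, not the authors' implementation, of slice discovery over a boolean concept table. It uses mlxtend's `apriori` as a stand-in for VCR's adapted itemset miner, and the DataFrame layout, column names, and thresholds are illustrative assumptions.

```python
# Minimal sketch of concept-based slice discovery (assumptions: a pandas
# DataFrame with boolean per-image concept columns plus a boolean "is_error"
# column; mlxtend's apriori stands in for VCR's adapted miner).
import pandas as pd
from mlxtend.frequent_patterns import apriori

def discover_slices(df: pd.DataFrame, error_col: str = "is_error",
                    min_support: float = 0.05, min_lift: float = 1.5):
    concepts = df.drop(columns=[error_col])
    # Encode both presence and absence of each concept so mined itemsets can
    # describe slices such as {road present, pedestrian absent}.
    items = pd.concat(
        [concepts.add_prefix("has:"), (~concepts).add_prefix("no:")], axis=1
    )
    # Mine frequent itemsets over the presence/absence columns.
    itemsets = apriori(items, min_support=min_support, use_colnames=True)

    base_error = df[error_col].mean()
    slices = []
    for _, row in itemsets.iterrows():
        cols = list(row["itemsets"])
        mask = items[cols].all(axis=1)            # images matching the slice
        slice_error = df.loc[mask, error_col].mean()
        if slice_error > min_lift * base_error:   # flag underperforming slices
            slices.append((cols, int(mask.sum()), slice_error))
    # Rank candidate slices by error rate for human review.
    return sorted(slices, key=lambda s: -s[2])
```

Encoding both `has:` and `no:` columns is what lets the miner surface absence-based slices alongside presence-based ones, matching the abstract's description of "common patterns of concept presence and absence."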
