Background: This study aimed to demonstrate the feasibility of using computer vision (CV) to unobtrusively extract body motion metrics from videos of emergency medicine (EM) clinicians performing point-of-care ultrasound (POCUS), and to gather validity evidence that these metrics differentiate POCUS skill between novices and experts and capture skill gained over time.
Methods: This prospective cohort study included novice and expert EM clinicians performing echocardiogram (ECHO) and focused assessment with sonography for trauma (FAST) exams on a live simulated patient. Expert observers provided objective structured clinical examination (OSCE) scores (numerical ratings on a scale from 1 to 100), and the sonographers' hand and head motion metrics (path length, speed, acceleration, jerk, and smoothness) were extracted via CV from 2-dimensional videos. Data were captured at baseline for both groups, and again for novices after 12-15 months of residency training.
Results: CV achieved high detection rates (99.52% ECHO, 98.70% FAST). At baseline, experts demonstrated superior OSCE scores (ECHO: 98.6 ± 2.1 vs 63.4 ± 17.0; FAST: 99.2 ± 1.5 vs 68.9 ± 17.7, p < 0.001) and faster task completion (101.8 ± 44.7 vs 240.3 ± 84.1 s, p < 0.001). Experts exhibited smoother hand movements (left hand smoothness: -129.3 ± 47.6 vs -241.3 ± 64.6, p < 0.001) and reduced total path lengths. After 12-15 months of training, novices showed significant improvements in OSCE scores (ECHO: 85.3 ± 10.3; FAST: 84.8 ± 6.5) and task efficiency (134.0 ± 35.6 s), with improvements in motion smoothness and reduced path lengths (p < 0.001). Motion metrics strongly correlated with OSCE scores (r = 0.455-0.783) and task completion time (r = 0.491-0.951).
Conclusions: CV successfully extracted objective motion metrics that differentiated POCUS skill levels between novices and experts and captured skill development over time. This approach offers a scalable, unobtrusive method for objective POCUS assessment and supports competency-based medical education frameworks.
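
Illustrative sketch: the motion metrics named in the Methods (path length, speed, acceleration, jerk, and smoothness) could be derived from a tracked 2-dimensional keypoint trajectory roughly as follows. This is a minimal sketch under stated assumptions, not the study's pipeline: the function name `motion_metrics`, the forward finite-difference scheme, and the input format (one (x, y) position per video frame, in pixels) are all hypothetical. The abstract does not define its smoothness measure; the negative dimensionless jerk used here is one common choice and is assumed only for illustration.

```python
import math


def _diff(points, dt):
    """Forward finite difference of a sequence of (x, y) samples."""
    return [((x1 - x0) / dt, (y1 - y0) / dt)
            for (x0, y0), (x1, y1) in zip(points, points[1:])]


def _mean_norm(vectors):
    """Mean Euclidean magnitude of a list of 2D vectors (0.0 if empty)."""
    if not vectors:
        return 0.0
    return sum(math.hypot(x, y) for x, y in vectors) / len(vectors)


def motion_metrics(xy, fps):
    """Summarize one keypoint trajectory (hypothetical helper).

    xy  : list of (x, y) positions, one per frame (pixels unless calibrated)
    fps : video frame rate in frames per second
    """
    dt = 1.0 / fps
    velocity = _diff(xy, dt)            # first derivative of position
    acceleration = _diff(velocity, dt)  # second derivative
    jerk = _diff(acceleration, dt)      # third derivative

    # Path length: sum of frame-to-frame displacements.
    path_length = sum(math.hypot(vx, vy) for vx, vy in velocity) * dt
    duration = (len(xy) - 1) * dt
    peak_speed = max((math.hypot(vx, vy) for vx, vy in velocity), default=0.0)

    # ASSUMPTION: smoothness as negative dimensionless jerk,
    # -(T^3 / v_peak^2) * integral(|jerk|^2 dt); values closer to 0 are smoother.
    jerk_sq_integral = sum(jx * jx + jy * jy for jx, jy in jerk) * dt
    if peak_speed > 0:
        smoothness = -jerk_sq_integral * duration ** 3 / peak_speed ** 2
    else:
        smoothness = 0.0

    return {
        "path_length": path_length,
        "mean_speed": _mean_norm(velocity),
        "mean_acceleration": _mean_norm(acceleration),
        "mean_jerk": _mean_norm(jerk),
        "smoothness": smoothness,
    }
```

For example, `motion_metrics(left_wrist_xy, fps=30)` would summarize one hand's trajectory, where `left_wrist_xy` is a hypothetical list of per-frame keypoint positions from any pose-estimation tool. Finite differences amplify tracking noise, so a real pipeline would likely smooth the trajectory before differentiating.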

