Pub Date: 2024-11-04
DOI: 10.1109/TAI.2024.3489532
Lutong Qin;Lei Zhang;Chengrun Li;Chaoda Song;Dongzhou Cheng;Shuoyuan Wang;Hao Wu;Aiguo Song
Recently, deep neural networks have succeeded in a wide variety of human activity recognition (HAR) applications on resource-constrained mobile devices. However, most existing works are static and ignore the fact that the computational budget usually changes drastically across devices, which hinders real-world HAR deployment. A major challenge remains: how to adaptively and instantly trade off accuracy and latency at runtime for on-device activity inference using time series sensor data? To address this issue, this article introduces a new collaborative learning scheme that trains a set of subnetworks executed at varying network widths and fed with different sensor input resolutions as data augmentation. The resulting model can switch on the fly among width-resolution configurations for flexible and dynamic activity inference under varying resource budgets. In particular, it boosts performance by using self-distillation to transfer the unique knowledge among multiple width-resolution configurations, which helps capture stronger feature representations for activity recognition. Extensive experiments and ablation studies on three public HAR benchmark datasets validate the effectiveness and efficiency of our approach. A real implementation is evaluated on a mobile device. This approach makes it possible to directly access the accuracy-latency spectrum of deep learning models in versatile real-world HAR deployments. Code is available at https://github.com/Lutong-Qin/Collaborative_HAR
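The abstract does not spell out the training objective, so the following is only a minimal sketch of one common way to combine width-resolution configurations with self-distillation. It assumes that the full-width, full-resolution configuration is trained on the ground-truth label and that every smaller configuration is trained to match its softened predictions. All function names, the (width, stride) configuration keys, and the toy logits are hypothetical and are not taken from the authors' code.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    """Cross-entropy between predicted logits and an integer class label."""
    return -math.log(softmax(logits)[label])

def kl_divergence(teacher_probs, student_logits):
    """KL(teacher || student), with the teacher given as probabilities."""
    student_probs = softmax(student_logits)
    return sum(p * math.log(p / q)
               for p, q in zip(teacher_probs, student_probs) if p > 0)

def downsample(window, stride):
    """Lower the sensor input resolution by keeping every `stride`-th sample."""
    return window[::stride]

def collaborative_loss(logits_by_config, label, full_config):
    """Assumed objective: the full configuration learns from the label,
    every other configuration learns from the full configuration's output."""
    teacher_logits = logits_by_config[full_config]
    teacher_probs = softmax(teacher_logits)  # treated as fixed targets
    loss = cross_entropy(teacher_logits, label)
    for config, logits in logits_by_config.items():
        if config != full_config:
            loss += kl_divergence(teacher_probs, logits)
    return loss

# Toy example: keys are hypothetical (width multiplier, input stride) pairs.
toy_logits = {
    (1.0, 1): [2.0, 0.5, -1.0],   # full width, full resolution
    (0.5, 2): [1.5, 0.7, -0.5],   # half width, half resolution
}
total = collaborative_loss(toy_logits, label=0, full_config=(1.0, 1))
```

In a real training loop the logits would come from forward passes of shared-weight subnetworks on inputs produced by something like `downsample`, and the summed loss would be backpropagated once per batch. At inference time only one configuration would be run, chosen according to the current resource budget.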