CIM: A Novel Clustering-based Energy-Efficient Data Imputation Method for Human Activity Recognition
Authors: Dina Hussein, Ganapati Bhat
Journal: ACM Transactions on Embedded Computing Systems (impact factor 2.8; JCR Q2, Computer Science, Hardware & Architecture)
Published: 2023-09-09
DOI: https://doi.org/10.1145/3609111
Citation count: 0
Abstract
Human activity recognition (HAR) is an important component in a number of health applications, including rehabilitation, Parkinson’s disease, daily activity monitoring, and fitness monitoring. State-of-the-art HAR approaches use multiple sensors on the body to accurately identify activities at runtime. These approaches typically assume that data from all sensors are available for runtime activity recognition. However, data from one or more sensors may be unavailable due to malfunction, energy constraints, or communication challenges between the sensors. Missing data can significantly degrade accuracy, thus affecting quality of service to users. A common approach for handling missing data is to train classifiers or sensor data recovery algorithms for each combination of missing sensors. However, this results in significant memory and energy overhead on resource-constrained wearable devices. In contrast to prior approaches, this paper presents a clustering-based approach (CIM) to impute missing data at runtime. We first define a set of possible clusters and representative data patterns for each sensor in HAR. Then, we create and store a mapping between clusters across sensors. At runtime, when data from a sensor are missing, we use the stored mapping table to obtain the most likely cluster for the missing sensor. The representative window for the identified cluster is then used as the imputed data to perform activity classification. We also provide a method to obtain imputation-aware activity prediction sets to handle uncertainty in data when using imputation. Experiments on three HAR datasets show that, with one missing sensor, CIM achieves accuracy within 10% of a baseline with no missing data when providing single activity labels. The accuracy gap drops to less than 1% with imputation-aware classification. Measurements on a low-power processor show that CIM achieves close to 100% energy savings compared to state-of-the-art generative approaches.
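The cluster-and-lookup idea described in the abstract can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm or say how the mapping table is built, so everything below is an assumption made for illustration: plain k-means with a deterministic farthest-point initialization, a co-occurrence count table between two sensors, and all function names and toy data are hypothetical rather than the authors' implementation.

```python
from collections import Counter, defaultdict

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length windows."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(centroids, window):
    """Index of the centroid closest to the given window."""
    return min(range(len(centroids)), key=lambda k: sq_dist(centroids[k], window))

def kmeans(windows, k, iters=10):
    """Assumed clustering step: k-means with farthest-point initialization.
    The returned centroids play the role of 'representative windows'."""
    centroids = [list(windows[0])]
    while len(centroids) < k:
        far = max(windows, key=lambda w: min(sq_dist(c, w) for c in centroids))
        centroids.append(list(far))
    for _ in range(iters):
        groups = defaultdict(list)
        for w in windows:
            groups[nearest(centroids, w)].append(w)
        for c, members in groups.items():
            centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def build_mapping(centroids_a, centroids_b, paired_windows):
    """Offline: for each cluster of sensor A, count which clusters of
    sensor B were observed at the same time."""
    table = defaultdict(Counter)
    for wa, wb in paired_windows:
        table[nearest(centroids_a, wa)][nearest(centroids_b, wb)] += 1
    return table

def impute(table, centroids_a, centroids_b, window_a):
    """Runtime: sensor B is missing, so look up its most likely cluster
    from sensor A's cluster and return that cluster's representative window."""
    counts = table.get(nearest(centroids_a, window_a))
    if not counts:
        # Cluster never seen during training: fall back to overall counts.
        counts = sum(table.values(), Counter())
    return centroids_b[counts.most_common(1)[0][0]]

# Hypothetical toy data: two activities observed by two sensors.
paired = []
for i in range(10):
    e = 0.01 * i
    paired.append(([1 + e, 1 - e, 1 + e], [5 + e, 5, 5 - e]))  # activity 1
    paired.append(([0 + e, 0 - e, 0], [-3 - e, -3, -3 + e]))   # activity 2

centroids_a = kmeans([wa for wa, _ in paired], k=2)
centroids_b = kmeans([wb for _, wb in paired], k=2)
table = build_mapping(centroids_a, centroids_b, paired)

# Sensor B is unavailable; only sensor A's current window is known.
imputed_act1 = impute(table, centroids_a, centroids_b, [0.98, 1.02, 1.0])
imputed_act2 = impute(table, centroids_a, centroids_b, [0.02, 0.0, -0.01])
print(imputed_act1, imputed_act2)
```

In this sketch the runtime cost is one nearest-centroid search plus a table lookup, which is consistent with the abstract's motivation of avoiding a separate classifier or recovery model for each combination of missing sensors. The imputed window would then be passed to the activity classifier in place of the missing sensor's data.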
About the journal:
The design of embedded computing systems, both the software and hardware, increasingly relies on sophisticated algorithms, analytical models, and methodologies. ACM Transactions on Embedded Computing Systems (TECS) aims to present the leading work relating to the analysis, design, behavior, and experience with embedded computing systems.