Human activity recognition (HAR) has emerged as a prominent research field in recent years. Existing HAR models can only model bilateral correlations between pairs of sensing devices for feature extraction. However, for some activities, exploiting correlations among more than two sensing devices, which we call hyper-correlations in this paper, is essential for extracting discriminative features. In this work, we propose a novel HyperHAR framework that automatically models both bilateral correlations and hyper-correlations among sensing devices. HyperHAR consists of three modules. The Intra-sensing Device Feature Extraction Module generates a latent representation of each sensing device's data, based on which the Inter-sensing Device Multi-order Correlations Learning Module simultaneously learns both bilateral correlations and hyper-correlations. Lastly, the Information Aggregation Module generates a representation for an individual sensing device by aggregating the bilateral correlations and hyper-correlations it is involved in. It also generates a representation for a pair of sensing devices by aggregating the hyper-correlations between the pair and other individual sensing devices. We further propose HyperHAR-Lite, a computationally more efficient, lightweight variant of HyperHAR, at a small cost in accuracy. Both HyperHAR and HyperHAR-Lite outperform state-of-the-art (SOTA) models by significant margins across three commonly used benchmark datasets. We validate the efficiency and effectiveness of the proposed frameworks through an ablation study and quantitative and qualitative analyses.
{"title":"HyperHAR: Inter-sensing Device Bilateral Correlations and Hyper-correlations Learning Approach for Wearable Sensing Device Based Human Activity Recognition","authors":"Nafees Ahmad, Ho-fung Leung","doi":"10.1145/3643511","DOIUrl":"https://doi.org/10.1145/3643511","url":null,"abstract":"Human activity recognition (HAR) has emerged as a prominent research field in recent years. Current HAR models are only able to model bilateral correlations between two sensing devices for feature extraction. However, for some activities, exploiting correlations among more than two sensing devices, which we call hyper-correlations in this paper, is essential for extracting discriminatory features. In this work, we propose a novel HyperHAR framework that automatically models both bilateral and hyper-correlations among sensing devices. The HyperHAR consists of three modules. The Intra-sensing Device Feature Extraction Module generates latent representation across the data of each sensing device, based on which the Inter-sensing Device Multi-order Correlations Learning Module simultaneously learns both bilateral correlations and hyper-correlations. Lastly, the Information Aggregation Module generates a representation for an individual sensing device by aggregating the bilateral correlations and hyper-correlations it involves in. It also generates the representation for a pair of sensing devices by aggregating the hyper-correlations between the pair and other different individual sensing devices. We also propose a computationally more efficient HyperHAR-Lite framework, a lightweight variant of the HyperHAR framework, at a small cost of accuracy. Both the HyperHAR and HyperHAR-Lite outperform SOTA models across three commonly used benchmark datasets with significant margins. We validate the efficiency and effectiveness of the proposed frameworks through an ablation study and quantitative and qualitative analysis.","PeriodicalId":20463,"journal":{"name":"Proc. ACM Interact. Mob. Wearable Ubiquitous Technol.","volume":"39 11","pages":"1:1-1:29"},"PeriodicalIF":0.0,"publicationDate":"2024-03-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140261817","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Federated Learning (FL) enables distributed training of human sensing models in a privacy-preserving manner. While promising, federated global models suffer from cross-domain accuracy degradation when the labeled source domains statistically differ from the unlabeled target domain. To tackle this problem, recent methods perform pairwise computation on the source and target domains to minimize the domain discrepancy through an adversarial strategy. However, these methods are limited by the fact that pairwise source-target adversarial alignment alone only achieves domain-level alignment, which entails the alignment of domain-invariant as well as environment-dependent features. The misalignment of environment-dependent features may negatively impact the performance of the federated global model. In this paper, we introduce FDAS, a Federated adversarial Domain Adaptation with Semantic Knowledge Correction method. FDAS achieves concurrent alignment at both the domain and semantic levels to improve the semantic quality of the aligned features, thereby reducing the misalignment of environment-dependent features. Moreover, we design a cross-domain semantic similarity metric and further devise feature selection and feature refinement mechanisms to enhance the two-level alignment. In addition, we propose a similarity-aware model fine-tuning strategy to further improve target model performance. We evaluate FDAS extensively on four public datasets and a real-world human sensing dataset. Extensive experiments demonstrate the superior effectiveness of FDAS and its potential in real-world ubiquitous computing scenarios.
{"title":"Privacy-Preserving and Cross-Domain Human Sensing by Federated Domain Adaptation with Semantic Knowledge Correction","authors":"Kaijie Gong, Yi Gao, Wei Dong","doi":"10.1145/3643503","DOIUrl":"https://doi.org/10.1145/3643503","url":null,"abstract":"Federated Learning (FL) enables distributed training of human sensing models in a privacy-preserving manner. While promising, federated global models suffer from cross-domain accuracy degradation when the labeled source domains statistically differ from the unlabeled target domain. To tackle this problem, recent methods perform pairwise computation on the source and target domains to minimize the domain discrepancy by adversarial strategy. However, these methods are limited by the fact that pairwise source-target adversarial alignment alone only achieves domain-level alignment, which entails the alignment of domain-invariant as well as environment-dependent features. The misalignment of environment-dependent features may cause negative impact on the performance of the federated global model. In this paper, we introduce FDAS, a Federated adversarial Domain Adaptation with Semantic Knowledge Correction method. FDAS achieves concurrent alignment at both domain and semantic levels to improve the semantic quality of the aligned features, thereby reducing the misalignment of environment-dependent features. Moreover, we design a cross-domain semantic similarity metric and further devise feature selection and feature refinement mechanisms to enhance the two-level alignment. In addition, we propose a similarity-aware model fine-tuning strategy to further improve the target model performance. We evaluate the performance of FDAS extensively on four public and a real-world human sensing datasets. Extensive experiments demonstrate the superior effectiveness of FDAS and its potential in the real-world ubiquitous computing scenarios.","PeriodicalId":20463,"journal":{"name":"Proc. ACM Interact. Mob. Wearable Ubiquitous Technol.","volume":"31 2","pages":"6:1-6:26"},"PeriodicalIF":0.0,"publicationDate":"2024-03-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140262659","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Shuning Wang, Linghui Zhong, Yongjian Fu, Lili Chen, Ju Ren, Yaoxue Zhang
Facial expression recognition (FER) is a crucial task for human-computer interaction and a multitude of multimedia applications that typically call for friendly, unobtrusive, ubiquitous, and even long-term monitoring. Building a FER system that meets these requirements faces critical challenges, mainly the tiny, irregular, non-periodic deformations of expression movements, high variability in facial position, and severe self-interference caused by users' other behaviors. In this work, we present UFace, a long-term, unobtrusive, and reliable FER system for daily life that uses acoustic signals generated by a portable smartphone. We design an innovative network model with dual-stream input based on the attention mechanism, which leverages distance-time profile features from various viewpoints to extract fine-grained emotion-related signal changes, thus enabling accurate identification of many kinds of expressions. Meanwhile, we propose effective mechanisms to deal with a series of interference issues that arise during actual use. We implement a UFace prototype on a daily-used smartphone and conduct extensive experiments in various real-world environments. The results demonstrate that UFace can successfully recognize 7 typical facial expressions with an average accuracy of 87.8% across 20 participants. Moreover, evaluations across different distances, angles, and interference conditions demonstrate the great potential of the proposed system for practical deployment.
{"title":"UFace: Your Smartphone Can \"Hear\" Your Facial Expression!","authors":"Shuning Wang, Linghui Zhong, Yongjian Fu, Lili Chen, Ju Ren, Yaoxue Zhang","doi":"10.1145/3643546","DOIUrl":"https://doi.org/10.1145/3643546","url":null,"abstract":"Facial expression recognition (FER) is a crucial task for human-computer interaction and a multitude of multimedia applications that typically call for friendly, unobtrusive, ubiquitous, and even long-term monitoring. Achieving such a FER system meeting these multi-requirements faces critical challenges, mainly including the tiny irregular non-periodic deformation of emotion movements, high variability in facial positions and severe self-interference caused by users' own other behavior. In this work, we present UFace, a long-term, unobtrusive and reliable FER system for daily life using acoustic signals generated by a portable smartphone. We design an innovative network model with dual-stream input based on the attention mechanism, which can leverage distance-time profile features from various viewpoints to extract fine-grained emotion-related signal changes, thus enabling accurate identification of many kinds of expressions. Meanwhile, we propose effective mechanisms to deal with a series of interference issues during actual use. We implement UFace prototype with a daily-used smartphone and conduct extensive experiments in various real-world environments. The results demonstrate that UFace can successfully recognize 7 typical facial expressions with an average accuracy of 87.8% across 20 participants. Besides, the evaluation of different distances, angles, and interferences proves the great potential of the proposed system to be employed in practical scenarios.","PeriodicalId":20463,"journal":{"name":"Proc. ACM Interact. Mob. Wearable Ubiquitous Technol.","volume":"32 20","pages":"22:1-22:27"},"PeriodicalIF":0.0,"publicationDate":"2024-03-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140262848","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chongzhi Xu, Xiaolong Zheng, Z. Ren, Liang Liu, Huadong Ma
The focus of advanced driver-assistance systems (ADAS) is extending from vehicle and road conditions to the driver, because the driver's attention is critical to driving safety. Although existing sensor- and camera-based methods can monitor driver attention, they rely on specialized hardware and environmental conditions. In this paper, we aim to develop an effective and easy-to-use driver attention monitoring system based on UWB radar. We exploit the strong association between head motion and driver attention and propose UHead, which infers driver attention by monitoring the direction and angle of the driver's head rotation. The core idea is to extract rotational time-frequency representations from reflected signals and to estimate head rotation angles from complex head reflections. To eliminate the dynamic noise generated by other body parts, UHead leverages the large magnitude and high velocity of head rotation to extract head motion information from the dynamically coupled signals. UHead uses a bilinear joint time-frequency representation to avoid the loss of time and frequency resolution caused by the windowing used in traditional methods. We also design a head structure-based rotation angle estimation algorithm to accurately estimate the rotation angle from the time-varying rotation information of multiple reflection points on the head. Experimental results show that UHead achieves a median 3D head rotation angle estimation error of 12.96° in real vehicle scenes.
{"title":"UHead: Driver Attention Monitoring System Using UWB Radar","authors":"Chongzhi Xu, Xiaolong Zheng, Z. Ren, Liang Liu, Huadong Ma","doi":"10.1145/3643551","DOIUrl":"https://doi.org/10.1145/3643551","url":null,"abstract":"The focus of Advanced driver-assistance systems (ADAS) is extending from the vehicle and road conditions to the driver because the driver's attention is critical to driving safety. Although existing sensor and camera based methods can monitor driver attention, they rely on specialised hardware and environmental conditions. In this paper, we aim to develop an effective and easy-to-use driver attention monitoring system based on UWB radar. We exploit the strong association between head motions and driver attention and propose UHead that infers driver attention by monitoring the direction and angle of the driver's head rotation. The core idea is to extract rotational time-frequency representation from reflected signals and to estimate head rotation angles from complex head reflections. To eliminate the dynamic noise generated by other body parts, UHead leverages the large magnitude and high velocity of head rotation to extract head motion information from the dynamically coupled information. UHead uses a bilinear joint time-frequency representation to avoid the loss of time and frequency resolution caused by windowing of traditional methods. We also design a head structure-based rotation angle estimation algorithm to accurately estimate the rotation angle from the time-varying rotation information of multiple reflection points in the head. Experimental results show that we achieve 12.96° median error of 3D head rotation angle estimation in real vehicle scenes.","PeriodicalId":20463,"journal":{"name":"Proc. ACM Interact. Mob. Wearable Ubiquitous Technol.","volume":"1 3","pages":"25:1-25:28"},"PeriodicalIF":0.0,"publicationDate":"2024-03-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140260960","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Reducing the environmental footprint of electronics and computing devices requires new tools that empower designers to make informed decisions about sustainability during the design process itself. This is not possible with current tools for life cycle assessment (LCA), which require substantial domain expertise and time to evaluate the numerous chips and other components that make up a device. We observe, first, that informed decision-making does not require absolute metrics and can instead be done by comparing designs. Second, we can use domain-specific heuristics to perform these comparisons. We combine these insights to develop DeltaLCA, an open-source interactive design tool that addresses the dual challenges of automating life cycle inventory generation and data availability by performing comparative analyses of electronics designs. Users can upload standard design files from Electronic Design Automation (EDA) software, and the tool guides them through determining which design has the greater carbon footprint. DeltaLCA leverages electronics-specific LCA datasets and heuristics and tries to automatically rank the two designs, prompting users to provide additional information only when necessary. We show through case studies that DeltaLCA reaches the same conclusions as evaluating full LCAs, and that it accelerates LCA comparisons from eight expert-hours to a single click for devices with ~30 components, and to 15 minutes for more complex devices with ~100 components.
{"title":"DeltaLCA: Comparative Life-Cycle Assessment for Electronics Design","authors":"Zhihang Zhang, Felix Hähnlein, Yuxuan Mei, Zachary Englhardt, Shwetak Patel, Adriana Schulz, Vikram Iyer","doi":"10.1145/3643561","DOIUrl":"https://doi.org/10.1145/3643561","url":null,"abstract":"Reducing the environmental footprint of electronics and computing devices requires new tools that empower designers to make informed decisions about sustainability during the design process itself. This is not possible with current tools for life cycle assessment (LCA) which require substantial domain expertise and time to evaluate the numerous chips and other components that make up a device. We observe first that informed decision-making does not require absolute metrics and can instead be done by comparing designs. Second, we can use domain-specific heuristics to perform these comparisons. We combine these insights to develop DeltaLCA, an open-source interactive design tool that addresses the dual challenges of automating life cycle inventory generation and data availability by performing comparative analyses of electronics designs. Users can upload standard design files from Electronic Design Automation (EDA) software and the tool will guide them through determining which one has greater carbon footprints. DeltaLCA leverages electronics-specific LCA datasets and heuristics and tries to automatically rank the two designs, prompting users to provide additional information only when necessary. We show through case studies DeltaLCA achieves the same result as evaluating full LCAs, and that it accelerates LCA comparisons from eight expert-hours to a single click for devices with ~30 components, and 15 minutes for more complex devices with ~100 components.","PeriodicalId":20463,"journal":{"name":"Proc. ACM Interact. Mob. Wearable Ubiquitous Technol.","volume":"95 3","pages":"29:1-29:29"},"PeriodicalIF":0.0,"publicationDate":"2024-03-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140261211","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ke Sun, Chunyu Xia, Xinyu Zhang, Hao Chen, C. Zhang
Egocentric, non-intrusive sensing of human activities of daily living (ADLs) in free-living environments represents a holy grail in ubiquitous computing. Existing approaches, such as egocentric vision and wearable motion sensors, can either be intrusive or have limitations in capturing non-ambulatory actions. To address these challenges, we propose EgoADL, the first egocentric ADL sensing system that uses an in-pocket smartphone as a multi-modal sensor hub to capture body motion and interactions with the physical environment and daily objects using non-visual sensors (audio, wireless sensing, and motion sensors). We collected a 120-hour multi-modal dataset and annotated 20 hours of data with 221 ADLs, 70 object interactions, and 91 actions. EgoADL proposes multi-modal frame-wise slow-fast encoders to learn feature representations of multi-sensory data that capture the complementary advantages of different modalities, and adapts a transformer-based sequence-to-sequence model to decode the time-series sensor signals into a sequence of words that represents ADLs. In addition, we introduce a self-supervised learning framework that extracts intrinsic supervisory signals from the multi-modal sensing data to overcome the lack of labeled data and achieve better generalization and extensibility. Our experiments in free-living environments demonstrate that EgoADL can achieve performance comparable to video-based approaches, bringing the vision of ambient intelligence closer to reality.
{"title":"Multimodal Daily-Life Logging in Free-living Environment Using Non-Visual Egocentric Sensors on a Smartphone","authors":"Ke Sun, Chunyu Xia, Xinyu Zhang, Hao Chen, C. Zhang","doi":"10.1145/3643553","DOIUrl":"https://doi.org/10.1145/3643553","url":null,"abstract":"Egocentric non-intrusive sensing of human activities of daily living (ADL) in free-living environments represents a holy grail in ubiquitous computing. Existing approaches, such as egocentric vision and wearable motion sensors, either can be intrusive or have limitations in capturing non-ambulatory actions. To address these challenges, we propose EgoADL, the first egocentric ADL sensing system that uses an in-pocket smartphone as a multi-modal sensor hub to capture body motion, interactions with the physical environment and daily objects using non-visual sensors (audio, wireless sensing, and motion sensors). We collected a 120-hour multimodal dataset and annotated 20-hour data into 221 ADL, 70 object interactions, and 91 actions. EgoADL proposes multi-modal frame-wise slow-fast encoders to learn the feature representation of multi-sensory data that characterizes the complementary advantages of different modalities and adapt a transformer-based sequence-to-sequence model to decode the time-series sensor signals into a sequence of words that represent ADL. In addition, we introduce a self-supervised learning framework that extracts intrinsic supervisory signals from the multi-modal sensing data to overcome the lack of labeling data and achieve better generalization and extensibility. Our experiments in free-living environments demonstrate that EgoADL can achieve comparable performance with video-based approaches, bringing the vision of ambient intelligence closer to reality.","PeriodicalId":20463,"journal":{"name":"Proc. ACM Interact. Mob. Wearable Ubiquitous Technol.","volume":"70 5","pages":"17:1-17:32"},"PeriodicalIF":0.0,"publicationDate":"2024-03-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140261276","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fei Shang, Panlong Yang, Dawei Yan, Sijia Zhang, Xiang-Yang Li
WiFi has gradually developed into one of the main candidate technologies for ubiquitous sensing. Based on commercial off-the-shelf (COTS) WiFi devices, this paper proposes LiquImager, which can simultaneously identify liquids and image containers regardless of container shape and position. Since the container size is close to the wavelength, diffraction makes the effect of the liquid on the signal difficult to approximate with a simple geometric model (such as ray tracing). Based on Maxwell's equations, we construct an electric field scattering sensing model. Using the few measurements provided by COTS WiFi devices, we solve the scattering model to obtain the medium distribution of the sensing domain, which is used for identifying and imaging liquids. To suppress signal noise, we propose LiqU-Net for image enhancement. For centimeter-scale containers randomly placed in a 25 cm × 25 cm area, LiquImager identifies the liquid with more than 90% accuracy. In terms of container imaging, LiquImager accurately finds the container edges for 4 types of containers with volumes of less than 500 ml.
{"title":"LiquImager: Fine-grained Liquid Identification and Container Imaging System with COTS WiFi Devices","authors":"Fei Shang, Panlong Yang, Dawei Yan, Sijia Zhang, Xiang-Yang Li","doi":"10.1145/3643509","DOIUrl":"https://doi.org/10.1145/3643509","url":null,"abstract":"WiFi has gradually developed into one of the main candidate technologies for ubiquitous sensing. Based on commercial off-the-shelf (COTS) WiFi devices, this paper proposes LiquImager, which can simultaneously identify liquid and image container regardless of container shape and position. Since the container size is close to the wavelength, diffraction makes the effect of the liquid on the signal difficult to approximate with a simple geometric model (such as ray tracking). Based on Maxwell's equations, we construct an electric field scattering sensing model. Using few measurements provided by COTS WiFi devices, we solve the scattering model to obtain the medium distribution of the sensing domain, which is used for identifing and imaging liquids. To suppress the signal noise, we propose LiqU-Net for image enhancement. For the centimeter-scale container that is randomly placed in an area of 25 cm × 25 cm, LiquImager can identify the liquid more than 90% accuracy. In terms of container imaging, LiquImager can accurately find the edge of the container for 4 types of containers with a volume less than 500 ml.","PeriodicalId":20463,"journal":{"name":"Proc. ACM Interact. Mob. Wearable Ubiquitous Technol.","volume":"29 11","pages":"15:1-15:29"},"PeriodicalIF":0.0,"publicationDate":"2024-03-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140262334","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The inadequate use of finger properties has limited the input space of touch interaction. By leveraging the category of the contacting finger, finger-specific interaction can expand the input vocabulary. However, accurate finger identification remains challenging: previous works require either additional sensors or a restricted set of identifiable fingers to achieve acceptable accuracy. We introduce SpeciFingers, a novel approach to identifying fingers from raw capacitive data on touchscreens. We apply a neural network with an encoder-decoder architecture, which captures the spatio-temporal features in capacitive image sequences. To assist users in recovering from misidentification, we propose a correction mechanism to replace the existing undo-redo process. We also present a design space of finger-specific interaction with example interaction techniques. In particular, we designed and implemented a use case of optimizing pointing performance on small targets. We evaluated our identification model and error correction mechanism in this use case.
{"title":"SpeciFingers: Finger Identification and Error Correction on Capacitive Touchscreens","authors":"Zeyuan Huang, Cangjun Gao, Haiyan Wang, Xiaoming Deng, Yu-Kun Lai, Cuixia Ma, Sheng-feng Qin, Yong-Jin Liu, Hongan Wang","doi":"10.1145/3643559","DOIUrl":"https://doi.org/10.1145/3643559","url":null,"abstract":"The inadequate use of finger properties has limited the input space of touch interaction. By leveraging the category of contacting fingers, finger-specific interaction is able to expand input vocabulary. However, accurate finger identification remains challenging, as it requires either additional sensors or limited sets of identifiable fingers to achieve ideal accuracy in previous works. We introduce SpeciFingers, a novel approach to identify fingers with the capacitive raw data on touchscreens. We apply a neural network of an encoder-decoder architecture, which captures the spatio-temporal features in capacitive image sequences. To assist users in recovering from misidentification, we propose a correction mechanism to replace the existing undo-redo process. Also, we present a design space of finger-specific interaction with example interaction techniques. In particular, we designed and implemented a use case of optimizing the performance in pointing on small targets. We evaluated our identification model and error correction mechanism in our use case.","PeriodicalId":20463,"journal":{"name":"Proc. ACM Interact. Mob. Wearable Ubiquitous Technol.","volume":"17 4","pages":"8:1-8:28"},"PeriodicalIF":0.0,"publicationDate":"2024-03-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140262365","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper introduces MultiMesh, a multi-subject 3D human mesh construction system based on commodity WiFi. Our system can reuse commodity WiFi devices in the environment and, unlike traditional computer vision-based approaches, works in non-line-of-sight (NLoS) conditions. Specifically, we leverage an L-shaped antenna array to generate the two-dimensional angle of arrival (2D AoA) of reflected signals for subject separation in physical space. We further leverage the angle of departure and time of flight of the signal to enhance the resolvability for precise separation of closely spaced subjects. We then exploit information from multiple signal dimensions to mitigate the interference of indirect reflections according to different signal propagation paths. Moreover, we employ the continuity of human movement in the spatial-temporal domain to track the weak reflected signals of faraway subjects. Finally, we utilize a deep learning model to digitize the 2D AoA images of each subject into a 3D human mesh. We conducted extensive experiments in real-world multi-subject scenarios under various environments to evaluate the performance of our system. For example, we conduct experiments with occlusion and perform human mesh construction for different distances between two subjects and different distances between subjects and WiFi devices. The results show that MultiMesh accurately constructs 3D human meshes for multiple users with an average vertex error of 4 cm. The evaluations also demonstrate that our system achieves comparable performance in unseen environments and with unseen people. Moreover, we evaluate the accuracy of spatial information extraction and the performance of subject detection. These evaluations demonstrate the robustness and effectiveness of our system.
{"title":"Multi-Subject 3D Human Mesh Construction Using Commodity WiFi","authors":"Yichao Wang, Yili Ren, Jie Yang","doi":"10.1145/3643504","DOIUrl":"https://doi.org/10.1145/3643504","url":null,"abstract":"This paper introduces MultiMesh, a multi-subject 3D human mesh construction system based on commodity WiFi. Our system can reuse commodity WiFi devices in the environment and is capable of working in non-line-of-sight (NLoS) conditions compared with the traditional computer vision-based approach. Specifically, we leverage an L-shaped antenna array to generate the two-dimensional angle of arrival (2D AoA) of reflected signals for subject separation in the physical space. We further leverage the angle of departure and time of flight of the signal to enhance the resolvability for precise separation of close subjects. Then we exploit information from various signal dimensions to mitigate the interference of indirect reflections according to different signal propagation paths. Moreover, we employ the continuity of human movement in the spatial-temporal domain to track weak reflected signals of faraway subjects. Finally, we utilize a deep learning model to digitize 2D AoA images of each subject into the 3D human mesh. We conducted extensive experiments in real-world multi-subject scenarios under various environments to evaluate the performance of our system. For example, we conduct experiments with occlusion and perform human mesh construction for different distances between two subjects and different distances between subjects and WiFi devices. The results show that MultiMesh can accurately construct 3D human meshes for multiple users with an average vertex error of 4cm. The evaluations also demonstrate that our system could achieve comparable performance for unseen environments and people. Moreover, we also evaluate the accuracy of spatial information extraction and the performance of subject detection. These evaluations demonstrate the robustness and effectiveness of our system.","PeriodicalId":20463,"journal":{"name":"Proc. ACM Interact. Mob. Wearable Ubiquitous Technol.","volume":"14 3","pages":"23:1-23:25"},"PeriodicalIF":0.0,"publicationDate":"2024-03-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140260952","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zhizhang Hu, Amir Radmehr, Yue Zhang, Shijia Pan, Phuc Nguyen
While occlusal diseases, the main cause of tooth loss, significantly impact patients' teeth and well-being, they are among the most underdiagnosed dental diseases today. Occlusal diseases can result in difficulties in eating and speaking as well as chronic headaches, ultimately impacting patients' quality of life. Although attempts have been made to develop sensing systems for teeth activity monitoring, solutions that provide sufficient sensing resolution for occlusal monitoring are still missing. To fill that gap, this paper presents IOTeeth, a cost-effective and automated intra-oral sensing system for continuous and fine-grained monitoring of occlusal diseases. The IOTeeth system includes an intra-oral piezoelectric-based sensing array integrated into a dental retainer platform to support reliable occlusal disease recognition. IOTeeth focuses on biting and grinding activities of the canines and front teeth, which contain essential occlusal information. IOTeeth's intra-oral wearable collects signals from the sensors and feeds them into a lightweight and robust deep learning model called the Physioaware Attention Network (PAN Net) for occlusal disease recognition. We evaluate IOTeeth with 12 articulator teeth models from dental clinic patients. Evaluation results show an F1 score of 0.97 for activity recognition and an average F1 score of 0.92 for dental disease recognition across different activities, both with leave-one-out validation.
{"title":"IOTeeth: Intra-Oral Teeth Sensing System for Dental Occlusal Diseases Recognition","authors":"Zhizhang Hu, Amir Radmehr, Yue Zhang, Shijia Pan, Phuc Nguyen","doi":"10.1145/3643516","DOIUrl":"https://doi.org/10.1145/3643516","url":null,"abstract":"While occlusal diseases - the main cause of tooth loss -- significantly impact patients' teeth and well-being, they are the most underdiagnosed dental diseases nowadays. Experiencing occlusal diseases could result in difficulties in eating, speaking, and chronicle headaches, ultimately impacting patients' quality of life. Although attempts have been made to develop sensing systems for teeth activity monitoring, solutions that support sufficient sensing resolution for occlusal monitoring are missing. To fill that gap, this paper presents IOTeeth, a cost-effective and automated intra-oral sensing system for continuous and fine-grained monitoring of occlusal diseases. The IOTeeth system includes an intra-oral piezoelectric-based sensing array integrated into a dental retainer platform to support reliable occlusal disease recognition. IOTeeth focuses on biting and grinding activities from the canines and front teeth, which contain essential information of occlusion. IOTeeth's intra-oral wearable collects signals from the sensors and fetches them into a lightweight and robust deep learning model called Physioaware Attention Network (PAN Net) for occlusal disease recognition. We evaluate IOTeeth with 12 articulator teeth models from dental clinic patients. Evaluation results show an F1 score of 0.97 for activity recognition with leave-one-out validation and an average F1 score of 0.92 for dental disease recognition for different activities with leave-one-out validation.","PeriodicalId":20463,"journal":{"name":"Proc. ACM Interact. Mob. Wearable Ubiquitous Technol.","volume":"23 4","pages":"7:1-7:29"},"PeriodicalIF":0.0,"publicationDate":"2024-03-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140262213","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}