Spatial-Temporal Masked Autoencoder for Multi-Device Wearable Human Activity Recognition
Shenghuan Miao, Ling Chen, Rong Hu
The widespread adoption of wearable devices has led to a surge in the development of multi-device wearable human activity recognition (WHAR) systems. Nevertheless, the performance of traditional supervised learning-based methods for WHAR is limited by the challenge of collecting ample annotated wearable data. To overcome this limitation, self-supervised learning (SSL) has emerged as a promising solution: first train a competent feature extractor on a substantial quantity of unlabeled data, then refine a minimal classifier with a small amount of labeled data. Despite the promise of SSL in WHAR, the majority of studies have not considered missing-device scenarios in multi-device WHAR. To bridge this gap, we propose a multi-device SSL WHAR method termed Spatial-Temporal Masked Autoencoder (STMAE). STMAE captures discriminative activity representations with an asymmetric encoder-decoder structure and a two-stage spatial-temporal masking strategy, which exploits the spatial-temporal correlations in multi-device data to improve the performance of SSL WHAR, especially in missing-device scenarios. Experiments on four real-world datasets demonstrate the efficacy of STMAE in various practical scenarios.
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, pp. 1-25, published 2024-01-12. https://doi.org/10.1145/3631415
PASTEL
F. Elhattab, Sara Bouchenak, Cédric Boscher
Federated Learning (FL) aims to improve machine learning privacy by allowing several data owners in edge and ubiquitous computing systems to collaboratively train a model while keeping their local training data private, sharing only model training parameters. However, FL systems remain vulnerable to privacy attacks, in particular membership inference attacks, which allow adversaries to determine whether a given data sample belongs to participants' training data, raising a significant threat in sensitive ubiquitous computing systems. Membership inference attacks rely on a binary classifier that differentiates between member data samples used to train a model and non-member data samples not used for training. Several defense mechanisms, including differential privacy, have been proposed to counter such attacks, but their main drawback is that they may reduce model accuracy while incurring non-negligible computational costs. In this paper, we address precisely this problem with PASTEL, an FL privacy-preserving mechanism based on a novel multi-objective learning function. On the one hand, PASTEL decreases the generalization gap to reduce the difference between member and non-member data; on the other hand, it reduces model loss and leverages adaptive gradient descent optimization to preserve high model accuracy. Our experimental evaluations on eight widely used datasets and five model architectures show that PASTEL significantly reduces membership inference attack success rates by up to 28%, reaching optimal privacy protection in most cases, with low to no perceptible impact on model accuracy.
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, pp. 1-29, published 2024-01-12. https://doi.org/10.1145/3633808
LocCams
Yangyang Gu, Jing Chen, Cong Wu, Kun He, Ziming Zhao, Ruiying Du
Unlawful wireless cameras are often hidden to secretly monitor private activities. However, existing methods to detect and localize these cameras either involve complex user interaction or require expensive specialized hardware. In this paper, we present LocCams, an efficient and robust approach for hidden camera detection and localization using only a commodity device (e.g., a smartphone). By analyzing data packets in the wireless local area network, LocCams passively detects hidden cameras based on their packet transmission rate. Localization is achieved by identifying whether the physical channel between our detector and the hidden camera is a Line-of-Sight (LOS) propagation path, based on the distribution of channel state information subcarriers, using a Convolutional Neural Network (CNN) feature extractor for reliable classification. Our extensive experiments, involving various subjects, cameras, distances, user positions, and room configurations, demonstrate LocCams' effectiveness. Additionally, to assess performance in real life, we evaluate the model's transferability using subjects, cameras, and rooms that do not appear in the training set. With an overall accuracy of 95.12% within 30 seconds of detection, LocCams provides robust detection and localization of hidden cameras.
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, pp. 1-24, published 2024-01-12. https://doi.org/10.1145/3631432
Deep Heterogeneous Contrastive Hyper-Graph Learning for In-the-Wild Context-Aware Human Activity Recognition
Wen Ge, Guanyi Mou, Emmanuel O. Agu, Kyumin Lee
Human Activity Recognition (HAR) is a challenging, multi-label classification problem, as activities may co-occur and sensor signals corresponding to the same activity may vary in different contexts (e.g., different device placements). This paper proposes a Deep Heterogeneous Contrastive Hyper-Graph Learning (DHC-HGL) framework that captures heterogeneous Context-Aware HAR (CA-HAR) hypergraph properties in a message-passing and neighborhood-aggregation fashion. Prior work explored only homogeneous or shallow-node-heterogeneous graphs. DHC-HGL handles heterogeneous CA-HAR data by innovatively (1) constructing three different types of sub-hypergraphs, each passed through different custom HyperGraph Convolution (HGC) layers designed to handle edge heterogeneity, and (2) adopting a contrastive loss function to ensure node heterogeneity. In rigorous evaluations on two CA-HAR datasets, DHC-HGL significantly outperformed state-of-the-art baselines by 5.8% to 16.7% on Matthews Correlation Coefficient (MCC) and 3.0% to 8.4% on Macro F1 scores. UMAP visualizations of learned CA-HAR node embeddings are also presented to enhance model explainability. Our code is publicly available to encourage further research.
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, pp. 1-23, published 2024-01-12. https://doi.org/10.1145/3631444
Reenvisioning Patient Education with Smart Hospital Patient Rooms
Joshua Dawson, K. J. Phanich, Jason Wiese
Smart hospital patient rooms incorporate various smart devices to allow digital control of entertainment (such as the TV and soundbar) and the environment (including lights, blinds, and thermostat). This technology can benefit patients by providing a more accessible, engaging, and personalized approach to their care. Many patients arrive at a rehabilitation hospital because they suffered a life-changing event such as a spinal cord injury or stroke, and it can be challenging for them to learn to cope with the changed abilities that are the new norm in their lives. This study explores ways smart patient rooms can support rehabilitation education to prepare patients for life outside the hospital's care. We conducted 20 contextual inquiries and four interviews with rehabilitation educators as they performed education sessions with patients and informal caregivers. Using thematic analysis, our findings offer insights into how smart patient rooms could revolutionize patient education by fostering better engagement with educational content, reducing interruptions during sessions, providing more agile education content management, and customizing therapy elements for each patient's unique needs. Lastly, we discuss design opportunities for future smart patient room implementations for a better educational experience in any healthcare context.
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, pp. 1-23, published 2024-01-12. https://doi.org/10.1145/3631419
Investigating Generalizability of Speech-based Suicidal Ideation Detection Using Mobile Phones
Arvind Pillai, Trevor Cohen, Dror Ben-Zeev, Subigya Nepal, Weichen Wang, M. Nemesure, Michael Heinz, George Price, D. Lekkas, Amanda C. Collins, Tess Z Griffin, Benjamin Buck, S. Preum, Dror Nicholas Jacobson
Speech-based diaries from mobile phones can capture paralinguistic patterns that help detect mental illness symptoms such as suicidal ideation. However, previous studies have primarily evaluated machine learning models on a single dataset, leaving their performance under distribution shift unknown. In this paper, we investigate the generalizability of speech-based suicidal ideation detection using mobile phones through cross-dataset experiments on four datasets covering N=786 individuals experiencing major depressive disorder, auditory verbal hallucinations, persecutory thoughts, and students with suicidal thoughts. Our results show that machine and deep learning methods generalize poorly in many cases. We therefore evaluate unsupervised domain adaptation (UDA) and semi-supervised domain adaptation (SSDA) to mitigate the performance decreases caused by distribution shifts. Although SSDA approaches performed best, they are often impractical, requiring large target datasets with limited labels for adversarial and contrastive training. Therefore, we propose sinusoidal similarity sub-sampling (S3), a method that selects optimal source subsets for the target domain by computing pair-wise scores using sinusoids. Unlike prior approaches, S3 uses no labeled target data and does not transform features. Fine-tuning with S3 improves the cross-dataset performance of deep models across the datasets, with implications for ubiquitous technology, mental health, and machine learning.
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, pp. 1-38, published 2024-01-12. https://doi.org/10.1145/3631452
EarSE
Di Duan, Yongliang Chen, Weitao Xu, Tianxing Li
Speech enhancement is key to the quality of digital communication and is gaining increasing attention in audio processing research. In this paper, we present EarSE, the first robust, hands-free, multi-modal speech enhancement solution using commercial off-the-shelf headphones. The key idea of EarSE is a novel hardware setting: leveraging the form factor of headphones equipped with a boom microphone to establish a stable acoustic sensing field across the user's face. We design a sensing methodology based on Frequency-Modulated Continuous-Wave (FMCW) signals, an ultrasonic modality sensitive enough to capture the subtle facial articulatory gestures of users when speaking. Moreover, we design a fully attention-based deep neural network that self-adaptively addresses the user diversity problem by introducing the Vision Transformer network. We enhance the collaboration between the speech and ultrasonic modalities using a multi-head attention mechanism and a Factorized Bilinear Pooling gate. Extensive experiments demonstrate that EarSE achieves remarkable performance, increasing SiSDR by 14.61 dB and reducing the word error rate of user speech recognition by 22.45-66.41% in real-world applications. EarSE not only outperforms seven baselines by 38.0% in SiSNR, 12.4% in STOI, and 20.5% in PESQ on average but also remains practical.
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, pp. 1-33, published 2024-01-12. https://doi.org/10.1145/3631447
Laser-Powered Vibrotactile Rendering
Yuning Su, Yuhua Jin, Zhengqing Wang, Yonghao Shi, Da-Yuan Huang, Teng Han, Xing-Dong Yang
We investigate the feasibility of a vibrotactile device that is both battery-free and electronic-free. Our approach uses lasers as a wireless power transfer and haptic control mechanism, which can drive the small actuators commonly used in AR/VR and mobile applications with DC or AC signals. To validate the feasibility of our method, we developed a proof-of-concept prototype that connects low-cost eccentric rotating mass (ERM) motors and linear resonant actuators (LRAs) to photovoltaic (PV) cells. This prototype enabled us to capture laser energy from any distance across a room and to analyze the impact of critical parameters on the effectiveness of our approach. Through a user study testing 16 different vibration patterns, rendered using either a single motor or two motors, we demonstrate that our approach generates vibration patterns of comparable quality to a baseline that rendered the patterns using a signal generator.
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, pp. 1-25, published 2024-01-12. https://doi.org/10.1145/3631449
Effects of Uncertain Trajectory Prediction Visualization in Highly Automated Vehicles on Trust, Situation Awareness, and Cognitive Load
Mark Colley, Oliver Speidel, Jan Strohbeck, J. Rixen, Janina Belz, Enrico Rukzio
Automated vehicles are expected to improve safety, mobility, and inclusion, but user acceptance is required for the successful introduction of this technology. One essential prerequisite for acceptance is appropriately trusting the vehicle's capabilities. System transparency via visualizing internal information could calibrate this trust by enabling surveillance of the vehicle's detection and prediction capabilities, including its failures. Additionally, the concurrent increase in situation awareness could improve take-overs in emergencies. This work reports the results of two online comparative video-based studies on visualizing prediction and maneuver-planning information. Effects on trust, cognitive load, and situation awareness were measured using a simulation (N=280) and state-of-the-art road-user prediction and maneuver planning on a pre-recorded real-world video from a real prototype (N=238). Results show that color conveys uncertainty best, that the planned trajectory increased trust, and that the visualization of other road users' predicted trajectories improved perceived safety.
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, pp. 1-23, published 2024-01-12. https://doi.org/10.1145/3631408
LoCal
Duo Zhang, Xusheng Zhang, Yaxiong Xie, Fusang Zhang, Xuanzhi Wang, Yang Li, Daqing Zhang
Millimeter wave (mmWave) radar excels at accurately estimating the distance, speed, and angle of signal reflectors relative to the radar. However, for the diverse sensing applications that rely on radar's tracking capability, these estimates must be transformed from radar coordinates to room coordinates. This transformation hinges on the mmWave radar's location attribute: its position and orientation in room coordinates. Traditional outdoor calibration solutions for autonomous driving use corner reflectors as static reference points to derive this attribute. When deployed indoors, however, it is challenging, even for a mmWave radar with GHz bandwidth and a large antenna array, to separate the static reference points from other multipath reflectors. To tackle the static multipath, we propose deploying a moving reference point (a moving robot) to fully harness the velocity resolution of mmWave radar. Specifically, we select a SLAM-capable robot that accurately obtains its own locations in room coordinates during motion, without requiring human intervention. Accurately pairing the robot's locations under the two coordinate systems requires tight synchronization between the mmWave radar and the robot. We therefore propose a novel trajectory-correspondence-based calibration algorithm that takes the estimated trajectories of the two systems as input, decoupling the operations of the two systems as much as possible. Extensive experimental results demonstrate that the proposed calibration solution is highly accurate (1.74 cm for location and 0.43° for orientation) and ensures outstanding performance in three representative applications: fall detection, point cloud fusion, and long-distance human tracking.
{"title":"LoCal","authors":"Duo Zhang, Xusheng Zhang, Yaxiong Xie, Fusang Zhang, Xuanzhi Wang, Yang Li, Daqing Zhang","doi":"10.1145/3631436","DOIUrl":"https://doi.org/10.1145/3631436","url":null,"abstract":"Millimeter wave (mmWave) radar excels in accurately estimating the distance, speed, and angle of the signal reflectors relative to the radar. However, for diverse sensing applications reliant on radar's tracking capability, these estimates must be transformed from radar to room coordinates. This transformation hinges on the mmWave radar's location attribute, encompassing its position and orientation in room coordinates. Traditional outdoor calibration solutions for autonomous driving utilize corner reflectors as static reference points to derive the location attribute. When deployed in the indoor environment, it is challenging, even for the mmWave radar with GHz bandwidth and a large antenna array, to separate the static reference points from other multipath reflectors. To tackle the static multipath, we propose to deploy a moving reference point (a moving robot) to fully harness the velocity resolution of mmWave radar. Specifically, we select a SLAM-capable robot to accurately obtain its locations under room coordinates during motion, without requiring human intervention. Accurately pairing the locations of the robot under two coordinate systems requires tight synchronization between the mmWave radar and the robot. We therefore propose a novel trajectory correspondence based calibration algorithm that takes the estimated trajectories of two systems as input, decoupling the operations of two systems to the maximum. Extensive experimental results demonstrate that the proposed calibration solution exhibits very high accuracy (1.74 cm and 0.43° accuracy for location and orientation respectively) and could ensure outstanding performance in three representative applications: fall detection, point cloud fusion, and long-distance human tracking.","PeriodicalId":20553,"journal":{"name":"Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies","volume":"12 42","pages":"1 - 27"},"PeriodicalIF":0.0,"publicationDate":"2024-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139437668","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}