Kaiyan Ling, Hang Zhao, Xiangmin Fan, Xiaohui Niu, Wenchao Yin, Yue Liu, Cui Wang, Xiaojun Bi
Touch pointing is one of the primary interaction actions on mobile devices. In this research, we aim to (1) model touch pointing for people with Parkinson's Disease (PD), and (2) detect PD via touch pointing. We created a mobile game called MoleBuster in which a user performs a sequence of pointing actions. Our study with 40 participants shows that PD participants exhibited distinct pointing behavior: they were much slower and had greater variance in movement time (MT), while their error rate was slightly lower than that of age-matched non-PD participants, indicating that PD participants traded speed for accuracy. The nominal-width Finger-Fitts law showed a better fit than Fitts' law, suggesting it should be adopted in lieu of Fitts' law to guide mobile interface design for PD users. We also propose a CNN-Transformer-based neural network model to detect PD. Taking touch pointing data and comfort ratings of finger movement as input, this model achieved an AUC of 0.97 and a sensitivity of 0.95 in leave-one-user-out cross-validation. Overall, our research contributes models that reveal the temporal and spatial characteristics of touch pointing for PD users, and provides a new method (a CNN-Transformer model) and a mobile game (MoleBuster) for convenient PD detection.
{"title":"Model Touch Pointing and Detect Parkinson's Disease via a Mobile Game","authors":"Kaiyan Ling, Hang Zhao, Xiangmin Fan, Xiaohui Niu, Wenchao Yin, Yue Liu, Cui Wang, Xiaojun Bi","doi":"10.1145/3659627","DOIUrl":"https://doi.org/10.1145/3659627","url":null,"abstract":"Touch pointing is one of the primary interaction actions on mobile devices. In this research, we aim to (1) model touch pointing for people with Parkinson's Disease (PD), and (2) detect PD via touch pointing. We created a mobile game called MoleBuster in which a user performs a sequence of pointing actions. Our study with 40 participants shows that PD participants exhibited distinct pointing behavior. PD participants were much slower and had greater variances in movement time (MT), while their error rate was slightly lower than age-matched non-PD participants, indicating PD participants traded speed for accuracy. The nominal width Finger-Fitts law showed greater fitness than Fitts' law, suggesting this model should be adopted in lieu of Fitts' law to guide mobile interface design for PD users. We also proposed a CNN-Transformer-based neural network model to detect PD. Taking touch pointing data and comfort rating of finger movement as input, this model achieved an AUC of 0.97 and sensitivity of 0.95 in leave-one-user-out cross-validation. Overall, our research contributes models that reveal the temporal and spatial characteristics of touch pointing for PD users, and provide a new method (CNN-Transformer model) and a mobile game (MoleBuster) for convenient PD detection.","PeriodicalId":20553,"journal":{"name":"Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140983604","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Meagan B. Loerakker, Jasmin Niess, Paweł W. Woźniak
Reflection is widely regarded as a key design goal for technologies for well-being. Yet, recent research shows that technologies for reflection may have negative consequences in the form of rumination, i.e., negative thought cycles. Understanding how technologies support thinking about oneself, which can take the form of rumination and reflection, is key for future well-being technologies. To address this research gap, we developed the Reflection, Rumination and Thought in Technology (R2T2) scale. Unlike past research, R2T2 addresses forms of self-focused thinking beyond reflection. The scale can quantify how a technology supports self-focused thinking and the rumination and reflection aspects of that thinking. We constructed R2T2 through a systematic scale development process and then evaluated its test-retest reliability along with its concurrent and discriminant validity. R2T2 enables designers and researchers to compare technologies which embrace self-focused thinking and its facets as a design goal.
{"title":"Technology which Makes You Think","authors":"Meagan B. Loerakker, Jasmin Niess, Paweł W. Woźniak","doi":"10.1145/3659615","DOIUrl":"https://doi.org/10.1145/3659615","url":null,"abstract":"Reflection is widely regarded as a key design goal for technologies for well-being. Yet, recent research shows that technologies for reflection may have negative consequences, in the form of rumination, i.e. negative thought cycles. Understanding how technologies support thinking about oneself, which can take the form of rumination and reflection, is key for future well-being technologies. To address this research gap, we developed the Reflection, Rumination and Thought in Technology (R2T2) scale. Contrary to past research, R2T2 addresses ways of self-focused thinking beyond reflection. This scale can quantify how a technology supports self-focused thinking and the rumination and reflection aspects of that thinking. We developed the scale through a systematic scale development process. We then evaluated the scale's test-retest reliability along with its concurrent and discriminant validity. R2T2 enables designers and researchers to compare technologies which embrace self-focused thinking and its facets as a design goal.","PeriodicalId":20553,"journal":{"name":"Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140983199","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Subject-aware vocal activity sensing on wearables, which specifically recognizes and monitors the wearer's distinct vocal activities, is essential in advancing personal health monitoring and enabling context-aware applications. While recent advancements in earables present new opportunities, the absence of relevant datasets and effective methods remains a significant challenge. In this paper, we introduce EarSAVAS, the first publicly available dataset constructed specifically for subject-aware human vocal activity sensing on earables. EarSAVAS encompasses eight distinct vocal activities from both the earphone wearer and bystanders, including synchronous two-channel audio and motion data collected from 42 participants totaling 44.5 hours. Further, we propose EarVAS, a lightweight multi-modal deep learning architecture that enables efficient subject-aware vocal activity recognition on earables. To validate the reliability of EarSAVAS and the efficiency of EarVAS, we implemented two advanced benchmark models. Evaluation results on EarSAVAS reveal EarVAS's effectiveness with an accuracy of 90.84% and a Macro-AUC of 89.03%. Comprehensive ablation experiments were conducted on benchmark models and demonstrated the effectiveness of feedback microphone audio and highlighted the potential value of sensor fusion in subject-aware vocal activity sensing on earables. We hope that the proposed EarSAVAS and benchmark models can inspire other researchers to further explore efficient subject-aware human vocal activity sensing on earables.
{"title":"The EarSAVAS Dataset","authors":"Xiyuxing Zhang, Yuntao Wang, Yuxuan Han, Chen Liang, Ishan Chatterjee, Jiankai Tang, Xin Yi, Shwetak Patel, Yuanchun Shi","doi":"10.1145/3659616","DOIUrl":"https://doi.org/10.1145/3659616","url":null,"abstract":"Subject-aware vocal activity sensing on wearables, which specifically recognizes and monitors the wearer's distinct vocal activities, is essential in advancing personal health monitoring and enabling context-aware applications. While recent advancements in earables present new opportunities, the absence of relevant datasets and effective methods remains a significant challenge. In this paper, we introduce EarSAVAS, the first publicly available dataset constructed specifically for subject-aware human vocal activity sensing on earables. EarSAVAS encompasses eight distinct vocal activities from both the earphone wearer and bystanders, including synchronous two-channel audio and motion data collected from 42 participants totaling 44.5 hours. Further, we propose EarVAS, a lightweight multi-modal deep learning architecture that enables efficient subject-aware vocal activity recognition on earables. To validate the reliability of EarSAVAS and the efficiency of EarVAS, we implemented two advanced benchmark models. Evaluation results on EarSAVAS reveal EarVAS's effectiveness with an accuracy of 90.84% and a Macro-AUC of 89.03%. Comprehensive ablation experiments were conducted on benchmark models and demonstrated the effectiveness of feedback microphone audio and highlighted the potential value of sensor fusion in subject-aware vocal activity sensing on earables. We hope that the proposed EarSAVAS and benchmark models can inspire other researchers to further explore efficient subject-aware human vocal activity sensing on earables.","PeriodicalId":20553,"journal":{"name":"Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140985947","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ziqi Yang, Xuhai Xu, Bingsheng Yao, Ethan Rogers, Shao Zhang, Stephen Intille, Nawar Shara, G. Gao, Dakuo Wang
Despite the plethora of telehealth applications to assist home-based older adults and healthcare providers, basic messaging and phone calls are still the most common communication methods, which suffer from limited availability, information loss, and process inefficiencies. One promising solution to facilitate patient-provider communication is to leverage large language models (LLMs) with their powerful natural conversation and summarization capability. However, there is limited understanding of LLMs' role in such communication. We first conducted two interview studies with both older adults (N=10) and healthcare providers (N=9) to understand their needs and opportunities for LLMs in patient-provider asynchronous communication. Based on the insights, we built an LLM-powered communication system, Talk2Care, and designed interactive components for both groups: (1) For older adults, we leveraged the convenience and accessibility of voice assistants (VAs) and built an LLM-powered conversational interface for effective information collection. (2) For health providers, we built an LLM-based dashboard to summarize and present important health information based on older adults' conversations with the VA. We further conducted two user studies with older adults and providers to evaluate the usability of the system. The results showed that Talk2Care could facilitate the communication process, enrich the health information collected from older adults, and considerably save providers' effort and time. We envision our work as an initial exploration of LLMs' capability at the intersection of healthcare and interpersonal communication.
{"title":"Talk2Care: An LLM-based Voice Assistant for Communication between Healthcare Providers and Older Adults","authors":"Ziqi Yang, Xuhai Xu, Bingsheng Yao, Ethan Rogers, Shao Zhang, Stephen Intille, Nawar Shara, G. Gao, Dakuo Wang","doi":"10.1145/3659625","DOIUrl":"https://doi.org/10.1145/3659625","url":null,"abstract":"Despite the plethora of telehealth applications to assist home-based older adults and healthcare providers, basic messaging and phone calls are still the most common communication methods, which suffer from limited availability, information loss, and process inefficiencies. One promising solution to facilitate patient-provider communication is to leverage large language models (LLMs) with their powerful natural conversation and summarization capability. However, there is a limited understanding of LLMs' role during the communication. We first conducted two interview studies with both older adults (N=10) and healthcare providers (N=9) to understand their needs and opportunities for LLMs in patient-provider asynchronous communication. Based on the insights, we built an LLM-powered communication system, Talk2Care, and designed interactive components for both groups: (1) For older adults, we leveraged the convenience and accessibility of voice assistants (VAs) and built an LLM-powered conversational interface for effective information collection. (2) For health providers, we built an LLM-based dashboard to summarize and present important health information based on older adults' conversations with the VA. We further conducted two user studies with older adults and providers to evaluate the usability of the system. The results showed that Talk2Care could facilitate the communication process, enrich the health information collected from older adults, and considerably save providers' efforts and time. We envision our work as an initial exploration of LLMs' capability in the intersection of healthcare and interpersonal communication.","PeriodicalId":20553,"journal":{"name":"Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140982074","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In contemporary campus environments, the provision of timely and efficient services is increasingly challenging due to limitations in accessibility and the complexity and openness of the environment. Existing service robots, while operational, often struggle with adaptability and dynamic task management, leading to inefficiencies. To overcome these limitations, we introduce CrowdBot, a robot management system that enhances service in campus environments. Our system leverages a hierarchical reinforcement learning-based cloud-edge hybrid scheduling framework (REDIS) for efficient online streaming task assignment and dynamic action scheduling. To verify the REDIS framework, we have developed a digital twin simulation platform, which integrates large language models and hot-swapping technology. This facilitates seamless human-robot interaction, efficient task allocation, and cost-effective execution through the reuse of robot equipment. Our comprehensive simulations corroborate the system's remarkable efficacy, demonstrating significant improvements with a 24.46% reduction in task completion times, a 9.37% decrease in travel distances, and up to a 3% savings in power usage. Additionally, the system achieves a 7.95% increase in the number of tasks completed and a 9.49% reduction in response time. Real-world case studies further affirm CrowdBot's capability to adeptly execute tasks and judiciously recycle resources, thereby offering a smart and viable solution for the streamlined management of campus services.
{"title":"CrowdBot: An Open-Environment Robot Management System for On-Campus Services","authors":"Yufei Wang, Wenting Zeng, Changzhen Liu, Zhuohan Ye, Jiawei Sun, Junxiang Ji, Zhihan Jiang, Xianyi Yan, Yongyi Wu, Yigao Wang, Dingqi Yang, Leye Wang, Daqing Zhang, Cheng Wang, Longbiao Chen","doi":"10.1145/3659601","DOIUrl":"https://doi.org/10.1145/3659601","url":null,"abstract":"In contemporary campus environments, the provision of timely and efficient services is increasingly challenging due to limitations in accessibility and the complexity and openness of the environment. Existing service robots, while operational, often struggle with adaptability and dynamic task management, leading to inefficiencies. To overcome these limitations, we introduce CrowdBot, a robot management system that enhances service in campus environments. Our system leverages a hierarchical reinforcement learning-based cloud-edge hybrid scheduling framework (REDIS), for efficient online streaming task assignment and dynamic action scheduling. To verify the REDIS framework, we have developed a digital twin simulation platform, which integrates large language models and hot-swapping technology. This facilitates seamless human-robot interaction, efficient task allocation, and cost-effective execution through the reuse of robot equipment. Our comprehensive simulations corroborate the system's remarkable efficacy, demonstrating significant improvements with a 24.46% reduction in task completion times, a 9.37% decrease in travel distances, and up to a 3% savings in power usage. Additionally, the system achieves a 7.95% increase in the number of tasks completed and a 9.49% reduction in response time. Real-world case studies further affirm CrowdBot's capability to adeptly execute tasks and judiciously recycle resources, thereby offering a smart and viable solution for the streamlined management of campus services.","PeriodicalId":20553,"journal":{"name":"Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140983283","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chentao Li, Jinyang Yu, Ke He, Jianjiang Feng, Jie Zhou
Today, touchscreens are the most prevalent input devices for mobile computing devices (smartphones, tablets, smartwatches). Yet, compared with desktop or laptop computers, the limited shortcut keys and physical buttons on touchscreen devices, coupled with the fat-finger problem, often lead to slower and more error-prone input and navigation, especially in text editing and other complex interaction tasks. We introduce an innovative gesture set based on finger rotations in the yaw, pitch, and roll directions on a touchscreen, diverging significantly from traditional two-dimensional interactions and promising to expand the gesture library. Despite active research on finger angle estimation, previous work faces substantial challenges, including significant estimation errors and unstable sequential outputs. Variability in user behavior further complicates isolating movement to a single rotational axis, leading to accidental disturbances and screen-coordinate shifts that interfere with existing sliding gestures. Consequently, directly applying finger angle estimation algorithms to recognize three-dimensional rotational gestures is impractical. SwivelTouch instead analyzes finger movement characteristics captured in raw capacitive image sequences to rapidly and accurately identify these 3D gestures, clearly differentiating them from conventional touch interactions such as tapping and sliding, thereby enhancing interaction with touch devices while remaining compatible with existing 2D gestures. A user study further confirms that SwivelTouch significantly enhances the efficiency of text editing on smartphones.
{"title":"SwivelTouch: Boosting Touchscreen Input with 3D Finger Rotation Gesture","authors":"Chentao Li, Jinyang Yu, Ke He, Jianjiang Feng, Jie Zhou","doi":"10.1145/3659584","DOIUrl":"https://doi.org/10.1145/3659584","url":null,"abstract":"Today, touchscreens stand as the most prevalent input devices of mobile computing devices (smartphones, tablets, smartwatches). Yet, compared with desktop or laptop computers, the limited shortcut keys and physical buttons on touchscreen devices, coupled with the fat finger problem, often lead to slower and more error-prone input and navigation, especially when dealing with text editing and other complex interaction tasks. We introduce an innovative gesture set based on finger rotations in the yaw, pitch, and roll directions on a touchscreen, diverging significantly from traditional two-dimensional interactions and promising to expand the gesture library. Despite active research in estimation of finger angles, however, the previous work faces substantial challenges, including significant estimation errors and unstable sequential outputs. Variability in user behavior further complicates the isolation of movements to a single rotational axis, leading to accidental disturbances and screen coordinate shifts that interfere with the existing sliding gestures. Consequently, the direct application of finger angle estimation algorithms for recognizing three-dimensional rotational gestures is impractical. SwivelTouch leverages the analysis of finger movement characteristics on the touchscreen captured through original capacitive image sequences, which aims to rapidly and accurately identify these advanced 3D gestures, clearly differentiating them from conventional touch interactions like tapping and sliding, thus enhancing user interaction with touch devices and meanwhile compatible with existing 2D gestures. User study further confirms that the implementation of SwivelTouch significantly enhances the efficiency of text editing on smartphones.","PeriodicalId":20553,"journal":{"name":"Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140983306","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
With the growing popularity of smart speakers, numerous novel acoustic sensing applications have been proposed for low-frequency human speech and high-frequency inaudible sounds. Spatial information plays a crucial role in these acoustic applications, enabling various location-based services. However, typical commercial microphone arrays face limitations in spatial perception of inaudible sounds because their sparse array geometries are optimized for low-frequency speech. In this paper, we introduce MetaAng, a system designed to augment microphone arrays with wideband spatial perception across both speech signals and inaudible sounds by leveraging the spatial encoding capabilities of acoustic metasurfaces. Our design is grounded in the fact that, while sensitive to high-frequency signals, acoustic metasurfaces are almost non-responsive to low-frequency speech due to the significant wavelength discrepancy. This observation allows us to integrate acoustic metasurfaces with a sparse array geometry, simultaneously enhancing the spatial perception of high-frequency and low-frequency acoustic signals. To achieve this, we first utilize acoustic metasurfaces and a configuration optimization algorithm to encode unique features for each incident angle. Then, we propose an unrolling soft thresholding network that employs neural-enhanced priors and compressive sensing for high-accuracy, high-resolution multi-source angle estimation. We implement a prototype, and experimental results demonstrate that MetaAng maintains robustness across various scenarios, facilitating multiple applications including localization and tracking.
{"title":"Pushing the Limits of Acoustic Spatial Perception via Incident Angle Encoding","authors":"Yongjian Fu, Yongzhao Zhang, Hao Pan, Yu Lu, Xinyi Li, Lili Chen, Ju Ren, Xiong Li, Xiaosong Zhang, Yaoxue Zhang","doi":"10.1145/3659583","DOIUrl":"https://doi.org/10.1145/3659583","url":null,"abstract":"With the growing popularity of smart speakers, numerous novel acoustic sensing applications have been proposed for low-frequency human speech and high-frequency inaudible sounds. Spatial information plays a crucial role in these acoustic applications, enabling various location-based services. However, typically commercial microphone arrays face limitations in spatial perception of inaudible sounds due to their sparse array geometries optimized for low-frequency speech. In this paper, we introduce MetaAng, a system designed to augment microphone arrays by enabling wideband spatial perception across both speech signals and inaudible sounds by leveraging the spatial encoding capabilities of acoustic metasurfaces. Our design is grounded in the fact that, while sensitive to high-frequency signals, acoustic metasurfaces are almost non-responsive to low-frequency speech due to significant wavelength discrepancy. This observation allows us to integrate acoustic metasurfaces with sparse array geometry, simultaneously enhancing the spatial perception of high-frequency and low-frequency acoustic signals. To achieve this, we first utilize acoustic metasurfaces and a configuration optimization algorithm to encode the unique features for each incident angle. Then, we propose an unrolling soft thresholding network that employs neural-enhanced priors and compressive sensing for high-accuracy, high-resolution multi-source angle estimation. We implement a prototype, and experimental results demonstrate that MetaAng maintains robustness across various scenarios, facilitating multiple applications, including localization and tracking.","PeriodicalId":20553,"journal":{"name":"Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140984062","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Long Fan, Lei Xie, Wenhui Zhou, Chuyu Wang, Yanling Bu, Sanglu Lu
Previous mmWave sensing solutions assumed good signal quality, yet ensuring an unblocked or strengthened LoS path is challenging. Therefore, finding an NLoS path is crucial to enhancing perceived signal quality. This paper proposes Trebsen, a Transmitter-REceiver collaboration-based Beamforming SENsing scheme using commercial mmWave radars. Specifically, we formulate hybrid beamforming as an optimization problem of beamforming angle search based on transmitter-receiver collaboration. We derive a comprehensive expression for parameter optimization by modeling the signal attenuation variations resulting from the propagation path. To comprehensively assess perceived signal quality, we design a novel metric, the perceived signal-to-interference-plus-noise ratio (PSINR), which combines the carrier signal and the baseband signal to quantify fine-grained sensing motion signal quality. Considering the high time cost of traversal or random search, we employ a deep reinforcement learning-based search method to quickly explore optimal beamforming angles at both the transmitter and the receiver. We implement Trebsen and evaluate its performance in a fine-grained sensing application (i.e., heartbeat monitoring). Experimental results show that Trebsen significantly enhances heartbeat sensing performance in blocked or misaligned LoS scenes. Compared with non-beamforming, Trebsen reduces HR error by 23.6% and IBI error by 27.47%. Moreover, compared with random search, Trebsen achieves a 90% increase in search speed.
{"title":"Beamforming for Sensing: Hybrid Beamforming based on Transmitter-Receiver Collaboration for Millimeter-Wave Sensing","authors":"Long Fan, Lei Xie, Wenhui Zhou, Chuyu Wang, Yanling Bu, Sanglu Lu","doi":"10.1145/3659619","DOIUrl":"https://doi.org/10.1145/3659619","url":null,"abstract":"Previous mmWave sensing solutions assumed good signal quality. Ensuring an unblocked or strengthened LoS path is challenging. Therefore, finding an NLoS path is crucial to enhancing perceived signal quality. This paper proposes Trebsen, a Transmitter-REceiver collaboration-based Beamforming scheme SENsing using commercial mmWave radars. Specifically, we define the hybrid beamforming problem as an optimization challenge involving beamforming angle search based on transmitter-receiver collaboration. We derive a comprehensive expression for parameter optimization by modeling the signal attenuation variations resulting from the propagation path. To comprehensively assess the perception signal quality, we design a novel metric perceived signal-to-interference-plus-noise ratio (PSINR), combining the carrier signal and baseband signal to quantify the fine-grained sensing motion signal quality. Considering the high time cost of traversing or randomly searching methods, we employ a search method based on deep reinforcement learning to quickly explore optimal beamforming angles at both transmitter and receiver. We implement Trebsen and evaluate its performance in a fine-grained sensing application (i.e., heartbeat). Experimental results show that Trebsen significantly enhances heartbeat sensing performance in blocked or misaligned LoS scenes. Comparing non-beamforming, Trebsen demonstrates a reduction of 23.6% in HR error and 27.47% in IBI error. Moreover, comparing random search, Trebsen exhibits a 90% increase in search speed.","PeriodicalId":20553,"journal":{"name":"Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140985882","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Prasoon Patidar, Tricia J. Ngoon, John Zimmerman, Amy Ogan, Yuvraj Agarwal
Ambient classroom sensing systems offer a scalable and non-intrusive way to find connections between instructor actions and student behaviors, creating data that can improve teaching and learning. While these systems effectively provide aggregate data, obtaining reliable individual student-level information is difficult due to occlusion and movement. Individual data can help in understanding equitable student participation, but it typically requires identifiable data or individual instrumentation. We propose ClassID, a data attribution method that works both within a class session and across multiple sessions of a course without these constraints. Within a session, our approach assigns unique identifiers to 98% of students with 95% accuracy, and it significantly reduces multiple ID assignments compared to the baseline approach (3 vs. 167) in our testing on data from 15 classroom sessions. For across-session attribution, our approach, combined with student attendance, shows higher precision than the state-of-the-art approach (85% vs. 44%) on three courses. Finally, we present four use cases demonstrating how individual behavior attribution can enable a rich set of learning analytics that is not possible with aggregate data alone.
{"title":"ClassID: Enabling Student Behavior Attribution from Ambient Classroom Sensing Systems","authors":"Prasoon Patidar, Tricia J. Ngoon, John Zimmerman, Amy Ogan, Yuvraj Agarwal","doi":"10.1145/3659586","DOIUrl":"https://doi.org/10.1145/3659586","url":null,"abstract":"Ambient classroom sensing systems offer a scalable and non-intrusive way to find connections between instructor actions and student behaviors, creating data that can improve teaching and learning. While these systems effectively provide aggregate data, getting reliable individual student-level information is difficult due to occlusion or movements. Individual data can help in understanding equitable student participation, but it requires identifiable data or individual instrumentation. We propose ClassID, a data attribution method for within a class session and across multiple sessions of a course without these constraints. For within-session, our approach assigns unique identifiers to 98% of students with 95% accuracy. It significantly reduces multiple ID assignments compared to the baseline approach (3 vs. 167) based on our testing on data from 15 classroom sessions. For across-session attributions, our approach, combined with student attendance, shows higher precision than the state-of-the-art approach (85% vs. 44%) on three courses. Finally, we present a set of four use cases to demonstrate how individual behavior attribution can enable a rich set of learning analytics, which is not possible with aggregate data alone.","PeriodicalId":20553,"journal":{"name":"Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140984360","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ziqi Xu, Jingwen Zhang, Jacob K Greenberg, Madelyn Frumkin, Saad Javeed, Justin K. Zhang, Braeden Benedict, Kathleen Botterbush, Thomas L. Rodebaugh, Wilson Z. Ray, Chenyang Lu
Pre-operative prediction of post-surgical recovery is vital for clinical decision-making and personalized treatment, especially in lumbar spine surgery, where patients exhibit highly heterogeneous outcomes. Existing predictive tools mainly rely on traditional Patient-Reported Outcome Measures (PROMs), which fail to capture the long-term dynamics of patient conditions before surgery. Moreover, existing studies focus on predicting a single surgical outcome. However, recovery from spine surgery is multi-dimensional, spanning multiple distinct but interrelated outcomes such as pain interference, physical function, and quality of recovery. In recent years, the emergence of smartphones and wearable devices has presented new opportunities to capture longitudinal, dynamic information about patients' conditions outside the hospital. This paper proposes a novel machine learning approach, Multi-Modal Multi-Task Learning (M3TL), that uses smartphones and wristbands to predict multiple surgical outcomes after lumbar spine surgery. We formulate the prediction of pain interference, physical function, and quality of recovery as a multi-task learning (MTL) problem. We leverage multi-modal data to capture the static and dynamic characteristics of patients, including (1) traditional features from PROMs and Electronic Health Records (EHR), (2) Ecological Momentary Assessments (EMA) collected from smartphones, and (3) sensing data from wristbands. Moreover, we introduce new features derived from the correlation of EMA and wearable features measured within the same time frame, effectively enhancing predictive performance by capturing the interdependencies between the two data modalities. Our model interpretation uncovers the complementary nature of the different data modalities and their distinctive contributions toward multiple surgical outcomes. Furthermore, through individualized decision analysis, our model identifies personal high-risk factors to aid clinical decision-making and support personalized treatment. In a clinical study involving 122 patients undergoing lumbar spine surgery, our M3TL model outperforms a diverse set of baseline methods in predictive performance, demonstrating the value of integrating multi-modal data and learning from multiple surgical outcomes. This work contributes to advancing personalized peri-operative care with accurate pre-operative predictions of multi-dimensional outcomes.
{"title":"Predicting Multi-dimensional Surgical Outcomes with Multi-modal Mobile Sensing","authors":"Ziqi Xu, Jingwen Zhang, Jacob K Greenberg, Madelyn Frumkin, Saad Javeed, Justin K. Zhang, Braeden Benedict, Kathleen Botterbush, Thomas L. Rodebaugh, Wilson Z. Ray, Chenyang Lu","doi":"10.1145/3659628","DOIUrl":"https://doi.org/10.1145/3659628","url":null,"abstract":"Pre-operative prediction of post-surgical recovery for patients is vital for clinical decision-making and personalized treatments, especially with lumbar spine surgery, where patients exhibit highly heterogeneous outcomes. Existing predictive tools mainly rely on traditional Patient-Reported Outcome Measures (PROMs), which fail to capture the long-term dynamics of patient conditions before the surgery. Moreover, existing studies focus on predicting a single surgical outcome. However, recovery from spine surgery is multi-dimensional, including multiple distinctive but interrelated outcomes, such as pain interference, physical function, and quality of recovery. In recent years, the emergence of smartphones and wearable devices has presented new opportunities to capture longitudinal and dynamic information regarding patients' conditions outside the hospital. This paper proposes a novel machine learning approach, Multi-Modal Multi-Task Learning (M3TL), using smartphones and wristbands to predict multiple surgical outcomes after lumbar spine surgeries. We formulate the prediction of pain interference, physical function, and quality of recovery as a multi-task learning (MTL) problem. We leverage multi-modal data to capture the static and dynamic characteristics of patients, including (1) traditional features from PROMs and Electronic Health Records (EHR), (2) Ecological Momentary Assessment (EMA) collected from smartphones, and (3) sensing data from wristbands. Moreover, we introduce new features derived from the correlation of EMA and wearable features measured within the same time frame, effectively enhancing predictive performance by capturing the interdependencies between the two data modalities. Our model interpretation uncovers the complementary nature of the different data modalities and their distinctive contributions toward multiple surgical outcomes. Furthermore, through individualized decision analysis, our model identifies personal high risk factors to aid clinical decision making and approach personalized treatments. In a clinical study involving 122 patients undergoing lumbar spine surgery, our M3TL model outperforms a diverse set of baseline methods in predictive performance, demonstrating the value of integrating multi-modal data and learning from multiple surgical outcomes. This work contributes to advancing personalized peri-operative care with accurate pre-operative predictions of multi-dimensional outcomes.","PeriodicalId":20553,"journal":{"name":"Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140984997","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}