EarlyScreen: Multi-scale Instance Fusion for Predicting Neural Activation and Psychopathology in Preschool Children
Manasa Kalanadhabhatta, Adrelys Mateo Santana, Zhongyang Zhang, Deepa Ganesan, Adam S. Grabell, Tauhidur Rahman
Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., 60:1-60:39 (2022). https://doi.org/10.1145/3534583
Emotion dysregulation in early childhood is known to be associated with a higher risk of several psychopathological conditions, such as ADHD and mood and anxiety disorders. In developmental neuroscience research, emotion dysregulation is characterized by low neural activation in the prefrontal cortex during frustration. In this work, we report on an exploratory study with 94 participants aged 3.5 to 5 years, investigating whether behavioral measures automatically extracted from facial videos can predict frustration-related neural activation and differentiate between low- and high-risk individuals. We propose a novel multi-scale instance fusion framework to develop EarlyScreen, a set of classifiers trained on behavioral markers during emotion regulation. Our model successfully predicts activation levels in the prefrontal cortex with an area under the receiver operating characteristic (ROC) curve of 0.85, which is on par with widely used clinical assessment tools. Further, we classify clinical and non-clinical subjects based on their psychopathological risk with an area under the ROC curve of 0.80. Our model's predictions are consistent with standardized psychometric assessment scales, supporting its applicability as a screening procedure for emotion regulation-related psychopathological disorders. To the best of our knowledge, EarlyScreen is the first work to use automatically extracted behavioral features to characterize both neural activity and the diagnostic status of emotion regulation-related disorders in young children. We present insights from mental health professionals supporting the utility of EarlyScreen and discuss considerations for its subsequent deployment.
CCS Concepts: • Human-centered computing → Ubiquitous and mobile computing systems and tools; • Computing methodologies → Machine learning; • Applied computing → Psychology; • Applied computing → Health informatics.
MiniKers: Interaction-Powered Smart Environment Automation
Xiaoying Yang, Jacob Sayono, Jess Xu, Jiahao Li, Josiah D. Hester, Yang Zhang
Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., 149:1-149:22 (2022). https://doi.org/10.1145/3550287
Automating the operation of objects has made life easier and more convenient for billions of people, especially those with limited motor capabilities. On the other hand, even able-bodied users might not always be able to perform manual operations (e.g., when both hands are occupied), and manual operations might be undesirable for hygiene purposes (e.g., contactless devices). As a result, automation systems like motion-triggered doors, remote-control window shades, and contactless toilet lids have become increasingly popular in private and public environments. Yet, these systems are hampered by complex building wiring or short battery lifetimes, negating their positive benefits for accessibility, energy saving, healthcare, and other domains. In this paper we explore how these types of objects can be powered in perpetuity by the energy generated from a unique energy source: user interactions, specifically, the manual manipulations of objects by users who can afford them, when they can afford them. Our assumption is that users' capabilities for object operations are heterogeneous, that there are desires for both manual and automatic operations in most environments, and that automatic operations are often not needed frequently; for example, an automatic door in a public space is often manually opened many times before a need for automatic operation arises. The energy harvested by those manual operations would be sufficient to power that one automatic operation. We instantiate this idea by upcycling common everyday objects with devices that have various mechanical designs powered by a general-purpose backbone embedded system. We call these devices MiniKers. We built a custom driver circuit that enables motor mechanisms to toggle between generating power (i.e., manual operation) and actuating objects (i.e., automatic operation). We designed a wide variety of mechanical mechanisms to retrofit existing objects and evaluated our system with a 48-hour deployment study, which demonstrates the efficacy of MiniKers and sheds light on this people-as-power approach as a feasible solution to address the energy needed for smart environment automation.
AccMyrinx: Speech Synthesis with Non-Acoustic Sensor
Yunji Liang, Yuchen Qin, Qi Li, Xiaokai Yan, Zhiwen Yu, Bin Guo, S. Samtani, Yanyong Zhang
Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., 127:1-127:24 (2022). https://doi.org/10.1145/3550338
The built-in loudspeakers of mobile devices (e.g., smartphones, smartwatches, and tablets) play significant roles in human-machine interaction, such as playing music, making phone calls, and enabling voice-based interaction. Prior studies have pointed out that it is feasible to eavesdrop on the speaker via motion sensors, but whether it is possible to synthesize speech from non-acoustic signals with sub-Nyquist sampling frequency has not been studied. In this paper, we present an end-to-end model to reconstruct the acoustic waveforms that are playing on the loudspeaker through the vibration captured by the built-in accelerometer. Specifically, we present an end-to-end speech synthesis framework dubbed AccMyrinx to eavesdrop on the speaker using the built-in low-resolution accelerometer of mobile devices. AccMyrinx takes advantage of the coexistence of an accelerometer with the loudspeaker on the same motherboard and compromises the loudspeaker via the solid-borne vibrations captured by the accelerometer. Low-resolution vibration signals are fed to a wavelet-based MelGAN to generate intelligible acoustic waveforms. We conducted extensive experiments on a large-scale dataset created based on audio clips downloaded from Voice of America (VOA). The experimental results show that AccMyrinx is capable of reconstructing intelligible acoustic signals that are playing on the loudspeaker with a smoothed word error rate (SWER) of 42.67%. The quality of the synthesized speech can be severely affected by several factors, including gender, speech rate, and volume.
RFCam: Uncertainty-aware Fusion of Camera and Wi-Fi for Real-time Human Identification with Mobile Devices
Hongkai Chen, Sirajum Munir, Shane Lin
Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., 47:1-47:29 (2022). https://doi.org/10.1145/3534588
As cameras and Wi-Fi access points are widely deployed in public places, new mobile applications and services can be developed by connecting live video analytics to the mobile Wi-Fi-enabled devices of the relevant users. To achieve this, a critical challenge is to identify the person who carries a device in the video with the mobile device's network ID, e.g., MAC address. To address this issue, we propose RFCam, a system for human identification with a fusion of Wi-Fi and camera data. RFCam uses a multi-antenna Wi-Fi radio to collect CSI of Wi-Fi packets sent by mobile devices, and a camera to monitor users in the area. With low sampling rate CSI data, RFCam derives heterogeneous embedding features on location, motion, and user activity for each device over time, and fuses them with visual user features generated from video analytics to find the best matches. To mitigate the impacts of multi-user environments on wireless sensing, we develop video-assisted learning models for different features and quantify their uncertainties, and incorporate them with video analytics to rank moments and features for robust and efficient fusion. RFCam is implemented and tested in indoor environments for over 800 minutes with 25 volunteers, and extensive evaluation results demonstrate that RFCam achieves a real-time identification average accuracy of 97.01% in all experiments with up to ten users, significantly outperforming existing solutions.
MetaGanFi: Cross-Domain Unseen Individual Identification Using WiFi Signals
Jin Zhang, Zhuangzhuang Chen, Chengwen Luo, Bo Wei, S. Kanhere, Jian-qiang Li
Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., 152:1-152:21 (2022). https://doi.org/10.1145/3550306
Humans have unique gaits, and prior works show increasing potential in using WiFi signals to capture the unique signature of an individual's gait. However, existing WiFi-based human identification (HI) systems are not ready for real-world deployment due to various strong assumptions, including identification of known users and sufficient training data captured in predefined domains such as a fixed walking trajectory/orientation, WiFi layout (receiver locations), and multipath environment (deployment time and site). In this paper, we propose a WiFi-based HI system, MetaGanFi, which is able to accurately identify unseen individuals in uncontrolled domains with only one or a few samples. To achieve this, MetaGanFi proposes a domain unification model, CCG-GAN, which utilizes a conditional cycle generative adversarial network to filter out irrelevant perturbations incurred by interfering domains. Moreover, MetaGanFi proposes a domain-agnostic meta-learning model, DA-Meta, which can quickly adapt from one or a few data samples to accurately recognize unseen individuals. A comprehensive evaluation on a real-world dataset shows that MetaGanFi can identify unseen individuals with average accuracies of 87.25% and 93.50% for the 1-shot and 5-shot cases (i.e., 1 and 5 available data samples) captured in varying trajectory and multipath environments, and 86.84% and 91.25% for the 1-shot and 5-shot cases in varying WiFi layout scenarios, while the overall inference process of domain unification and identification takes about 0.1 seconds per sample.
SAMoSA: Sensing Activities with Motion and Subsampled Audio
Vimal Mollyn, Karan Ahuja, Dhruv Verma, Chris Harrison, Mayank Goel
Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., 132:1-132:19 (2022). https://doi.org/10.1145/3550284
Despite advances in human activity recognition systems, a practical, power-efficient, and privacy-sensitive activity recognition system has remained elusive. State-of-the-art activity recognition systems often require power-hungry and privacy-invasive audio data. This is especially challenging for resource-constrained wearables, such as smartwatches. To counter the need for an always-on audio-based activity recognition system, we make use of compute-optimized IMUs sampled at 50 Hz to act as a trigger for detecting activity events. Once an event is detected, a multimodal deep learning model augments the IMU data with audio captured on the smartwatch. We subsample this audio to 1 kHz, rendering spoken content unintelligible while reducing power consumption on mobile devices. Our multimodal deep learning model achieves a recognition accuracy of 92.2% across 26 activities.
ILLOC: In-Hall Localization with Standard LoRaWAN Uplink Frames
Dongfang Guo, Chaojie Gu, Linshan Jiang, W. Luo, Rui Tan
Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., 13:1-13:26 (2022). https://doi.org/10.1145/3517245
LoRaWAN is a narrowband wireless technology for ubiquitous connectivity. For various applications, it is desirable to localize LoRaWAN devices based on their uplink frames that convey application data. This localization service operates in an unobtrusive manner, in that it requires no special software instrumentation to the LoRaWAN devices. This paper investigates the feasibility of unobtrusive localization for LoRaWAN devices in hall-size indoor spaces like warehouses, airport terminals, sports centers, and museum halls. We study the TDoA-based approach, which needs to address two challenges: the poor timing performance of the LoRaWAN narrowband signal and nanosecond-level clock synchronization among anchors. We propose the ILLOC system featuring two LoRaWAN-specific techniques: (1) cross-correlation between the differential phase sequences received by two anchors to estimate TDoA, and (2) just-in-time synchronization enabled by a specially deployed LoRaWAN end device providing a time reference upon detecting a target device's transmission. In a long tunnel corridor, a 70 × 32 m² sports hall, and a 110 × 70 m² indoor plaza with extensive non-line-of-sight propagation paths, ILLOC achieves median localization errors of 6 m (with 2 anchors), 8.36 m (with 6 anchors), and 15.16 m (with 6 anchors and frame fusion), respectively. The achieved accuracy makes ILLOC useful for applications including zone-level asset tracking, misplacement detection, airport trolley management, and cybersecurity enforcement such as detecting impersonation attacks launched by remote radios. ILLOC deploys multiple anchors with known positions, built on software-defined radios for physical-layer access, and estimates the position of an off-the-shelf end device from the anchors' TDoA measurements on any single uplink frame.
M3Sense: Affect-Agnostic Multitask Representation Learning Using Multimodal Wearable Sensors
Sirat Samyoun, Md. Mofijul Islam, Tariq Iqbal, J. Stankovic
Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., 73:1-73:32 (2022). https://doi.org/10.1145/3534600
Modern smartwatches and wrist wearables with multiple physiological sensing modalities have emerged as a subtle way to detect different mental health conditions, such as anxiety, emotions, and stress. However, affect detection models that depend on wrist sensor data often provide poor performance due to inconsistent or inaccurate signals and the scarcity of labeled data representing a condition. Although learning representations based on the physiological similarities of the affective tasks offers a possibility to solve this problem, existing approaches fail to effectively generate representations that work across these multiple tasks. Moreover, the problem becomes more challenging due to the large domain gap among these affective applications and the discrepancies among the multiple sensing modalities. We present M3Sense, a multi-task, multimodal representation learning framework that effectively learns affect-agnostic physiological representations from limited labeled data and uses a novel domain alignment technique to utilize the unlabeled data from the other affective tasks to accurately detect these mental health conditions using wrist sensors only. We apply M3Sense to 3 mental health applications and quantify the achieved performance boost compared to the state of the art using extensive evaluations and ablation studies on publicly available and collected datasets. Moreover, we extensively investigate what combination of tasks and modalities aids in developing a robust Multitask Learning model for affect recognition. Our analysis shows that incorporating emotion detection in the learning models degrades the performance of anxiety and stress detection, whereas stress detection helps to boost the emotion detection performance. Our results also show that M3Sense provides consistent performance across all affective tasks and available modalities and also improves the performance of representation learning models on unseen affective tasks by 5%-60%.
SafeGait: Safeguarding Gait-based Key Generation against Vision-based Side Channel Attack Using Generative Adversarial Network
Yuezhong Wu, Mahbub Hassan, Wen Hu
Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., 80:1-80:27 (2022). https://doi.org/10.1145/3534607
Recent works have shown that wearable or implanted devices attached at different locations of the body can generate an identical security key from their independent measurements of the same gait. This has created an opportunity to realize highly secured data exchange to and from critical implanted devices. In this paper, we first demonstrate that vision can be used to easily attack such gait-based key generation; an attacker with a commodity camera can measure the gait from a distance and generate a security key with any target wearable or implanted device faster than other legitimate devices worn at different locations of the subject's body. To counter the attack, we propose a firewall that stops video-based gait measurements from proceeding with key generation, while letting measurements from inertial measurement units (IMUs), which are widely used in wearable devices to measure gait accelerations from the body, proceed. We implement the firewall concept with an IMU-vs-Video binary classifier that combines InceptionTime, an ensemble of deep Convolutional Neural Network (CNN) models for effective feature extraction from gait measurements, with a Generative Adversarial Network (GAN) that can generalize the classifier across subjects. Comprehensive evaluation with a real-world dataset shows that our proposed classifier can perform with an accuracy of 97.82%. Given that an attacker has to fool the classifier for multiple consecutive gait cycles to generate the complete key, the high single-cycle classification accuracy results in an extremely low probability for a video attacker to successfully pair with a target wearable device. More precisely, a video attacker would have a one-in-a-billion chance to successfully generate a 128-bit key, which would require the attacker to observe the subject for thousands of years.
EFRing: Enabling Thumb-to-Index-Finger Microgesture Interaction through Electric Field Sensing Using Single Smart Ring
Taizhou Chen, Tianpei Li, Xingyu Yang, Kening Zhu
Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., 161:1-161:31 (2022). https://doi.org/10.1145/3569478
We present EFRing, an index-finger-worn ring-form device for detecting thumb-to-index-finger (T2I) microgestures through the approach of electric-field (EF) sensing. Based on the signal change induced by the T2I motions, we proposed two machine-learning-based data-processing pipelines: one for recognizing/classifying discrete T2I microgestures, and the other for tracking continuous 1D T2I movements. Our experiments on the EFRing microgesture classification showed an average within-user accuracy of 89.5% and an average cross-user accuracy of 85.2% for 9 discrete T2I microgestures. For the continuous tracking of 1D T2I movements, our method achieves a mean-square error of 3.5% for the generic model and 2.3% for the personalized model. Our 1D Fitts'-law target-selection study shows that the proposed tracking method with EFRing is intuitive and accurate for real-time usage. Lastly, we proposed and discussed the potential applications for EFRing.