Tooth brushing monitors have the potential to enhance oral hygiene and encourage the development of healthy brushing habits. However, previous studies fall short of recognizing each tooth due to limitations in external sensors and variations among users. To address these challenges, we present ToothFairy, a real-time tooth-by-tooth brushing monitor that uses earphone reverse signals captured within the oral cavity to identify each tooth during brushing. The key component of ToothFairy is a novel bone-conducted acoustic attenuation model, which quantifies sound propagation within the oral cavity. This model eliminates the need for machine learning and can be calibrated with just one second of brushing data per tooth from a new user. ToothFairy also addresses practical issues such as brushing detection and tooth region determination. Results from extensive experiments, involving 10 volunteers and 25 combinations of five commercial off-the-shelf toothbrush models and five earphone models, show that ToothFairy achieves tooth recognition with an average accuracy of 90.5%.
{"title":"ToothFairy","authors":"Yang Wang, Feng Hong, Yufei Jiang, Chenyu Bao, Chao Liu, Zhongwen Guo","doi":"10.1145/3631412","DOIUrl":"https://doi.org/10.1145/3631412","url":null,"abstract":"Tooth brushing monitors have the potential to enhance oral hygiene and encourage the development of healthy brushing habits. However, previous studies fall short of recognizing each tooth due to limitations in external sensors and variations among users. To address these challenges, we present ToothFairy, a real-time tooth-by-tooth brushing monitor that uses earphone reverse signals captured within the oral cavity to identify each tooth during brushing. The key component of ToothFairy is a novel bone-conducted acoustic attenuation model, which quantifies sound propagation within the oral cavity. This model eliminates the need for machine learning and can be calibrated with just one second of brushing data for each tooth by a new user. ToothFairy also addresses practical issues such as brushing detection and tooth region determination. Results from extensive experiments, involving 10 volunteers and 25 combinations of five commercial off-the-shelf toothbrush and earphone models each, show that ToothFairy achieves tooth recognition with an average accuracy of 90.5%.","PeriodicalId":20553,"journal":{"name":"Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139437625","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Anran Xu, Zhongyi Zhou, Kakeru Miyazaki, Ryo Yoshikawa, S. Hosio, Koji Yatani
The world today is increasingly visual. Many of the most popular online social networking services are largely powered by images, making image privacy protection a critical research topic in the fields of ubiquitous computing, usable security, and human-computer interaction (HCI). One topical issue is understanding privacy-threatening content in images that are shared online. This dataset article introduces DIPA2, an open-source image dataset that offers object-level annotations with high-level reasoning properties to show perceptions of privacy among different cultures. DIPA2 provides 5,897 annotations describing perceived privacy risks of 3,347 objects in 1,304 images. The annotations contain the type of the object and four additional privacy metrics: 1) information type indicating what kind of information may leak if the image containing the object is shared, 2) a 7-point Likert item estimating the perceived severity of privacy leakages, and 3) and 4) intended recipient scopes when annotators assume they are either image owners or allowing others to repost the image. Our dataset contains unique data from two cultures: we recruited annotators from both Japan and the U.K. to demonstrate the impact of culture on object-level privacy perceptions. In this paper, we first describe how we designed and constructed DIPA2, along with an analysis of the collected annotations. Second, we provide two machine-learning baselines to demonstrate how DIPA2 challenges the current image privacy recognition task. DIPA2 facilitates various types of research on image privacy, including machine learning methods inferring privacy threats in complex scenarios, quantitative analysis of cultural influences on privacy preferences, understanding of image sharing behaviors, and promotion of cyber hygiene for general user populations.
{"title":"DIPA2","authors":"Anran Xu, Zhongyi Zhou, Kakeru Miyazaki, Ryo Yoshikawa, S. Hosio, Koji Yatani","doi":"10.1145/3631439","DOIUrl":"https://doi.org/10.1145/3631439","url":null,"abstract":"The world today is increasingly visual. Many of the most popular online social networking services are largely powered by images, making image privacy protection a critical research topic in the fields of ubiquitous computing, usable security, and human-computer interaction (HCI). One topical issue is understanding privacy-threatening content in images that are shared online. This dataset article introduces DIPA2, an open-sourced image dataset that offers object-level annotations with high-level reasoning properties to show perceptions of privacy among different cultures. DIPA2 provides 5,897 annotations describing perceived privacy risks of 3,347 objects in 1,304 images. The annotations contain the type of the object and four additional privacy metrics: 1) information type indicating what kind of information may leak if the image containing the object is shared, 2) a 7-point Likert item estimating the perceived severity of privacy leakages, and 3) intended recipient scopes when annotators assume they are either image owners or allowing others to repost the image. Our dataset contains unique data from two cultures: We recruited annotators from both Japan and the U.K. to demonstrate the impact of culture on object-level privacy perceptions. In this paper, we first illustrate how we designed and performed the construction of DIPA2, along with data analysis of the collected annotations. Second, we provide two machine-learning baselines to demonstrate how DIPA2 challenges the current image privacy recognition task. DIPA2 facilitates various types of research on image privacy, including machine learning methods inferring privacy threats in complex scenarios, quantitative analysis of cultural influences on privacy preferences, understanding of image sharing behaviors, and promotion of cyber hygiene for general user populations.","PeriodicalId":20553,"journal":{"name":"Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139437922","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yasmine Djebrouni, Nawel Benarba, Ousmane Touat, Pasquale De Rosa, Sara Bouchenak, Angela Bonifati, Pascal Felber, Vania Marangozova, V. Schiavoni
Federated learning (FL) is a distributed machine learning paradigm that enables data owners to collaborate on training models while preserving data privacy. As FL effectively leverages decentralized and sensitive data sources, it is increasingly used in ubiquitous computing, including remote healthcare, activity recognition, and mobile applications. However, FL raises ethical and social concerns as it may introduce bias with regard to sensitive attributes such as race, gender, and location. Mitigating FL bias is thus a major research challenge. In this paper, we propose Astral, a novel bias mitigation system for FL. Astral introduces a model aggregation approach that selects the most effective aggregation weights for combining FL clients' models. It guarantees a predefined fairness objective by constraining bias below a given threshold while keeping model accuracy as high as possible. Astral handles bias with respect to single and multiple sensitive attributes and supports all bias metrics. Our comprehensive evaluation on seven real-world datasets with three popular bias metrics shows that Astral outperforms state-of-the-art FL bias mitigation techniques in terms of bias mitigation and model accuracy. Moreover, we show that Astral is robust against data heterogeneity and scalable in terms of data size and number of FL clients. Astral's code base is publicly available.
{"title":"Bias Mitigation in Federated Learning for Edge Computing","authors":"Yasmine Djebrouni, Nawel Benarba, Ousmane Touat, Pasquale De Rosa, Sara Bouchenak, Angela Bonifati, Pascal Felber, Vania Marangozova, V. Schiavoni","doi":"10.1145/3631455","DOIUrl":"https://doi.org/10.1145/3631455","url":null,"abstract":"Federated learning (FL) is a distributed machine learning paradigm that enables data owners to collaborate on training models while preserving data privacy. As FL effectively leverages decentralized and sensitive data sources, it is increasingly used in ubiquitous computing including remote healthcare, activity recognition, and mobile applications. However, FL raises ethical and social concerns as it may introduce bias with regard to sensitive attributes such as race, gender, and location. Mitigating FL bias is thus a major research challenge. In this paper, we propose Astral, a novel bias mitigation system for FL. Astral provides a novel model aggregation approach to select the most effective aggregation weights to combine FL clients' models. It guarantees a predefined fairness objective by constraining bias below a given threshold while keeping model accuracy as high as possible. Astral handles the bias of single and multiple sensitive attributes and supports all bias metrics. Our comprehensive evaluation on seven real-world datasets with three popular bias metrics shows that Astral outperforms state-of-the-art FL bias mitigation techniques in terms of bias mitigation and model accuracy. Moreover, we show that Astral is robust against data heterogeneity and scalable in terms of data size and number of FL clients. Astral's code base is publicly available.","PeriodicalId":20553,"journal":{"name":"Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139437951","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deep Learning models are a standard solution for sensor-based Human Activity Recognition (HAR), but their deployment is often limited by labeled data scarcity and models' opacity. Neuro-Symbolic AI (NeSy) provides an interesting research direction to mitigate these issues by infusing knowledge about context information into HAR deep learning classifiers. However, existing NeSy methods for context-aware HAR require computationally expensive symbolic reasoners during classification, making them less suitable for deployment on resource-constrained devices (e.g., mobile devices). Additionally, NeSy approaches for context-aware HAR have never been evaluated on in-the-wild datasets, and their generalization capabilities in real-world scenarios are questionable. In this work, we propose a novel approach based on a semantic loss function that infuses knowledge constraints into the HAR model during the training phase, avoiding symbolic reasoning during classification. Our results on scripted and in-the-wild datasets show how different semantic loss functions outperform a purely data-driven model. We also compare our solution with existing NeSy methods and analyze each approach's strengths and weaknesses. Our semantic loss remains the only NeSy solution that can be deployed as a single DNN without the need for symbolic reasoning modules, reaching recognition rates close to (and in some cases better than) existing approaches.
{"title":"Semantic Loss","authors":"Luca Arrotta, Gabriele Civitarese, Claudio Bettini","doi":"10.1145/3631407","DOIUrl":"https://doi.org/10.1145/3631407","url":null,"abstract":"Deep Learning models are a standard solution for sensor-based Human Activity Recognition (HAR), but their deployment is often limited by labeled data scarcity and models' opacity. Neuro-Symbolic AI (NeSy) provides an interesting research direction to mitigate these issues by infusing knowledge about context information into HAR deep learning classifiers. However, existing NeSy methods for context-aware HAR require computationally expensive symbolic reasoners during classification, making them less suitable for deployment on resource-constrained devices (e.g., mobile devices). Additionally, NeSy approaches for context-aware HAR have never been evaluated on in-the-wild datasets, and their generalization capabilities in real-world scenarios are questionable. In this work, we propose a novel approach based on a semantic loss function that infuses knowledge constraints in the HAR model during the training phase, avoiding symbolic reasoning during classification. Our results on scripted and in-the-wild datasets show the impact of different semantic loss functions in outperforming a purely data-driven model. We also compare our solution with existing NeSy methods and analyze each approach's strengths and weaknesses. Our semantic loss remains the only NeSy solution that can be deployed as a single DNN without the need for symbolic reasoning modules, reaching recognition rates close (and better in some cases) to existing approaches.","PeriodicalId":20553,"journal":{"name":"Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139438019","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
B. U. Demirel, Ting Dang, Khaldoon Al-Naimi, F. Kawsar, A. Montanari
Earables (in-ear wearables) are gaining increasing attention for sensing applications and healthcare research thanks to their ergonomics and non-invasive nature. However, air leakages between the device and the user's ear, resulting from daily activities or wearing variabilities, can decrease the performance of applications, interfere with calibrations, and reduce the robustness of the overall system. Existing literature lacks established methods for estimating the degree of air leaks (i.e., seal integrity) to provide information for earable applications. In this work, we propose a novel unobtrusive method for estimating the air leakage level of earbuds based on an in-ear microphone. The proposed method estimates the magnitude of distortions, reflections, and external noise in the ear canal while excluding the speaker output by learning the speaker-to-microphone transfer function, which allows us to perform the task unobtrusively. Using the obtained residual signal in the ear canal, we extract three features and deploy a machine-learning model for estimating the air leakage level. We investigated our system under various conditions to validate its robustness and resilience against motion and other artefacts. Our extensive experimental evaluation shows that the proposed method can track air leakage levels under different daily activities. "The best computer is a quiet, invisible servant." ~Mark Weiser
{"title":"Unobtrusive Air Leakage Estimation for Earables with In-ear Microphones","authors":"B. U. Demirel, Ting Dang, Khaldoon Al-Naimi, F. Kawsar, A. Montanari","doi":"10.1145/3631405","DOIUrl":"https://doi.org/10.1145/3631405","url":null,"abstract":"Earables (in-ear wearables) are gaining increasing attention for sensing applications and healthcare research thanks to their ergonomy and non-invasive nature. However, air leakages between the device and the user's ear, resulting from daily activities or wearing variabilities, can decrease the performance of applications, interfere with calibrations, and reduce the robustness of the overall system. Existing literature lacks established methods for estimating the degree of air leaks (i.e., seal integrity) to provide information for the earable applications. In this work, we proposed a novel unobtrusive method for estimating the air leakage level of earbuds based on an in-ear microphone. The proposed method aims to estimate the magnitude of distortions, reflections, and external noise in the ear canal while excluding the speaker output by learning the speaker-to-microphone transfer function which allows us to perform the task unobtrusively. Using the obtained residual signal in the ear canal, we extract three features and deploy a machine-learning model for estimating the air leakage level. We investigated our system under various conditions to validate its robustness and resilience against the motion and other artefacts. Our extensive experimental evaluation shows that the proposed method can track air leakage levels under different daily activities. \"The best computer is a quiet, invisible servant.\" ~Mark Weiser","PeriodicalId":20553,"journal":{"name":"Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139437377","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The proliferation of the Internet of Things is calling for new modalities that enable human interaction with smart objects. Recent research has explored RFID tags as passive sensors to detect finger touch. However, existing approaches either rely on custom-built RFID readers or are limited to pre-trained finger-swiping gestures. In this paper, we introduce KeyStub, which can discriminate multiple discrete keystrokes on an RFID tag. KeyStub interfaces commodity RFID ICs with multiple microwave-band resonant stubs that act as keys. Each stub's geometry is designed to create a predefined impedance mismatch to the RFID IC upon a keystroke, which in turn translates into a known amplitude and phase shift, remotely detectable by an RFID reader. KeyStub combines two ICs' signals through a single common-mode antenna and performs differential detection to avoid the need for calibration and ensure reliability in heavy multi-path environments. Our experiments using a commercial off-the-shelf RFID reader and ICs show that up to 8 buttons can be detected and decoded with accuracy greater than 95%. KeyStub points towards a novel way of using resonant stubs to augment RF antenna structures, thus enabling new passive wireless interaction modalities.
{"title":"KeyStub","authors":"John Nolan, Kun Qian, Xinyu Zhang","doi":"10.1145/3631442","DOIUrl":"https://doi.org/10.1145/3631442","url":null,"abstract":"The proliferation of the Internet of Things is calling for new modalities that enable human interaction with smart objects. Recent research has explored RFID tags as passive sensors to detect finger touch. However, existing approaches either rely on custom-built RFID readers or are limited to pre-trained finger-swiping gestures. In this paper, we introduce KeyStub, which can discriminate multiple discrete keystrokes on an RFID tag. KeyStub interfaces with commodity RFID ICs with multiple microwave-band resonant stubs as keys. Each stub's geometry is designed to create a predefined impedance mismatch to the RFID IC upon a keystroke, which in turn translates into a known amplitude and phase shift, remotely detectable by an RFID reader. KeyStub combines two ICs' signals through a single common-mode antenna and performs differential detection to evade the need for calibration and ensure reliability in heavy multi-path environments. Our experiments using a commercial-off-the-shelf RFID reader and ICs show that up to 8 buttons can be detected and decoded with accuracy greater than 95%. KeyStub points towards a novel way of using resonant stubs to augment RF antenna structures, thus enabling new passive wireless interaction modalities.","PeriodicalId":20553,"journal":{"name":"Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139437703","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper presents a study on the touch precision of an eyes-free, body-based interface using on-body and near-body touch methods with and without skin contact. We evaluate user touch accuracy on four different button layouts. These layouts progressively increase the number of buttons between adjacent body joints, resulting in 12, 20, 28, and 36 touch buttons distributed across the body. Our study indicates that the on-body method achieved an accuracy beyond 95% for the 12- and 20-button layouts, whereas the near-body method reached this accuracy only for the 12-button layout. Investigating user touch patterns, we applied SVM classifiers that learn individual touch patterns, extending both the on-body and near-body methods to support up to the 28-button layout. However, using generalized touch patterns did not significantly improve accuracy for more complex layouts, highlighting considerable differences in individual touch habits. When evaluating user experience metrics such as workload perception, confidence, convenience, and willingness-to-use, users consistently favored the 20-button layout regardless of the touch technique used. Remarkably, the 20-button layout, when applied to on-body touch methods, does not necessitate personal touch patterns, showcasing an optimal balance of practicality, effectiveness, and user experience without the need for trained models. In contrast, near-body touch targeting the 20-button layout needs a personalized model; otherwise, the 12-button layout offers the best immediate practicality.
{"title":"BodyTouch","authors":"Wen-Wei Cheng, Liwei Chan","doi":"10.1145/3631426","DOIUrl":"https://doi.org/10.1145/3631426","url":null,"abstract":"This paper presents a study on the touch precision of an eye-free, body-based interface using on-body and near-body touch methods with and without skin contact. We evaluate user touch accuracy on four different button layouts. These layouts progressively increase the number of buttons between adjacent body joints, resulting in 12, 20, 28, and 36 touch buttons distributed across the body. Our study indicates that the on-body method achieved an accuracy beyond 95% for the 12- and 20-button layouts, whereas the near-body method only for the 12-button layout. Investigating user touch patterns, we applied SVM classifiers, which boost both the on-body and near-body methods to support up to the 28-button layouts by learning individual touch patterns. However, using generalized touch patterns did not significantly improve accuracy for more complex layouts, highlighting considerable differences in individual touch habits. When evaluating user experience metrics such as workload perception, confidence, convenience, and willingness-to-use, users consistently favored the 20-button layout regardless of the touch technique used. Remarkably, the 20-button layout, when applied to on-body touch methods, does not necessitate personal touch patterns, showcasing an optimal balance of practicality, effectiveness, and user experience without the need for trained models. In contrast, the near-body touch targeting the 20-button layout needs a personalized model; otherwise, the 12-button layout offers the best immediate practicality.","PeriodicalId":20553,"journal":{"name":"Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139437719","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A variety of consumer Augmented Reality (AR) applications have been released on mobile devices and novel immersive headsets over the last five years, creating a breadth of new AR-enabled experiences. However, these applications, particularly those designed for immersive headsets, require users to employ unfamiliar gestural input and adopt novel interaction paradigms. To better understand how everyday users discover gestures and to classify the types of interaction challenges they face, we observed how 25 novices with diverse backgrounds and levels of technical knowledge used four different AR applications requiring a range of interaction techniques. A detailed analysis of gesture interaction traces showed that users struggled to discover the correct gestures, with the majority of errors occurring when participants could not determine the correct sequence of actions to perform or could not evaluate their actions. To further reflect on the prevalence of our findings, we carried out an expert validation study with eight professional AR designers, engineers, and researchers. We discuss implications for designing discoverable gestural input techniques that align with users' mental models, inventing AR-specific onboarding and help systems, and enhancing system-level machine recognition.
{"title":"Do I Just Tap My Headset?","authors":"Anjali Khurana, Michael Glueck, Parmit K. Chilana","doi":"10.1145/3631451","DOIUrl":"https://doi.org/10.1145/3631451","url":null,"abstract":"A variety of consumer Augmented Reality (AR) applications have been released on mobile devices and novel immersive headsets over the last five years, creating a breadth of new AR-enabled experiences. However, these applications, particularly those designed for immersive headsets, require users to employ unfamiliar gestural input and adopt novel interaction paradigms. To better understand how everyday users discover gestures and classify the types of interaction challenges they face, we observed how 25 novices from diverse backgrounds and technical knowledge used four different AR applications requiring a range of interaction techniques. A detailed analysis of gesture interaction traces showed that users struggled to discover the correct gestures, with the majority of errors occurring when participants could not determine the correct sequence of actions to perform or could not evaluate their actions. To further reflect on the prevalence of our findings, we carried out an expert validation study with eight professional AR designers, engineers, and researchers. We discuss implications for designing discoverable gestural input techniques that align with users' mental models, inventing AR-specific onboarding and help systems, and enhancing system-level machine recognition.","PeriodicalId":20553,"journal":{"name":"Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139437784","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wenqiang Chen, Yexin Hu, Wei Song, Yingcheng Liu, Antonio Torralba, Wojciech Matusik
Human mesh reconstruction is essential for various applications, including virtual reality, motion capture, sports performance analysis, and healthcare monitoring. In healthcare contexts such as nursing homes, it is crucial to employ plausible and non-invasive methods for human mesh reconstruction that preserve privacy and dignity. Traditional vision-based techniques encounter challenges related to occlusion, viewpoint limitations, lighting conditions, and privacy concerns. In this research, we present CAvatar, a real-time human mesh reconstruction approach that innovatively utilizes pressure maps recorded by a tactile carpet as input. This advanced, non-intrusive technology obviates the need for cameras during usage, thereby safeguarding privacy. Our approach addresses several challenges, such as the limited spatial resolution of tactile sensors, extracting meaningful information from noisy pressure maps, and accommodating user variations and multiple users. We have developed an attention-based deep learning network, complemented by a discriminator network, to predict 3D human pose and shape from 2D pressure maps with notable accuracy. Our model demonstrates promising results, with a mean per joint position error (MPJPE) of 5.89 cm and a per vertex error (PVE) of 6.88 cm. To the best of our knowledge, we are the first to generate 3D meshes of human activities solely using tactile carpet signals, offering a novel approach that addresses privacy concerns and surpasses the limitations of existing vision-based and wearable solutions. A demonstration of CAvatar is shown at https://youtu.be/ZpO3LEsgV7Y.
{"title":"CAvatar","authors":"Wenqiang Chen, Yexin Hu, Wei Song, Yingcheng Liu, Antonio Torralba, Wojciech Matusik","doi":"10.1145/3631424","DOIUrl":"https://doi.org/10.1145/3631424","url":null,"abstract":"Human mesh reconstruction is essential for various applications, including virtual reality, motion capture, sports performance analysis, and healthcare monitoring. In healthcare contexts such as nursing homes, it is crucial to employ plausible and non-invasive methods for human mesh reconstruction that preserve privacy and dignity. Traditional vision-based techniques encounter challenges related to occlusion, viewpoint limitations, lighting conditions, and privacy concerns. In this research, we present CAvatar, a real-time human mesh reconstruction approach that innovatively utilizes pressure maps recorded by a tactile carpet as input. This advanced, non-intrusive technology obviates the need for cameras during usage, thereby safeguarding privacy. Our approach addresses several challenges, such as the limited spatial resolution of tactile sensors, extracting meaningful information from noisy pressure maps, and accommodating user variations and multiple users. We have developed an attention-based deep learning network, complemented by a discriminator network, to predict 3D human pose and shape from 2D pressure maps with notable accuracy. Our model demonstrates promising results, with a mean per joint position error (MPJPE) of 5.89 cm and a per vertex error (PVE) of 6.88 cm. To the best of our knowledge, we are the first to generate 3D mesh of human activities solely using tactile carpet signals, offering a novel approach that addresses privacy concerns and surpasses the limitations of existing vision-based and wearable solutions. The demonstration of CAvatar is shown at https://youtu.be/ZpO3LEsgV7Y.","PeriodicalId":20553,"journal":{"name":"Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139437792","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Objects engaged by users' hands carry rich contextual information because of their strong correlation with user activities. Tools such as toothbrushes and wipes indicate cleansing and sanitation, while mice and keyboards imply work. Much research has endeavored to sense hand-engaged objects, supplying wearables with implicit interactions and ambient computing with personal informatics. We propose TextureSight, a smart-ring sensor that detects hand-engaged objects by sensing their distinctive surface textures using laser speckle imaging in a ring form factor. We conducted a two-day experience sampling study to investigate the unicity and repeatability of the object-texture combinations across routine objects. We grounded our sensing with a theoretical model and simulations, powered it with state-of-the-art deep neural net techniques, and evaluated it with a user study. TextureSight constitutes a valuable addition to the literature for its capability to sense passive objects without emitting EMI or vibration and its elimination of a lens, which preserves user privacy, leading to a new, practical method for activity recognition and context-aware computing.
{"title":"TextureSight","authors":"Xue Wang, Yang Zhang","doi":"10.1145/3631413","DOIUrl":"https://doi.org/10.1145/3631413","url":null,"abstract":"Objects engaged by users' hands contain rich contextual information for their strong correlation with user activities. Tools such as toothbrushes and wipes indicate cleansing and sanitation, while mice and keyboards imply work. Much research has been endeavored to sense hand-engaged objects to supply wearables with implicit interactions or ambient computing with personal informatics. We propose TextureSight, a smart-ring sensor that detects hand-engaged objects by detecting their distinctive surface textures using laser speckle imaging on a ring form factor. We conducted a two-day experience sampling study to investigate the unicity and repeatability of the object-texture combinations across routine objects. We grounded our sensing with a theoretical model and simulations, powered it with state-of-the-art deep neural net techniques, and evaluated it with a user study. TextureSight constitutes a valuable addition to the literature for its capability to sense passive objects without emission of EMI or vibration and its elimination of lens for preserving user privacy, leading to a new, practical method for activity recognition and context-aware computing.","PeriodicalId":20553,"journal":{"name":"Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139437880","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}