As AI systems become more advanced, ensuring their alignment with a diverse range of individuals and societal values becomes increasingly critical. But how can we capture fundamental human values and assess the degree to which AI systems align with them? We introduce ValueCompass, a framework of fundamental values, grounded in psychological theory and a systematic review, for identifying and evaluating human-AI alignment. We apply ValueCompass to measure the value alignment of humans and language models (LMs) across four real-world vignettes: collaborative writing, education, the public sector, and healthcare. Our findings uncover risky misalignment between humans and LMs, such as LMs endorsing values like "Choose Own Goals" that humans largely reject. We also observe that values vary across vignettes, underscoring the need for context-aware AI alignment strategies. This work offers insights into the design space of human-AI alignment and foundations for developing AI that responsibly reflects societal values and ethics.
"ValueCompass: A Framework of Fundamental Values for Human-AI Alignment" by Hua Shen, Tiffany Knearem, Reshmi Ghosh, Yu-Ju Yang, Tanushree Mitra, Yun Huang. arXiv:2409.09586 [cs.HC], 2024-09-15.
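To make the misalignment finding concrete, a minimal sketch of one plausible alignment check follows: compare human consensus ratings with an LM's ratings of the same values. The -1/0/+1 scale, the rating numbers, and the scoring rule are illustrative assumptions, not the paper's actual ValueCompass instrument.

```python
# Hypothetical value-(mis)alignment check. Ratings use a -1 / 0 / +1
# scale (disagree / neutral / agree); the real ValueCompass survey and
# scoring procedure may differ.

def misaligned_values(human_ratings, lm_ratings):
    """Flag values where the LM and the human consensus sit on
    opposite poles of the scale (one agrees, the other disagrees)."""
    return {
        value: (h, lm_ratings[value])
        for value, h in human_ratings.items()
        if value in lm_ratings and h * lm_ratings[value] < 0
    }

# Illustrative ratings for one vignette (made-up numbers):
human = {"Choose Own Goals": -1, "Honesty": 1, "Privacy": 1}
lm = {"Choose Own Goals": 1, "Honesty": 1, "Privacy": 0}
print(misaligned_values(human, lm))  # {'Choose Own Goals': (-1, 1)}
```

This reproduces the shape of the paper's headline example: the LM agrees with "Choose Own Goals" while humans disagree, so that value is flagged.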
Tianyi Zhang, Shiquan Zhang, Le Fang, Hong Jia, Vassilis Kostakos, Simon D'Alfonso
Journaling offers significant benefits, including fostering self-reflection, enhancing writing skills, and aiding in mood monitoring. However, many people abandon the practice because traditional journaling is time-consuming, and detailed life events may be overlooked if not recorded promptly. Given that smartphones are the most widely used devices for entertainment, work, and socialization, they present an ideal platform for innovative approaches to journaling. Despite their ubiquity, the potential of using digital phenotyping, a method of unobtrusively collecting data from digital devices to gain insights into psychological and behavioral patterns, for automated journal generation has been largely underexplored. In this study, we propose AutoJournaling, the first-of-its-kind system that automatically generates journals by collecting and analyzing screenshots from smartphones. This system captures life events and corresponding emotions, offering a novel approach to digital phenotyping. We evaluated AutoJournaling by collecting screenshots every 3 seconds from three students over five days, demonstrating its feasibility and accuracy. AutoJournaling is the first framework to utilize seamlessly collected screenshots for journal generation, providing new insights into psychological states through digital phenotyping.
"AutoJournaling: A Context-Aware Journaling System Leveraging MLLMs on Smartphone Screenshots" by Tianyi Zhang, Shiquan Zhang, Le Fang, Hong Jia, Vassilis Kostakos, Simon D'Alfonso. arXiv:2409.09696 [cs.HC], 2024-09-15.
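The core of a system like AutoJournaling (capture screenshots every few seconds, describe each one, condense the stream into journal entries) can be sketched roughly as follows. The 3-second interval matches the study's sampling rate, but the stubbed captioner and the merge-consecutive-activities logic are stand-in assumptions, not the paper's implementation.

```python
from datetime import datetime, timedelta

# Stand-in for the MLLM that describes a screenshot; the real system
# would run a multimodal model over the captured image.
def describe_screenshot(shot):
    return shot["caption"]

def compose_journal(screenshots):
    """Collapse a dense screenshot stream into journal lines by merging
    consecutive screenshots that show the same activity."""
    entries = []
    for shot in screenshots:
        activity = describe_screenshot(shot)
        if entries and entries[-1]["activity"] == activity:
            entries[-1]["end"] = shot["time"]  # extend the current entry
        else:
            entries.append({"activity": activity,
                            "start": shot["time"], "end": shot["time"]})
    return [f"{e['start']:%H:%M:%S}-{e['end']:%H:%M:%S}: {e['activity']}"
            for e in entries]

# A 15-second slice of a 3-second capture stream (captions are made up):
t0 = datetime(2024, 9, 15, 9, 0, 0)
stream = [{"time": t0 + timedelta(seconds=3 * i), "caption": cap}
          for i, cap in enumerate(["reading lecture slides"] * 3 +
                                  ["messaging a friend"] * 2)]
for line in compose_journal(stream):
    print(line)
# 09:00:00-09:00:06: reading lecture slides
# 09:00:09-09:00:12: messaging a friend
```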
d/Deaf and hearing song-signers have become prevalent on video-sharing platforms, yet translating songs into sign language remains cumbersome and inaccessible. Our formative study revealed the challenges song-signers face, including semantic, syntactic, expressive, and rhythmic considerations in translation. We present ELMI, an accessible song-signing tool that assists in translating lyrics into sign language. ELMI enables users to edit glosses line by line, with real-time synced lyric highlighting and music video snippets. Users can also chat with a large language model-driven AI to discuss meaning, glossing, emoting, and timing. Through an exploratory study with 13 song-signers, we examined how ELMI facilitates their workflows and how they leverage and perceive LLM-driven chat during translation. Participants successfully adopted ELMI for song-signing, holding active discussions on the fly. They also reported improved confidence and independence in their translations, finding ELMI encouraging, constructive, and informative. We discuss design implications for leveraging LLMs in culturally sensitive song-signing translation.
"ELMI: Interactive and Intelligent Sign Language Translation of Lyrics for Song Signing" by Suhyeon Yoo, Khai N. Truong, Young-Ho Kim. arXiv:2409.09760 [cs.HC], 2024-09-15.
Mateusz Dolata, Birgit Schenk, Jara Fuhrer, Alina Marti, Gerhard Schwabe
Case and knowledge management systems are widespread at the frontline of public agencies. However, such systems are designed for collaboration within the agency rather than for face-to-face interaction with clients. If used as a collaborative resource at the frontline, case and knowledge management systems might disturb service provision by displaying unfiltered internal information, disclosing private data of other clients, or revealing the limits of frontline employees' competence (if they cannot explain something) or their authority (if they cannot override something). Observation in the German Public Employment Agency shows that employment consultants use various coping strategies during face-to-face consultations to extend the boundaries set by the case and knowledge management systems and by the rules governing their usage. The analysis of these coping strategies unveils the forces that shape the conduct of employment consultants during their contacts with clients: the consultants' own understanding of the work, the actual and perceived needs of the clients, and the political mission as well as the internal rules of the employment agency. The findings make a twofold contribution. First, they contribute to the discourse on work in employment agencies by illustrating how the complexities of the social welfare apparatus manifest themselves in singular behavioural patterns. Second, they contribute to the discourse on screen-level bureaucracy by depicting the consultants as active and conscious mediators rather than passive interfaces between the system and the client.
"When the System does not Fit: Coping Strategies of Employment Consultants" by Mateusz Dolata, Birgit Schenk, Jara Fuhrer, Alina Marti, Gerhard Schwabe. arXiv:2409.09457 [cs.HC], 2024-09-14.
Mateusz Dolata, Doris Agotai, Simon Schubiger, Gerhard Schwabe
Advisory services are ritualized encounters between an expert and an advisee. An empathetic, high-touch relationship between these two parties has been identified as the key aspect of a successful advisory encounter. To facilitate this high-touch interaction, advisors have established rituals that stress the unique, individual character of each client and each encounter. Simultaneously, organizations such as banks and insurers have rolled out tools and technologies for use in advisory services to offer a uniform experience and consistent quality across branches and advisors. As a consequence, advisors are caught between the high-touch and high-tech aspects of an advisory service. This manuscript presents a system that accommodates high-touch rituals and practices and combines them with high-tech collaboration. The proposed solution augments pen-and-paper practices with digital content and affords new material performances coherent with existing rituals. The evaluation in realistic mortgage advisory services unveils the potential of mixed reality approaches for professional, institutional settings. A blow-by-blow analysis of the conversations reveals how an advisory service can become equally high-tech and high-touch thanks to careful, ritual-oriented system design. This paper thus presents a solution to the tension between the high-touch and high-tech tendencies in advisory services.
"Pen-and-paper Rituals in Service Interaction: Combining High-touch and High-tech in Financial Advisory Encounters" by Mateusz Dolata, Doris Agotai, Simon Schubiger, Gerhard Schwabe. arXiv:2409.09462 [cs.HC], 2024-09-14.
Persuasion can be a complex process. Persuaders may need a high degree of sensitivity to understand a persuadee's states, traits, and values as they navigate the nuanced field of human interaction. Research on persuasive systems often overlooks this delicate nature of persuasion, favoring "one-size-fits-all" approaches and risking the alienation of certain users. This study examines the considerations professional burglary prevention advisors make when persuading clients to enhance their home security. It illustrates how advisors adapt their approach to each advisee's states and traits. Specifically, the study reveals how advisors deviate from intended and technologically supported practices to accommodate the individual attributes of their advisees. It identifies multiple advisee-specific aspects likely to moderate the effectiveness of persuasive efforts and suggests strategies for addressing these differences. These findings are relevant for designing personalized persuasive systems that rely on conversational modes of persuasion.
"How persuadee's psychological states and traits shape digital persuasion: Lessons learnt from mobile burglary prevention encounters" by Mateusz Dolata, Robert O. Briggs, Gerhard Schwabe. arXiv:2409.09453 [cs.HC], 2024-09-14.
Security and privacy perspectives of people in multi-user homes are a growing area of research, with many researchers reflecting on the complicated power imbalances and challenging access control issues of the devices involved. However, these studies have primarily focused on multi-user scenarios in traditional family home settings, leaving other types of multi-user home environments, such as homes shared by co-habitants without a familial relationship, under-studied. This paper closes this research gap via quantitative and qualitative analysis of results from an online survey and content analysis of sampled online posts on Reddit. It explores the complex roles of shared home users, which depend on various factors unique to the shared home environment, e.g., who owns which home devices, how home devices are used by multiple users, and the more complicated relationships between the landlord and people in the shared home and among co-habitants. Half (50.7%) of our survey participants thought that devices in a shared home are less secure than in a traditional family home. This perception was statistically significantly associated with factors such as the fear of devices being tampered with in their absence and (lack of) trust in other co-habitants and their visitors. Our study revealed new user types and relationships in a multi-user environment, such as ExternalPrimary-InternalPrimary, identified while analysing the landlord-resident relationship with regard to shared home device use. We propose a threat actor model for shared home environments, which focuses on possible malicious behaviours of current and past co-habitants of a shared home as a special type of insider threat in a home environment. We also recommend further research to understand the complex roles co-habitants can play in navigating and adapting to a shared home environment's security and privacy landscape.
"Security and Privacy Perspectives of People Living in Shared Home Environments" by Nandita Pattnaik, Shujun Li, Jason R. C. Nurse. arXiv:2409.09363 [cs.HC], 2024-09-14.
Wearable medical sensors (WMSs) are revolutionizing smart healthcare by enabling continuous, real-time monitoring of user physiological signals, especially in the field of consumer healthcare. The integration of WMSs and modern machine learning (ML) enables unprecedented solutions to efficient early-stage disease detection. Despite the success of Transformers in various fields, their application to sensitive domains, such as smart healthcare, remains underexplored due to limited data accessibility and privacy concerns. To bridge the gap between Transformer-based foundation models and WMS-based disease detection, we propose COMFORT, a continual fine-tuning framework for foundation models targeted at consumer healthcare. COMFORT introduces a novel approach for pre-training a Transformer-based foundation model on a large dataset of physiological signals exclusively collected from healthy individuals with commercially available WMSs. We adopt a masked data modeling (MDM) objective to pre-train this health foundation model. We then fine-tune the model using various parameter-efficient fine-tuning (PEFT) methods, such as low-rank adaptation (LoRA) and its variants, to adapt it to various downstream disease detection tasks that rely on WMS data. In addition, COMFORT continually stores the low-rank decomposition matrices obtained from the PEFT algorithms to construct a library for multi-disease detection. The COMFORT library enables scalable and memory-efficient disease detection on edge devices. Our experimental results demonstrate that COMFORT achieves highly competitive performance while reducing memory overhead by up to 52% relative to conventional methods. Thus, COMFORT paves the way for personalized and proactive solutions to efficient and effective early-stage disease detection for consumer healthcare.
"COMFORT: A Continual Fine-Tuning Framework for Foundation Models Targeted at Consumer Healthcare" by Chia-Hao Li, Niraj K. Jha. arXiv:2409.09549 [cs.HC], 2024-09-14.
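The memory saving behind an adapter library like COMFORT's comes from storing only the low-rank LoRA factors per task on top of one frozen foundation model. A minimal numpy sketch illustrates the idea; the hidden size, rank, and task names are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 256, 4     # illustrative hidden size and LoRA rank (r << d)

W_frozen = rng.standard_normal((d, d))  # one shared pre-trained weight

# The library stores only the per-task low-rank factors (A, B);
# the disease-task names here are hypothetical.
library = {task: (0.01 * rng.standard_normal((r, d)),  # A
                  np.zeros((d, r)))                    # B (zero-init)
           for task in ["arrhythmia", "hypertension"]}

def task_forward(x, task):
    """Frozen weight plus the selected task's low-rank update B @ A."""
    A, B = library[task]
    return x @ (W_frozen + B @ A).T

x = rng.standard_normal(d)
y = task_forward(x, "arrhythmia")

# Storing (A, B) per task instead of a full fine-tuned copy of W:
per_task, full_copy = 2 * r * d, d * d
print(f"per-task storage: {per_task / full_copy:.1%} of a full copy")
```

With these toy shapes each task adds 2rd = 2,048 parameters instead of d² = 65,536, which is why a library of many disease adapters stays edge-friendly.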
Katie Davis, Morgan Anderson, Chia-chen Yang, Sophia Choukas-Bradley, Beth T. Bell, Petr Slovak
This paper provides a broad, multi-disciplinary overview of key insights, persistent gaps, and future paths in youth digital well-being research from the perspectives of researchers who are conducting this work.
{"title":"Cross-Disciplinary Perspectives on Youth Digital Well-Being Research: Identifying Notable Developments, Persistent Gaps, and Future Directions","authors":"Katie Davis, Morgan Anderson, Chia-chen Yang, Sophia Choukas-Bradley, Beth T. Bell, Petr Slovak","doi":"arxiv-2409.09267","DOIUrl":"https://doi.org/arxiv-2409.09267","url":null,"abstract":"This paper provides a broad, multi-disciplinary overview of key insights,\u0000persistent gaps, and future paths in youth digital well-being research from the\u0000perspectives of researchers who are conducting this work.","PeriodicalId":501541,"journal":{"name":"arXiv - CS - Human-Computer Interaction","volume":"38 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142252485","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
"Multimodal Fusion with LLMs for Engagement Prediction in Natural Conversation"
Cheng Charles Ma, Kevin Hyekang Joo, Alexandria K. Vail, Sunreeta Bhattacharya, Álvaro Fernández García, Kailana Baker-Matsuoka, Sheryl Mathew, Lori L. Holt, Fernando De la Torre
Over the past decade, wearable computing devices ("smart glasses") have undergone remarkable advancements in sensor technology, design, and processing power, ushering in a new era of opportunity for high-density human behavior data. Equipped with wearable cameras, these glasses offer a unique opportunity to analyze non-verbal behavior in natural settings as individuals interact. Our focus lies in predicting engagement in dyadic interactions by scrutinizing verbal and non-verbal cues, aiming to detect signs of disinterest or confusion. Leveraging such analyses may revolutionize our understanding of human communication, foster more effective collaboration in professional environments, provide better mental health support through empathetic virtual interactions, and enhance accessibility for those with communication barriers.

In this work, we collect a dataset featuring 34 participants engaged in casual dyadic conversations, each providing self-reported engagement ratings at the end of each conversation. We introduce a novel fusion strategy using Large Language Models (LLMs) to integrate multiple behavior modalities into a "multimodal transcript" that can be processed by an LLM for behavioral reasoning tasks. Remarkably, this method achieves performance comparable to established fusion techniques even in its preliminary implementation, indicating strong potential for further research and optimization. This fusion method is one of the first to approach "reasoning" about real-world human behavior through a language model. Smart glasses provide us the ability to unobtrusively gather high-density multimodal data on human behavior, paving the way for new approaches to understanding and improving human communication with the potential for important societal benefits. The features and data collected during the studies will be made publicly available to promote further research.
{"title":"Multimodal Fusion with LLMs for Engagement Prediction in Natural Conversation","authors":"Cheng Charles Ma, Kevin Hyekang Joo, Alexandria K. Vail, Sunreeta Bhattacharya, Álvaro Fernández García, Kailana Baker-Matsuoka, Sheryl Mathew, Lori L. Holt, Fernando De la Torre","doi":"arxiv-2409.09135","DOIUrl":"https://doi.org/arxiv-2409.09135","url":null,"abstract":"Over the past decade, wearable computing devices (``smart glasses'') have\u0000undergone remarkable advancements in sensor technology, design, and processing\u0000power, ushering in a new era of opportunity for high-density human behavior\u0000data. Equipped with wearable cameras, these glasses offer a unique opportunity\u0000to analyze non-verbal behavior in natural settings as individuals interact. Our\u0000focus lies in predicting engagement in dyadic interactions by scrutinizing\u0000verbal and non-verbal cues, aiming to detect signs of disinterest or confusion.\u0000Leveraging such analyses may revolutionize our understanding of human\u0000communication, foster more effective collaboration in professional\u0000environments, provide better mental health support through empathetic virtual\u0000interactions, and enhance accessibility for those with communication barriers. In this work, we collect a dataset featuring 34 participants engaged in\u0000casual dyadic conversations, each providing self-reported engagement ratings at\u0000the end of each conversation. We introduce a novel fusion strategy using Large\u0000Language Models (LLMs) to integrate multiple behavior modalities into a\u0000``multimodal transcript'' that can be processed by an LLM for behavioral\u0000reasoning tasks. Remarkably, this method achieves performance comparable to\u0000established fusion techniques even in its preliminary implementation,\u0000indicating strong potential for further research and optimization. 
This fusion\u0000method is one of the first to approach ``reasoning'' about real-world human\u0000behavior through a language model. Smart glasses provide us the ability to\u0000unobtrusively gather high-density multimodal data on human behavior, paving the\u0000way for new approaches to understanding and improving human communication with\u0000the potential for important societal benefits. The features and data collected\u0000during the studies will be made publicly available to promote further research.","PeriodicalId":501541,"journal":{"name":"arXiv - CS - Human-Computer Interaction","volume":"49 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142252530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}