Decision-making is increasingly supported by machine recommendations. In healthcare, for example, clinical decision support systems help physicians choose a treatment option for a patient. In such settings, people can rely too much on these systems, which impairs their own reasoning. The European AI Act addresses this risk of over-reliance: Article 14 on human oversight postulates that people should be able "to remain aware of the possible tendency of automatically relying or over-relying on the output". Similarly, the EU High-Level Expert Group identifies human agency and oversight as the first of seven key requirements for trustworthy AI. This position paper proposes a conceptual approach in which the machine generates questions about the decision at hand in order to promote decision-making autonomy. This engagement in turn allows for oversight of recommender systems. A systematic, interdisciplinary investigation (spanning, e.g., machine learning, user experience design, psychology, and philosophy of technology) of human-machine interaction in decision-making can provide insights into questions such as: How can human oversight be increased and over- and under-reliance on machine recommendations be calibrated? How can decision-making autonomy be strengthened so that decision-makers remain aware of possibilities beyond automated suggestions that reproduce the status quo?
{"title":"Questioning AI: Promoting Decision-Making Autonomy Through Reflection","authors":"Simon WS Fischer","doi":"arxiv-2409.10250","DOIUrl":"https://doi.org/arxiv-2409.10250","url":null,"abstract":"Decision-making is increasingly supported by machine recommendations. In\u0000healthcare, for example, a clinical decision support system is used by the\u0000physician to find a treatment option for a patient. In doing so, people can\u0000rely too much on these systems, which impairs their own reasoning process. The\u0000European AI Act addresses the risk of over-reliance and postulates in Article\u000014 on human oversight that people should be able \"to remain aware of the\u0000possible tendency of automatically relying or over-relying on the output\".\u0000Similarly, the EU High-Level Expert Group identifies human agency and oversight\u0000as the first of seven key requirements for trustworthy AI. The following\u0000position paper proposes a conceptual approach to generate machine questions\u0000about the decision at hand, in order to promote decision-making autonomy. This\u0000engagement in turn allows for oversight of recommender systems. The systematic\u0000and interdisciplinary investigation (e.g., machine learning, user experience\u0000design, psychology, philosophy of technology) of human-machine interaction in\u0000relation to decision-making provides insights to questions like: how to\u0000increase human oversight and calibrate over- and under-reliance on machine\u0000recommendations; how to increase decision-making autonomy and remain aware of\u0000other possibilities beyond automated suggestions that repeat the status-quo?","PeriodicalId":501541,"journal":{"name":"arXiv - CS - Human-Computer Interaction","volume":"101 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142252473","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper provides a comprehensive survey of sentiment analysis in the context of artificial intelligence (AI) and large language models (LLMs). Sentiment analysis, a core task in natural language processing (NLP), has evolved significantly from traditional rule-based methods to advanced deep learning techniques. The study traces this historical development, highlighting the transition from lexicon-based and pattern-based approaches to more sophisticated machine learning and deep learning models. Key challenges are discussed, including handling bilingual texts, detecting sarcasm, and addressing biases. The paper reviews state-of-the-art approaches, identifies emerging trends, and outlines future research directions to advance the field. By synthesizing current methodologies and exploring future opportunities, the survey aims to provide a thorough understanding of sentiment analysis in the AI and LLM era.
{"title":"Comprehensive Study on Sentiment Analysis: From Rule-based to modern LLM based system","authors":"Shailja Gupta, Rajesh Ranjan, Surya Narayan Singh","doi":"arxiv-2409.09989","DOIUrl":"https://doi.org/arxiv-2409.09989","url":null,"abstract":"This paper provides a comprehensive survey of sentiment analysis within the\u0000context of artificial intelligence (AI) and large language models (LLMs).\u0000Sentiment analysis, a critical aspect of natural language processing (NLP), has\u0000evolved significantly from traditional rule-based methods to advanced deep\u0000learning techniques. This study examines the historical development of\u0000sentiment analysis, highlighting the transition from lexicon-based and\u0000pattern-based approaches to more sophisticated machine learning and deep\u0000learning models. Key challenges are discussed, including handling bilingual\u0000texts, detecting sarcasm, and addressing biases. The paper reviews\u0000state-of-the-art approaches, identifies emerging trends, and outlines future\u0000research directions to advance the field. By synthesizing current methodologies\u0000and exploring future opportunities, this survey aims to understand sentiment\u0000analysis in the AI and LLM context thoroughly.","PeriodicalId":501541,"journal":{"name":"arXiv - CS - Human-Computer Interaction","volume":"208 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142252489","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Recommender systems are powerful decision-making tools, but they are often operationalized as black-box models whose algorithms are neither accessible nor interpretable by human operators. This can cause confusion and frustration for the operator and lead to unsatisfactory outcomes. While the field of explainable AI has made remarkable strides by interpreting and explaining algorithms to human operators, gaps remain in the human's understanding of the recommender system. This paper investigates the relative impact of using context, i.e., properties of the decision-making task and environment, to align human and algorithmic understanding of the state of the world (judgment) and thereby improve joint human-recommender performance, compared to post-hoc algorithmic explanations. We conducted an empirical, between-subjects experiment in which participants worked with an automated recommender system to complete a decision-making task. We manipulated the method of transparency (shared contextual information to support shared judgment vs. algorithmic explanations) and recorded participants' understanding of the task, the recommender system, and their overall performance. Both techniques yielded equivalent agreement on final decisions. However, participants who saw task context were less prone to over-rely on the recommender system and could better pinpoint the conditions under which the AI erred. Both methods improved participants' confidence in their own decision-making and increased mental demand equally, with negligible frustration. These results present an alternative to post-hoc explanations for improving team performance and illustrate the impact of judgment on human cognition when working with recommender systems.
{"title":"Aligning Judgment Using Task Context and Explanations to Improve Human-Recommender System Performance","authors":"Divya Srivastava, Karen M. Feigh","doi":"arxiv-2409.10717","DOIUrl":"https://doi.org/arxiv-2409.10717","url":null,"abstract":"Recommender systems, while a powerful decision making tool, are often\u0000operationalized as black box models, such that their AI algorithms are not\u0000accessible or interpretable by human operators. This in turn can cause\u0000confusion and frustration for the operator and result in unsatisfactory\u0000outcomes. While the field of explainable AI has made remarkable strides in\u0000addressing this challenge by focusing on interpreting and explaining the\u0000algorithms to human operators, there are remaining gaps in the human's\u0000understanding of the recommender system. This paper investigates the relative\u0000impact of using context, properties of the decision making task and\u0000environment, to align human and AI algorithm understanding of the state of the\u0000world, i.e. judgment, to improve joint human-recommender performance as\u0000compared to utilizing post-hoc algorithmic explanations. We conducted an\u0000empirical, between-subjects experiment in which participants were asked to work\u0000with an automated recommender system to complete a decision making task. We\u0000manipulated the method of transparency (shared contextual information to\u0000support shared judgment vs algorithmic explanations) and record the human's\u0000understanding of the task, the recommender system, and their overall\u0000performance. We found that both techniques yielded equivalent agreement on\u0000final decisions. However, those who saw task context had less tendency to\u0000over-rely on the recommender system and were able to better pinpoint in what\u0000conditions the AI erred. Both methods improved participants' confidence in\u0000their own decision making, and increased mental demand equally and frustration\u0000negligibly. These results present an alternative approach to improving team\u0000performance to post-hoc explanations and illustrate the impact of judgment on\u0000human cognition in working with recommender systems.","PeriodicalId":501541,"journal":{"name":"arXiv - CS - Human-Computer Interaction","volume":"65 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142252430","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Patrick Paetzold, David Hägele, Marina Evers, Daniel Weiskopf, Oliver Deussen
Current research provides methods to communicate uncertainty and adapts classical algorithms of the visualization pipeline to take the uncertainty into account. Various existing visualization frameworks include methods to present uncertain data but do not offer transformation techniques tailored to uncertain data. Therefore, we propose a software package for uncertainty-aware data analysis in Python (UADAPy) offering methods for uncertain data along the visualization pipeline. We aim to provide a platform that is the foundation for further integration of uncertainty algorithms and visualizations. It provides common utility functionality to support research in uncertainty-aware visualization algorithms and makes state-of-the-art research results accessible to the end user. The project is available at https://github.com/UniStuttgart-VISUS/uadapy.
{"title":"UADAPy: An Uncertainty-Aware Visualization and Analysis Toolbox","authors":"Patrick Paetzold, David Hägele, Marina Evers, Daniel Weiskopf, Oliver Deussen","doi":"arxiv-2409.10217","DOIUrl":"https://doi.org/arxiv-2409.10217","url":null,"abstract":"Current research provides methods to communicate uncertainty and adapts\u0000classical algorithms of the visualization pipeline to take the uncertainty into\u0000account. Various existing visualization frameworks include methods to present\u0000uncertain data but do not offer transformation techniques tailored to uncertain\u0000data. Therefore, we propose a software package for uncertainty-aware data\u0000analysis in Python (UADAPy) offering methods for uncertain data along the\u0000visualization pipeline. We aim to provide a platform that is the foundation for\u0000further integration of uncertainty algorithms and visualizations. It provides\u0000common utility functionality to support research in uncertainty-aware\u0000visualization algorithms and makes state-of-the-art research results accessible\u0000to the end user. The project is available at\u0000https://github.com/UniStuttgart-VISUS/uadapy.","PeriodicalId":501541,"journal":{"name":"arXiv - CS - Human-Computer Interaction","volume":"208 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142252474","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
TOTTA refers to guiding the spatial position and rotation of a real or virtual tool (TO) towards a real or virtual target (TA), a key task in Mixed Reality applications. Task error can have critical consequences for safety, performance, and quality, for example in surgical implantology or industrial maintenance scenarios. The TOTTA problem lacks a dedicated study; existing work is scattered across domains with isolated designs. This work contributes a systematic review of TOTTA visual widgets, studying 70 unique designs from 24 papers. TOTTA is commonly guided by visual overlap, an intuitive, pre-attentive 'collimation' feedback of simple-shaped widgets: Box, 3D Axes, 3D Model, 2D Crosshair, Globe, Tetrahedron, Line, and Plane. The review finds that TO and TA are often represented with the same shape and are distinguished by topological elements (e.g., edges, vertices, faces), colors, transparency levels, added shapes, widget quantity, and size. Some designs provide continuous feedback during manipulation, relative to the distance between TO and TA, through text, dynamic color, sonification, and amplified graphical visualization. Some approaches trigger discrete 'target reached' feedback, such as color alteration, added sound, TA shape change, and added text. We found a lack of gold standards, including in testing procedures, as current ones are limited to partial sets with different, incomparable setups (target configurations, avatars, backgrounds, etc.). We also found a participant bias towards right-handed, young, male, non-color-impaired subjects.
{"title":"Precise Tool to Target Positioning Widgets (TOTTA) in Spatial Environments: A Systematic Review","authors":"Mine Dastan, Michele Fiorentino, Antonio E. Uva","doi":"arxiv-2409.10239","DOIUrl":"https://doi.org/arxiv-2409.10239","url":null,"abstract":"TOTTA outlines the spatial position and rotation guidance of a real/virtual\u0000tool (TO) towards a real/virtual target (TA), which is a key task in Mixed\u0000Reality applications. The task error can have critical consequences regarding\u0000safety, performance, and quality, such as in surgical implantology or\u0000industrial maintenance scenarios. The TOTTA problem lacks a dedicated study and\u0000is scattered across different domains with isolated designs. This work\u0000contributes to a systematic review of the TOTTA visual widgets, studying 70\u0000unique designs from 24 papers. TOTTA is commonly guided by visual overlap an\u0000intuitive, pre-attentive 'collimation' feedback of simple-shaped widgets: Box,\u00003D Axes, 3D Model, 2D Crosshair, Globe, Tetrahedron, Line, and Plane. Our\u0000research discovers that TO and TA are often represented with the same shape.\u0000They are distinguished by topological elements (e.g., edges, vertices, faces),\u0000colors, transparency levels, and added shapes, widget quantity, and size.\u0000Meanwhile, some designs provide continuous 'during manipulation feedback'\u0000relative to the distance between TO and TA by text, dynamic color,\u0000sonification, and amplified graphical visualization. Some approaches trigger\u0000discrete 'TA reached feedback,' such as color alteration, added sound, TA shape\u0000change, and added text. We found a lack of golden standards, including in\u0000testing procedures, as current ones are limited to partial sets with different\u0000and incomparable setups (different target configurations, avatar, background,\u0000etc.). We also found a bias in participants: right-handed, young male,\u0000non-color impaired.","PeriodicalId":501541,"journal":{"name":"arXiv - CS - Human-Computer Interaction","volume":"18 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142252476","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xiangzhe Yuan, Jiajun Wang, Siying Hu, Andrew Cheung, Zhicong Lu
As the demand for computer science (CS) skills grows, mastering foundational concepts is crucial yet challenging for novice learners. To address this challenge, we present KoroT-3E, an AI-based system that creates personalized musical mnemonics to enhance both memory retention and understanding of CS concepts. KoroT-3E enables users to transform complex concepts into memorable lyrics and to compose melodies that suit their musical preferences. We conducted semi-structured interviews (n=12) to investigate why novice learners find it challenging to memorize and understand CS concepts. The findings, combined with constructivist learning theory, informed our initial design, which was then refined following consultations with CS education experts. An empirical experiment (n=36) showed that participants using KoroT-3E (n=18) significantly outperformed the control group (n=18), with improved memory efficiency, increased motivation, and a positive learning experience. These findings demonstrate the effectiveness of integrating multimodal generative AI into CS education to create personalized and interactive learning experiences.
{"title":"KoroT-3E: A Personalized Musical Mnemonics Tool for Enhancing Memory Retention of Complex Computer Science Concepts","authors":"Xiangzhe Yuan, Jiajun Wang, Siying Hu, Andrew Cheung, Zhicong Lu","doi":"arxiv-2409.10446","DOIUrl":"https://doi.org/arxiv-2409.10446","url":null,"abstract":"As the demand for computer science (CS) skills grows, mastering foundational\u0000concepts is crucial yet challenging for novice learners. To address this\u0000challenge, we present KoroT-3E, an AI-based system that creates personalized\u0000musical mnemonics to enhance both memory retention and understanding of\u0000concepts in CS. KoroT-3E enables users to transform complex concepts into\u0000memorable lyrics and compose melodies that suit their musical preferences. We\u0000conducted semi-structured interviews (n=12) to investigate why novice learners\u0000find it challenging to memorize and understand CS concepts. The findings,\u0000combined with constructivist learning theory, established our initial design,\u0000which was then refined following consultations with CS education experts. An\u0000empirical experiment(n=36) showed that those using KoroT-3E (n=18)\u0000significantly outperformed the control group (n=18), with improved memory\u0000efficiency, increased motivation, and a positive learning experience. These\u0000findings demonstrate the effectiveness of integrating multimodal generative AI\u0000into CS education to create personalized and interactive learning experiences.","PeriodicalId":501541,"journal":{"name":"arXiv - CS - Human-Computer Interaction","volume":"39 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142252436","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Large Language Models (LLMs) are widely used in healthcare, but limitations like hallucinations, incomplete information, and bias hinder their reliability. To address these issues, researchers released the Build Your Own expert Bot (BYOeB) platform, which enables developers to create LLM-powered chatbots with integrated expert verification. CataractBot, its first implementation, provides expert-verified responses to cataract surgery questions. A pilot evaluation showed its potential; however, the study had a small sample size and was primarily qualitative. In this work, we conducted a large-scale 24-week deployment of CataractBot involving 318 patients and attendants who sent 1,992 messages, with 91.71% of responses verified by seven experts. Analysis of interaction logs revealed that medical questions significantly outnumbered logistical ones, hallucinations were negligible, and experts rated 84.52% of medical answers as accurate. As the knowledge base expanded with expert corrections, system performance improved by 19.02%, reducing expert workload. These insights guide the design of future LLM-powered chatbots.
{"title":"Learnings from a Large-Scale Deployment of an LLM-Powered Expert-in-the-Loop Healthcare Chatbot","authors":"Bhuvan Sachdeva, Pragnya Ramjee, Geeta Fulari, Kaushik Murali, Mohit Jain","doi":"arxiv-2409.10354","DOIUrl":"https://doi.org/arxiv-2409.10354","url":null,"abstract":"Large Language Models (LLMs) are widely used in healthcare, but limitations\u0000like hallucinations, incomplete information, and bias hinder their reliability.\u0000To address these, researchers released the Build Your Own expert Bot (BYOeB)\u0000platform, enabling developers to create LLM-powered chatbots with integrated\u0000expert verification. CataractBot, its first implementation, provides\u0000expert-verified responses to cataract surgery questions. A pilot evaluation\u0000showed its potential; however the study had a small sample size and was\u0000primarily qualitative. In this work, we conducted a large-scale 24-week\u0000deployment of CataractBot involving 318 patients and attendants who sent 1,992\u0000messages, with 91.71% of responses verified by seven experts. Analysis of\u0000interaction logs revealed that medical questions significantly outnumbered\u0000logistical ones, hallucinations were negligible, and experts rated 84.52% of\u0000medical answers as accurate. As the knowledge base expanded with expert\u0000corrections, system performance improved by 19.02%, reducing expert workload.\u0000These insights guide the design of future LLM-powered chatbots.","PeriodicalId":501541,"journal":{"name":"arXiv - CS - Human-Computer Interaction","volume":"6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142252437","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yu Fu, Shunan Guo, Jane Hoffswell, Victor S. Bursztyn, Ryan Rossi, John Stasko
Fact-checking data claims requires data evidence retrieval and analysis, which can become tedious and intractable when done manually. This work presents Aletheia, an automated fact-checking prototype designed to facilitate data claims verification and enhance data evidence communication. For verification, we utilize a pre-trained LLM to parse the semantics for evidence retrieval. To effectively communicate the data evidence, we design representations in two forms: data tables and visualizations, tailored to various data fact types. Additionally, we design interactions that showcase a real-world application of these techniques. We evaluate the performance of two core NLP tasks with a curated dataset comprising 400 data claims and compare the two representation forms regarding viewers' assessment time, confidence, and preference via a user study with 20 participants. The evaluation offers insights into the feasibility and bottlenecks of using LLMs for data fact-checking tasks, potential advantages and disadvantages of using visualizations over data tables, and design recommendations for presenting data evidence.
{"title":"\"The Data Says Otherwise\"-Towards Automated Fact-checking and Communication of Data Claims","authors":"Yu Fu, Shunan Guo, Jane Hoffswell, Victor S. Bursztyn, Ryan Rossi, John Stasko","doi":"arxiv-2409.10713","DOIUrl":"https://doi.org/arxiv-2409.10713","url":null,"abstract":"Fact-checking data claims requires data evidence retrieval and analysis,\u0000which can become tedious and intractable when done manually. This work presents\u0000Aletheia, an automated fact-checking prototype designed to facilitate data\u0000claims verification and enhance data evidence communication. For verification,\u0000we utilize a pre-trained LLM to parse the semantics for evidence retrieval. To\u0000effectively communicate the data evidence, we design representations in two\u0000forms: data tables and visualizations, tailored to various data fact types.\u0000Additionally, we design interactions that showcase a real-world application of\u0000these techniques. We evaluate the performance of two core NLP tasks with a\u0000curated dataset comprising 400 data claims and compare the two representation\u0000forms regarding viewers' assessment time, confidence, and preference via a user\u0000study with 20 participants. The evaluation offers insights into the feasibility\u0000and bottlenecks of using LLMs for data fact-checking tasks, potential\u0000advantages and disadvantages of using visualizations over data tables, and\u0000design recommendations for presenting data evidence.","PeriodicalId":501541,"journal":{"name":"arXiv - CS - Human-Computer Interaction","volume":"17 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142268563","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mental health disorders are among the most prevalent diseases worldwide, affecting nearly one in four people. Despite their widespread impact, the intervention rate remains below 25%, largely due to the significant cooperation required from patients for both diagnosis and intervention. The core issue behind this low treatment rate is stigma, which discourages over half of those affected from seeking help. This paper presents MindGuard, an accessible, stigma-free, and professional mobile mental healthcare system designed to provide mental health first aid. The heart of MindGuard is an innovative edge LLM, equipped with professional mental health knowledge, that seamlessly integrates objective mobile sensor data with subjective Ecological Momentary Assessment records to deliver personalized screening and intervention conversations. We conduct a broad evaluation of MindGuard using open datasets spanning four years and real-world deployment across various mobile devices involving 20 subjects for two weeks. Remarkably, MindGuard achieves results comparable to GPT-4 and outperforms its counterpart with more than 10 times the model size. We believe that MindGuard paves the way for mobile LLM applications, potentially revolutionizing mental healthcare practices by substituting self-reporting and intervention conversations with passive, integrated monitoring within daily life, thus ensuring accessible and stigma-free mental health support.
{"title":"MindGuard: Towards Accessible and Sitgma-free Mental Health First Aid via Edge LLM","authors":"Sijie Ji, Xinzhe Zheng, Jiawei Sun, Renqi Chen, Wei Gao, Mani Srivastava","doi":"arxiv-2409.10064","DOIUrl":"https://doi.org/arxiv-2409.10064","url":null,"abstract":"Mental health disorders are among the most prevalent diseases worldwide,\u0000affecting nearly one in four people. Despite their widespread impact, the\u0000intervention rate remains below 25%, largely due to the significant cooperation\u0000required from patients for both diagnosis and intervention. The core issue\u0000behind this low treatment rate is stigma, which discourages over half of those\u0000affected from seeking help. This paper presents MindGuard, an accessible,\u0000stigma-free, and professional mobile mental healthcare system designed to\u0000provide mental health first aid. The heart of MindGuard is an innovative edge\u0000LLM, equipped with professional mental health knowledge, that seamlessly\u0000integrates objective mobile sensor data with subjective Ecological Momentary\u0000Assessment records to deliver personalized screening and intervention\u0000conversations. We conduct a broad evaluation of MindGuard using open datasets\u0000spanning four years and real-world deployment across various mobile devices\u0000involving 20 subjects for two weeks. Remarkably, MindGuard achieves results\u0000comparable to GPT-4 and outperforms its counterpart with more than 10 times the\u0000model size. We believe that MindGuard paves the way for mobile LLM\u0000applications, potentially revolutionizing mental healthcare practices by\u0000substituting self-reporting and intervention conversations with passive,\u0000integrated monitoring within daily life, thus ensuring accessible and\u0000stigma-free mental health support.","PeriodicalId":501541,"journal":{"name":"arXiv - CS - Human-Computer Interaction","volume":"22 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142268568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Minsuk Chang, Soohyun Lee, Aeri Cho, Hyeon Jeon, Seokhyeon Park, Cindy Xiong Bearfield, Jinwook Seo
We introduce a novel crowdsourcing method for identifying important areas in graphical images through punch-hole labeling. Traditional methods, such as gaze trackers and mouse-based annotations, generate continuous data and can be impractical in crowdsourcing scenarios: they require many participants, and the resulting data can be noisy. In contrast, our method first segments the graphical image with a grid and drops a portion of the patches (punch holes). Then, we iteratively ask the labeler to validate each annotation with holes, narrowing the annotation down to only the most important area. This approach aims to reduce annotation noise in crowdsourcing by standardizing the annotations while enhancing labeling efficiency and reliability. Preliminary findings on fundamental charts demonstrate that punch-hole labeling can effectively pinpoint critical regions, highlighting its potential for broader application in visualization research, particularly in studying graphical perception with large groups of users. Future work aims to make the algorithm faster and to demonstrate its utility through large-scale experiments.
{"title":"Efficiently Crowdsourcing Visual Importance with Punch-Hole Annotation","authors":"Minsuk Chang, Soohyun Lee, Aeri Cho, Hyeon Jeon, Seokhyeon Park, Cindy Xiong Bearfield, Jinwook Seo","doi":"arxiv-2409.10459","DOIUrl":"https://doi.org/arxiv-2409.10459","url":null,"abstract":"We introduce a novel crowdsourcing method for identifying important areas in\u0000graphical images through punch-hole labeling. Traditional methods, such as gaze\u0000trackers and mouse-based annotations, which generate continuous data, can be\u0000impractical in crowdsourcing scenarios. They require many participants, and the\u0000outcome data can be noisy. In contrast, our method first segments the graphical\u0000image with a grid and drops a portion of the patches (punch holes). Then, we\u0000iteratively ask the labeler to validate each annotation with holes, narrowing\u0000down the annotation only having the most important area. This approach aims to\u0000reduce annotation noise in crowdsourcing by standardizing the annotations while\u0000enhancing labeling efficiency and reliability. Preliminary findings from\u0000fundamental charts demonstrate that punch-hole labeling can effectively\u0000pinpoint critical regions. This also highlights its potential for broader\u0000application in visualization research, particularly in studying large-scale\u0000users' graphical perception. Our future work aims to enhance the algorithm to\u0000achieve faster labeling speed and prove its utility through large-scale\u0000experiments.","PeriodicalId":501541,"journal":{"name":"arXiv - CS - Human-Computer Interaction","volume":"15 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142252434","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}