We contribute an analysis of situated visualizations in motion in video games for different types of data, with a focus on quantitative and categorical data representations. Video games convey a lot of data to players to help them succeed in the game. These visualizations frequently move across the screen due to camera changes or because the game elements themselves move. Our ultimate goal is to understand how motion factors affect visualization readability in video games and, subsequently, players' performance in the game. We started our work by surveying how motion currently influences different kinds of data representations in video games. We conducted a systematic review of 160 visualizations in motion in video games and extracted patterns and considerations regarding what and how visualizations currently exhibit motion factors in video games.
{"title":"Visualization in Motion in Video Games for Different Types of Data","authors":"Federica Bucchieri, Lijie Yao, Petra Isenberg","doi":"arxiv-2409.07696","DOIUrl":"https://doi.org/arxiv-2409.07696","url":null,"abstract":"We contribute an analysis of situated visualizations in motion in video games\u0000for different types of data, with a focus on quantitative and categorical data\u0000representations. Video games convey a lot of data to players, to help them\u0000succeed in the game. These visualizations frequently move across the screen due\u0000to camera changes or because the game elements themselves move. Our ultimate\u0000goal is to understand how motion factors affect visualization readability in\u0000video games and subsequently the players' performance in the game. We started\u0000our work by surveying the characteristics of how motion currently influences\u0000which kind of data representations in video games. We conducted a systematic\u0000review of 160 visualizations in motion in video games and extracted patterns\u0000and considerations regarding was what, and how visualizations currently exhibit\u0000motion factors in video games.","PeriodicalId":501541,"journal":{"name":"arXiv - CS - Human-Computer Interaction","volume":"18 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142183424","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jiashuo Cao, Wujie Gao, Yun Suen Pai, Simon Hoermann, Chen Li, Nilufar Baghaei, Mark Billinghurst
The advent of technology-enhanced interventions has significantly transformed mental health services, offering new opportunities for delivering psychotherapy, particularly in remote settings. This paper reports on a pilot study exploring the use of Virtual Reality (VR) as a medium for remote counselling. The study involved four experienced psychotherapists who evaluated three different virtual environments designed to support remote counselling. Through thematic analysis of interviews and feedback, we identified key factors that could be critical for designing effective virtual environments for counselling. These include the creation of clear boundaries, customization to meet specific therapeutic needs, and the importance of aligning the environment with various therapeutic approaches. Our findings suggest that VR can enhance the sense of presence and engagement in remote therapy, potentially improving the therapeutic relationship. In the paper we also outline areas for future research based on these pilot study results.
{"title":"Explorations in Designing Virtual Environments for Remote Counselling","authors":"Jiashuo Cao, Wujie Gao, Yun Suen Pai, Simon Hoermann, Chen Li, Nilufar Baghaei, Mark Billinghurst","doi":"arxiv-2409.07765","DOIUrl":"https://doi.org/arxiv-2409.07765","url":null,"abstract":"The advent of technology-enhanced interventions has significantly transformed\u0000mental health services, offering new opportunities for delivering\u0000psychotherapy, particularly in remote settings. This paper reports on a pilot\u0000study exploring the use of Virtual Reality (VR) as a medium for remote\u0000counselling. The study involved four experienced psychotherapists who evaluated\u0000three different virtual environments designed to support remote counselling.\u0000Through thematic analysis of interviews and feedback, we identified key factors\u0000that could be critical for designing effective virtual environments for\u0000counselling. These include the creation of clear boundaries, customization to\u0000meet specific therapeutic needs, and the importance of aligning the environment\u0000with various therapeutic approaches. Our findings suggest that VR can enhance\u0000the sense of presence and engagement in remote therapy, potentially improving\u0000the therapeutic relationship. In the paper we also outline areas for future\u0000research based on these pilot study results.","PeriodicalId":501541,"journal":{"name":"arXiv - CS - Human-Computer Interaction","volume":"3 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142183422","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
As data visualization gains popularity and projects become more interdisciplinary, there is a growing need for methods that foster creative collaboration and inform diverse audiences about data visualization. In this paper, we introduce Co-Badge, a 90-minute design activity in which participants collaboratively construct visualizations by ideating and prioritizing relevant data types, mapping them to visual variables, and constructing data badges with stationery materials. We conducted three workshops in diverse settings with participants of different backgrounds. Our findings indicate that Co-Badge offers a playful and engaging way to gain awareness of data visualization design principles without formal training while navigating the challenges of collaboration. Our work contributes to the field of data visualization education for diverse actors. We believe Co-Badge can serve as an engaging activity that introduces basic concepts of data visualization and collaboration.
{"title":"Co-badge: An Activity for Collaborative Engagement with Data Visualization Design Concepts","authors":"Damla Çay, Mary Karyda, Kitti Butter","doi":"arxiv-2409.08175","DOIUrl":"https://doi.org/arxiv-2409.08175","url":null,"abstract":"As data visualization gains popularity and projects become more\u0000interdisciplinary, there is a growing need for methods that foster creative\u0000collaboration and inform diverse audiences about data visualisation. In this\u0000paper, we introduce Co-Badge, a 90-minute design activity where participants\u0000collaboratively construct visualizations by ideating and prioritizing relevant\u0000data types, mapping them to visual variables, and constructing data badges with\u0000stationery materials. We conducted three workshops in diverse settings with\u0000participants of different backgrounds. Our findings indicate that Co-badge\u0000facilitates a playful and engaging way to gain awareness about data\u0000visualization design principles without formal training while navigating the\u0000challenges of collaboration. Our work contributes to the field of data\u0000visualization education for diverse actors. We believe Co-Badge can serve as an\u0000engaging activity that introduces basic concepts of data visualization and\u0000collaboration.","PeriodicalId":501541,"journal":{"name":"arXiv - CS - Human-Computer Interaction","volume":"22 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142183414","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hanqiu Wang, Zihao Zhan, Haoqi Shan, Siqi Dai, Max Panoff, Shuo Wang
The advent and growing popularity of Virtual Reality (VR) and Mixed Reality (MR) solutions have revolutionized the way we interact with digital platforms. The cutting-edge gaze-controlled typing methods, now prevalent in high-end models of these devices, e.g., the Apple Vision Pro, have not only improved user experience but also mitigated traditional keystroke inference attacks that relied on hand gestures, head movements, and acoustic side-channels. However, this advancement has paradoxically given birth to a new, potentially more insidious cyber threat: GAZEploit. In this paper, we unveil GAZEploit, a novel eye-tracking-based attack specifically designed to exploit this eye-tracking information by leveraging the common use of virtual appearances in VR applications. This widespread usage significantly enhances the practicality and feasibility of our attack compared to existing methods. GAZEploit takes advantage of this vulnerability to remotely extract gaze estimations and steal sensitive keystroke information across various typing scenarios, including messages, passwords, URLs, emails, and passcodes. Our research, involving 30 participants, achieved over 80% accuracy in keystroke inference. Alarmingly, our study also identified over 15 top-rated apps in the Apple Store as vulnerable to the GAZEploit attack, emphasizing the urgent need for bolstered security measures for this state-of-the-art VR/MR text entry method.
{"title":"GAZEploit: Remote Keystroke Inference Attack by Gaze Estimation from Avatar Views in VR/MR Devices","authors":"Hanqiu Wang, Zihao Zhan, Haoqi Shan, Siqi Dai, Max Panoff, Shuo Wang","doi":"arxiv-2409.08122","DOIUrl":"https://doi.org/arxiv-2409.08122","url":null,"abstract":"The advent and growing popularity of Virtual Reality (VR) and Mixed Reality\u0000(MR) solutions have revolutionized the way we interact with digital platforms.\u0000The cutting-edge gaze-controlled typing methods, now prevalent in high-end\u0000models of these devices, e.g., Apple Vision Pro, have not only improved user\u0000experience but also mitigated traditional keystroke inference attacks that\u0000relied on hand gestures, head movements and acoustic side-channels. However,\u0000this advancement has paradoxically given birth to a new, potentially more\u0000insidious cyber threat, GAZEploit. In this paper, we unveil GAZEploit, a novel eye-tracking based attack\u0000specifically designed to exploit these eye-tracking information by leveraging\u0000the common use of virtual appearances in VR applications. This widespread usage\u0000significantly enhances the practicality and feasibility of our attack compared\u0000to existing methods. GAZEploit takes advantage of this vulnerability to\u0000remotely extract gaze estimations and steal sensitive keystroke information\u0000across various typing scenarios-including messages, passwords, URLs, emails,\u0000and passcodes. Our research, involving 30 participants, achieved over 80%\u0000accuracy in keystroke inference. Alarmingly, our study also identified over 15\u0000top-rated apps in the Apple Store as vulnerable to the GAZEploit attack,\u0000emphasizing the urgent need for bolstered security measures for this\u0000state-of-the-art VR/MR text entry method.","PeriodicalId":501541,"journal":{"name":"arXiv - CS - Human-Computer Interaction","volume":"11 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142183415","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Recent advances in eXplainable AI (XAI) for education have highlighted a critical challenge: ensuring that explanations for state-of-the-art AI models are understandable for non-technical users such as educators and students. In response, we introduce iLLuMinaTE, a zero-shot, chain-of-prompts LLM-XAI pipeline inspired by Miller's cognitive model of explanation. iLLuMinaTE is designed to deliver theory-driven, actionable feedback to students in online courses. iLLuMinaTE navigates three main stages - causal connection, explanation selection, and explanation presentation - with variations drawing from eight social science theories (e.g., Abnormal Conditions, Pearl's Model of Explanation, Necessity and Robustness Selection, Contrastive Explanation). We extensively evaluate 21,915 natural language explanations of iLLuMinaTE extracted from three LLMs (GPT-4o, Gemma2-9B, Llama3-70B), with three different underlying XAI methods (LIME, Counterfactuals, MC-LIME), across students from three diverse online courses. Our evaluation involves analyses of explanation alignment with the underlying social science theory, understandability of the explanations, and a real-world user preference study with 114 university students that included a novel actionability simulation. We find that students prefer iLLuMinaTE explanations over traditional explainers 89.52% of the time. Our work provides a robust, ready-to-use framework for effectively communicating hybrid XAI-driven insights in education, with significant generalization potential for other human-centric fields.
{"title":"From Explanations to Action: A Zero-Shot, Theory-Driven LLM Framework for Student Performance Feedback","authors":"Vinitra Swamy, Davide Romano, Bhargav Srinivasa Desikan, Oana-Maria Camburu, Tanja Käser","doi":"arxiv-2409.08027","DOIUrl":"https://doi.org/arxiv-2409.08027","url":null,"abstract":"Recent advances in eXplainable AI (XAI) for education have highlighted a\u0000critical challenge: ensuring that explanations for state-of-the-art AI models\u0000are understandable for non-technical users such as educators and students. In\u0000response, we introduce iLLuMinaTE, a zero-shot, chain-of-prompts LLM-XAI\u0000pipeline inspired by Miller's cognitive model of explanation. iLLuMinaTE is\u0000designed to deliver theory-driven, actionable feedback to students in online\u0000courses. iLLuMinaTE navigates three main stages - causal connection,\u0000explanation selection, and explanation presentation - with variations drawing\u0000from eight social science theories (e.g. Abnormal Conditions, Pearl's Model of\u0000Explanation, Necessity and Robustness Selection, Contrastive Explanation). We\u0000extensively evaluate 21,915 natural language explanations of iLLuMinaTE\u0000extracted from three LLMs (GPT-4o, Gemma2-9B, Llama3-70B), with three different\u0000underlying XAI methods (LIME, Counterfactuals, MC-LIME), across students from\u0000three diverse online courses. Our evaluation involves analyses of explanation\u0000alignment to the social science theory, understandability of the explanation,\u0000and a real-world user preference study with 114 university students containing\u0000a novel actionability simulation. We find that students prefer iLLuMinaTE\u0000explanations over traditional explainers 89.52% of the time. Our work provides\u0000a robust, ready-to-use framework for effectively communicating hybrid\u0000XAI-driven insights in education, with significant generalization potential for\u0000other human-centric fields.","PeriodicalId":501541,"journal":{"name":"arXiv - CS - Human-Computer Interaction","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142183301","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Eike Schneiders, Tina Seabrooke, Joshua Krook, Richard Hyde, Natalie Leesakul, Jeremie Clos, Joel Fischer
Large Language Models (LLMs) are seemingly infiltrating every domain, and the legal context is no exception. In this paper, we present the results of three experiments (total N=288) that investigated lay people's willingness to act upon, and their ability to discriminate between, LLM- and lawyer-generated legal advice. In Experiment 1, participants judged their willingness to act on legal advice when the source of the advice was either known or unknown. When the advice source was unknown, participants indicated that they were significantly more willing to act on the LLM-generated advice. This result was replicated in Experiment 2. Intriguingly, despite participants indicating higher willingness to act on LLM-generated advice in Experiments 1 and 2, participants discriminated between the LLM- and lawyer-generated texts significantly above chance level in Experiment 3. Lastly, we discuss potential explanations and risks of our findings, limitations and future work, and the importance of language complexity and real-world comparability.
{"title":"Objection Overruled! Lay People can Distinguish Large Language Models from Lawyers, but still Favour Advice from an LLM","authors":"Eike Schneiders, Tina Seabrooke, Joshua Krook, Richard Hyde, Natalie Leesakul, Jeremie Clos, Joel Fischer","doi":"arxiv-2409.07871","DOIUrl":"https://doi.org/arxiv-2409.07871","url":null,"abstract":"Large Language Models (LLMs) are seemingly infiltrating every domain, and the\u0000legal context is no exception. In this paper, we present the results of three\u0000experiments (total N=288) that investigated lay people's willingness to act\u0000upon, and their ability to discriminate between, LLM- and lawyer-generated\u0000legal advice. In Experiment 1, participants judged their willingness to act on\u0000legal advice when the source of the advice was either known or unknown. When\u0000the advice source was unknown, participants indicated that they were\u0000significantly more willing to act on the LLM-generated advice. This result was\u0000replicated in Experiment 2. Intriguingly, despite participants indicating\u0000higher willingness to act on LLM-generated advice in Experiments 1 and 2,\u0000participants discriminated between the LLM- and lawyer-generated texts\u0000significantly above chance-level in Experiment 3. Lastly, we discuss potential\u0000explanations and risks of our findings, limitations and future work, and the\u0000importance of language complexity and real-world comparability.","PeriodicalId":501541,"journal":{"name":"arXiv - CS - Human-Computer Interaction","volume":"23 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142183418","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Service robots are increasingly deployed in public spaces, performing functional tasks such as making deliveries. To better integrate them into our social environment and enhance their adoption, we consider integrating social identities within delivery robots along with their functional identity. We conducted a virtual reality-based pilot study to explore people's perceptions and acceptance of delivery robots that perform prosocial behavior. Preliminary findings from thematic analysis of semi-structured interviews illustrate people's ambivalence about this dual identity. We discuss the emerging themes in light of social identity theory, the framing effect, and human-robot intergroup dynamics. Building on these insights, we propose that the next generation of delivery robots should use peer-based framing, an updated value proposition, and an interactive design that places greater emphasis on expressing intentionality and emotional responses.
{"title":"More than just a Tool: People's Perception and Acceptance of Prosocial Delivery Robots as Fellow Road Users","authors":"Vivienne Bihe Chi, Elise Ulwelling, Kevin Salubre, Shashank Mehrotra, Teruhisa Misu, Kumar Akash","doi":"arxiv-2409.07815","DOIUrl":"https://doi.org/arxiv-2409.07815","url":null,"abstract":"Service robots are increasingly deployed in public spaces, performing\u0000functional tasks such as making deliveries. To better integrate them into our\u0000social environment and enhance their adoption, we consider integrating social\u0000identities within delivery robots along with their functional identity. We\u0000conducted a virtual reality-based pilot study to explore people's perceptions\u0000and acceptance of delivery robots that perform prosocial behavior. Preliminary\u0000findings from thematic analysis of semi-structured interviews illustrate\u0000people's ambivalence about dual identity. We discussed the emerging themes in\u0000light of social identity theory, framing effect, and human-robot intergroup\u0000dynamics. Building on these insights, we propose that the next generation of\u0000delivery robots should use peer-based framing, an updated value proposition,\u0000and an interactive design that places greater emphasis on expressing\u0000intentionality and emotional responses.","PeriodicalId":501541,"journal":{"name":"arXiv - CS - Human-Computer Interaction","volume":"7 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142183421","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sijie Zhuo, Robert Biddle, Jared Daniel Recomendable, Giovanni Russello, Danielle Lottridge
Phishing emails typically masquerade as reputable identities to trick people into providing sensitive information and credentials. Despite advancements in cybersecurity, attackers continuously adapt, posing ongoing threats to individuals and organisations. While email users are the last line of defence, they are not always well prepared to detect phishing emails. This study examines how workload affects susceptibility to phishing, using eye-tracking technology to observe participants' reading patterns and interactions with tailored phishing emails. Incorporating both quantitative and qualitative analysis, we investigate users' attention to two phishing indicators, the email sender and hyperlink URLs, and their reasons for assessing the trustworthiness of emails and falling for phishing emails. Our results provide concrete evidence that attention to the email sender can reduce phishing susceptibility. While we found no evidence that attention to the actual URL in the browser influences phishing detection, attention to the display text that masks links can increase phishing susceptibility. We also highlight how email relevance, familiarity, and visual presentation impact first impressions of email trustworthiness and phishing susceptibility.
{"title":"Eyes on the Phish(er): Towards Understanding Users' Email Processing Pattern and Mental Models in Phishing Detection","authors":"Sijie Zhuo, Robert Biddle, Jared Daniel Recomendable, Giovanni Russello, Danielle Lottridge","doi":"arxiv-2409.07717","DOIUrl":"https://doi.org/arxiv-2409.07717","url":null,"abstract":"Phishing emails typically masquerade themselves as reputable identities to\u0000trick people into providing sensitive information and credentials. Despite\u0000advancements in cybersecurity, attackers continuously adapt, posing ongoing\u0000threats to individuals and organisations. While email users are the last line\u0000of defence, they are not always well-prepared to detect phishing emails. This\u0000study examines how workload affects susceptibility to phishing, using\u0000eye-tracking technology to observe participants' reading patterns and\u0000interactions with tailored phishing emails. Incorporating both quantitative and\u0000qualitative analysis, we investigate users' attention to two phishing\u0000indicators, email sender and hyperlink URLs, and their reasons for assessing\u0000the trustworthiness of emails and falling for phishing emails. Our results\u0000provide concrete evidence that attention to the email sender can reduce\u0000phishing susceptibility. While we found no evidence that attention to the\u0000actual URL in the browser influences phishing detection, attention to the text\u0000masking links can increase phishing susceptibility. We also highlight how email\u0000relevance, familiarity, and visual presentation impact first impressions of\u0000email trustworthiness and phishing susceptibility.","PeriodicalId":501541,"journal":{"name":"arXiv - CS - Human-Computer Interaction","volume":"49 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142183423","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jiahao Nick Li, Zhuohao (Jerry) Zhang, Jiaju Ma
People often capture memories through photos, screenshots, and videos. While existing AI-based tools enable querying this data using natural language, they mostly only support retrieving individual pieces of information, such as certain objects in photos, and struggle with answering more complex queries that involve interpreting interconnected memories, such as event sequences. We conducted a one-month diary study to collect realistic user queries and generated a taxonomy of necessary contextual information for integrating with captured memories. We then introduce OmniQuery, a novel system that is able to answer complex personal memory-related questions that require extracting and inferring contextual information. OmniQuery augments single captured memories by integrating scattered contextual information from multiple interconnected memories, retrieves relevant memories, and uses a large language model (LLM) to generate comprehensive answers. In human evaluations, we show the effectiveness of OmniQuery with an accuracy of 71.5%, and it outperformed a conventional RAG system, winning or tying 74.5% of the time.
{"title":"OmniQuery: Contextually Augmenting Captured Multimodal Memory to Enable Personal Question Answering","authors":"Jiahao Nick LiJerry, ZhuohaoJerry, Zhang, Jiaju Ma","doi":"arxiv-2409.08250","DOIUrl":"https://doi.org/arxiv-2409.08250","url":null,"abstract":"People often capture memories through photos, screenshots, and videos. While\u0000existing AI-based tools enable querying this data using natural language, they\u0000mostly only support retrieving individual pieces of information like certain\u0000objects in photos and struggle with answering more complex queries that involve\u0000interpreting interconnected memories like event sequences. We conducted a\u0000one-month diary study to collect realistic user queries and generated a\u0000taxonomy of necessary contextual information for integrating with captured\u0000memories. We then introduce OmniQuery, a novel system that is able to answer\u0000complex personal memory-related questions that require extracting and inferring\u0000contextual information. OmniQuery augments single captured memories through\u0000integrating scattered contextual information from multiple interconnected\u0000memories, retrieves relevant memories, and uses a large language model (LLM) to\u0000comprehensive answers. In human evaluations, we show the effectiveness of\u0000OmniQuery with an accuracy of 71.5%, and it outperformed a conventional RAG\u0000system, winning or tying in 74.5% of the time.","PeriodicalId":501541,"journal":{"name":"arXiv - CS - Human-Computer Interaction","volume":"64 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142183413","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The paper explores the intersection of AI art and blindness, as existing AI research has primarily focused on AI art's reception and impact on sighted artists and consumers. To address this gap, the researcher interviewed six blind artists from various visual art mediums and levels of blindness about the generative AI image platform Midjourney. The participants shared text prompts and discussed their reactions to the generated images with the sighted researcher. The findings highlight blind artists' interest in AI images as a collaborative tool, but also their concerns about cultural perceptions and labeling of AI-generated art. They also underscore unique challenges, such as potential misunderstandings and stereotypes about blindness leading to exclusion. The study advocates for greater inclusion of blind individuals in AI art, emphasizing the need to address their specific needs and experiences in developing AI art technologies.
{"title":"Exploring Use and Perceptions of Generative AI Art Tools by Blind Artists","authors":"Gayatri Raman, Erin Brady","doi":"arxiv-2409.08226","DOIUrl":"https://doi.org/arxiv-2409.08226","url":null,"abstract":"The paper explores the intersection of AI art and blindness, as existing AI\u0000research has primarily focused on AI art's reception and impact, on sighted\u0000artists and consumers. To address this gap, the researcher interviewed six\u0000blind artists from various visual art mediums and levels of blindness about the\u0000generative AI image platform Midjourney. The participants shared text prompts\u0000and discussed their reactions to the generated images with the sighted\u0000researcher. The findings highlight blind artists' interest in AI images as a\u0000collaborative tool but express concerns about cultural perceptions and labeling\u0000of AI-generated art. They also underscore unique challenges, such as potential\u0000misunderstandings and stereotypes about blindness leading to exclusion. The\u0000study advocates for greater inclusion of blind individuals in AI art,\u0000emphasizing the need to address their specific needs and experiences in\u0000developing AI art technologies.","PeriodicalId":501541,"journal":{"name":"arXiv - CS - Human-Computer Interaction","volume":"56 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142183416","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}