This article summarizes the author's presentation in the New Faculty Highlight at the Thirty-Eighth AAAI Conference on Artificial Intelligence. It discusses the desired properties of representations for enabling robust human–robot interaction. Examples from the author's work are presented to show how to build these properties into models for performing tasks with natural language guidance and engaging in social interactions with other agents.
Yen-Ling Kuo. "Learning representations for robust human–robot interaction." AI Magazine 45(4): 561–568, 2024. Published 2024-10-21. https://doi.org/10.1002/aaai.12197 (open access).
Zainab Akhtar, Umair Qazi, Aya El-Sakka, Rizwan Sadiq, Ferda Ofli, Muhammad Imran
The absence of comprehensive situational awareness information poses a significant challenge for humanitarian organizations during their response efforts. We present Flood Insights, an end-to-end system that ingests data from multiple nontraditional data sources such as remote sensing, social sensing, and geospatial data. We employ state-of-the-art natural language processing and computer vision models to identify flood exposure, ground-level damage and flood reports, and, most importantly, urgent needs of affected people. We deployed and tested the system during a recent real-world catastrophe, the 2022 Pakistan floods, to surface critical situational and damage information at the district level. We validated the system's effectiveness through various statistical analyses using official ground-truth data, showcasing its strong performance and the explanatory power of integrating multiple data sources. Moreover, the system was commended by the United Nations Development Programme stationed in Pakistan, as well as local authorities, for pinpointing hard-hit districts and enhancing disaster response.
Zainab Akhtar, Umair Qazi, Aya El-Sakka, Rizwan Sadiq, Ferda Ofli, Muhammad Imran. "Fusing remote and social sensing data for flood impact mapping." AI Magazine 45(4): 486–501, 2024. Published 2024-10-18. https://doi.org/10.1002/aaai.12196 (open access).
Graph machine learning (GML) has been successfully applied across a wide range of tasks. Nonetheless, GML faces significant challenges in generalizing over out-of-distribution (OOD) data, which raises concerns about its wider applicability. Recent advancements have underscored the crucial role of causality-driven approaches in overcoming these generalization challenges. Distinct from traditional GML methods that primarily rely on statistical dependencies, causality-focused strategies delve into the underlying causal mechanisms of data generation and model prediction, thus significantly improving the generalization of GML across different environments. This paper offers a thorough review of recent progress in causality-based approaches to GML generalization. We elucidate the fundamental concepts of employing causality to enhance graph model generalization and categorize the various approaches, providing detailed descriptions of their methodologies and the connections among them. Furthermore, we explore the incorporation of causality in other important areas of trustworthy GML, such as explanation, fairness, and robustness. Concluding with a discussion on potential future research directions, this review seeks to articulate the continuing development and future potential of causality in enhancing the trustworthiness of GML.
Jing Ma. "A survey of out-of-distribution generalization for graph machine learning from a causal view." AI Magazine 45(4): 537–548, 2024. Published 2024-10-18. https://doi.org/10.1002/aaai.12202 (open access).
Johannes Rehm, Irina Reshodko, Stian Zimmermann Børresen, Odd Erik Gundersen
This work introduces the design, development, and deployment of a virtual driving instructor (VDI) for enhanced driver education. The VDI provides personalized, real-time feedback to students in a driving simulator, addressing some of the limitations of traditional driver instruction. Employing a hybrid AI system, the VDI combines rule-based agents, learning-based agents, knowledge graphs, and Bayesian networks to assess and monitor student performance in a comprehensive manner. Implemented in multiple simulators at a driving school in Norway, the system aims to leverage AI and driving simulation to improve both the learning experience and the efficiency of instruction. Initial feedback from students has been largely positive, highlighting the effectiveness of this integration while also pointing to areas for further improvement. This marks a significant stride in infusing technology into driver education, offering a scalable and efficient approach to instruction.
Johannes Rehm, Irina Reshodko, Stian Zimmermann Børresen, Odd Erik Gundersen. "The virtual driving instructor: Multi-agent system collaborating via knowledge graph for scalable driver education." AI Magazine 45(4): 514–525, 2024. Published 2024-10-18. https://doi.org/10.1002/aaai.12201 (open access).
In the realm of business automation, conversational assistants are emerging as the primary method for making automation software accessible to users in various business sectors. Access to automation primarily occurs through application programming interfaces (APIs) and robotic process automations (RPAs). To effectively convert APIs and RPAs into chatbots at scale, it is crucial to establish an automated process for generating data and training models that can recognize user intentions, identify questions for conversational slot filling, and provide recommendations for subsequent actions. In this paper, we present a technique for enhancing and generating natural language conversational artifacts from API specifications using large language models (LLMs). The goal is to utilize LLMs in the "build" phase to assist humans in creating skills for digital assistants. As a result, the system does not need to rely on LLMs during conversations with business users, leading to efficient deployment. Along with enabling digital assistants, our system employs LLMs as proxies to simulate human interaction and automatically evaluate the digital assistant's performance. Experimental results highlight the effectiveness of our proposed approach. Our system is deployed in the IBM Watson Orchestrate product for general availability.
Jayachandu Bandlamudi, Kushal Mukherjee, Prerna Agarwal, Ritwik Chaudhuri, Rakesh Pimplikar, Sampath Dechu, Alex Straley, Anbumunee Ponniah, Renuka Sindhgatta. "Framework to enable and test conversational assistant for APIs and RPAs." AI Magazine 45(4): 443–456, 2024. Published 2024-10-18. https://doi.org/10.1002/aaai.12198 (open access).
Graph-structured data, ranging from social networks to financial transaction networks, from citation networks to gene regulatory networks, have been widely used for modeling a myriad of real-world systems. As a prevailing model architecture to model graph-structured data, graph neural networks (GNNs) have drawn much attention in both academic and industrial communities in the past decades. Despite their success in different graph learning tasks, existing methods usually rely on learning from “big” data, requiring a large amount of labeled data for model training. However, it is common that real-world graphs are associated with “small” labeled data as data annotation and labeling on graphs is always time and resource-consuming. Therefore, it is imperative to investigate graph machine learning (graph ML) with low-cost human supervision for low-resource settings where limited or even no labeled data is available. This paper investigates a new research field—data-efficient graph learning, which aims to push forward the performance boundary of graph ML models with different kinds of low-cost supervision signals. Specifically, we outline the fundamental research problems, review the current progress, and discuss the future prospects of data-efficient graph learning, aiming to illuminate the path for subsequent research in this field.
Kaize Ding, Yixin Liu, Chuxu Zhang, Jianling Wang. "Data-efficient graph learning: Problems, progress, and prospects." AI Magazine 45(4): 549–560, 2024. Published 2024-10-18. https://doi.org/10.1002/aaai.12200 (open access).
Anqi Lu, Zifeng Wu, Zheng Jiang, Wei Wang, Eerdun Hasi, Yi Wang
Visual interpretation is extremely important in human geography as the primary technique by which geographers use photograph data to identify, classify, and quantify geographic and topological objects or regions. However, it is also time-consuming and requires overwhelming manual effort from professional geographers. This paper describes our interdisciplinary team's efforts to integrate computer vision models with geographers' visual image interpretation process to reduce their workload in interpreting images. Focusing on the dune segmentation task, we proposed an approach called