Proceedings of the AAAI Symposium Series: Latest Articles


Human-like Learning in Temporally Structured Environments

Proceedings of the AAAI Symposium Series

Pub Date : 2024-05-20 DOI: 10.1609/aaaiss.v3i1.31273

Matt Jones, Tyler R. Scott, Michael C. Mozer

Natural environments have correlations at a wide range of timescales. Human cognition is tuned to this temporal structure, as seen by power laws of learning and memory, and by spacing effects whereby the intervals between repeated training data affect how long knowledge is retained. Machine learning is instead dominated by batch iid training or else relatively simple nonstationarity assumptions such as random walks or discrete task sequences.

The main contributions of our work are:
(1) We develop a Bayesian model formalizing the brain's inductive bias for temporal structure and show our model accounts for key features of human learning and memory.
(2) We translate the model into a new gradient-based optimization technique for neural networks that endows them with human-like temporal inductive bias and improves their performance in realistic nonstationary tasks.

Our technical approach is founded on Bayesian inference over 1/f noise, a statistical signature of many natural environments with long-range, power-law correlations. We derive a new closed-form solution to this problem by treating the state of the environment as a sum of processes on different timescales and applying an extended Kalman filter to learn all timescales jointly.

We then derive a variational approximation of this model for training neural networks, which can be used as a drop-in replacement for standard optimizers in arbitrary architectures. Our optimizer decomposes each weight in the network as a sum of subweights with different learning and decay rates and tracks their joint uncertainty. Thus knowledge becomes distributed across timescales, enabling rapid adaptation to task changes while retaining long-term knowledge and avoiding catastrophic interference. Simulations show improved performance in environments with realistic multiscale nonstationarity.

Finally, we present simulations showing our model gives essentially parameter-free fits of learning, forgetting, and spacing effects in human data. We then explore the analogue of human spacing effects in a deep net trained in a structured environment where tasks recur at different rates and compare the model's behavioral properties to those of people.
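
The idea of storing each weight as a sum of subweights with different learning and decay rates can be illustrated with a toy sketch. This is not the authors' optimizer: the class name `MultiTimescaleWeight`, the choice of learning rate `base_lr / tau`, and the per-step decay factor `exp(-1 / tau)` are assumptions made for illustration, and the joint uncertainty tracking described in the abstract is left out entirely.

```python
import math


class MultiTimescaleWeight:
    """One scalar weight stored as a sum of subweights, one per timescale."""

    def __init__(self, timescales, base_lr=0.3):
        self.timescales = list(timescales)
        self.sub = [0.0] * len(self.timescales)
        # Assumption: a subweight with timescale tau learns at rate
        # base_lr / tau and shrinks by a factor exp(-1 / tau) on every step.
        self.lr = [base_lr / tau for tau in self.timescales]
        self.keep = [math.exp(-1.0 / tau) for tau in self.timescales]

    @property
    def value(self):
        """Effective weight seen by the model: the sum of all subweights."""
        return sum(self.sub)

    def step(self, grad):
        """Decay every subweight, then apply the shared gradient at its own rate."""
        self.sub = [
            keep * s - lr * grad
            for s, lr, keep in zip(self.sub, self.lr, self.keep)
        ]


def retained_fractions(timescales, idle_steps):
    """Learn from one gradient, idle, and report what each subweight kept."""
    w = MultiTimescaleWeight(timescales)
    w.step(grad=-1.0)            # one learning event pushes every subweight up
    learned = list(w.sub)
    for _ in range(idle_steps):  # no further signal: only decay acts
        w.step(grad=0.0)
    return [s / l for s, l in zip(w.sub, learned)]


# Fast subweights forget quickly; slow subweights retain most of what they learned.
print(retained_fractions([1.0, 10.0, 100.0], idle_steps=10))
```

In this sketch the fast subweight adapts most to a new gradient but has almost vanished after ten idle steps, while the slow subweight changes little and keeps most of its value, which is the qualitative behavior the abstract attributes to knowledge distributed across timescales.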



Citations: 0

Inclusion Ethics in AI: Use Cases in African Fashion

Proceedings of the AAAI Symposium Series

Pub Date : 2024-05-20 DOI: 10.1609/aaaiss.v3i1.31266

Christelle Scharff, James Brusseau, K. Bathula, Kaleemunnisa Fnu, Samyak Rakesh Meshram, Om Gaikhe

This paper addresses the ethics of inclusion in artificial intelligence in the context of African fashion. Despite the proliferation of fashion-related AI applications and datasets, global diversity remains limited, and African fashion is significantly underrepresented. This paper documents two use cases that enhance AI's inclusivity by incorporating sub-Saharan fashion elements. The first case details the creation of a Senegalese fashion dataset and a model for classifying traditional apparel using transfer learning. The second case investigates African wax textile patterns generated through generative adversarial networks (GANs), specifically StyleGAN architectures, and machine learning diffusion models. Alongside the practical, technological advances, theoretical ethical progress is made in two directions. First, the cases are used to elaborate and define the ethics of inclusion, while also contributing to current debates about how inclusion differs from ethical fairness. Second, the cases engage with the ethical debate on whether AI innovation should be slowed to prevent ethical imbalances or accelerated to solve them.



Citations: 0

How Can GenAI Foster Well-being in Self-regulated Learning?

Proceedings of the AAAI Symposium Series

Pub Date : 2024-05-20 DOI: 10.1609/aaaiss.v3i1.31234

Stefanie Hauske, Oliver Bendel

This paper explores how generative AI (GenAI) can improve the well-being of learners within self-regulated learning (SRL) frameworks in the corporate context. In the “GenAI to Support SRL” section, it presents three custom versions of ChatGPT aimed at assisting learners. These so-called GPTs demonstrate GenAI’s potential to actively support learners in SRL and positively influence their well-being. The “Discussion” and “Summary and Outlook” sections provide a balanced overview of the opportunities and risks associated with GenAI in the field of learning and highlight directions for future research. The results indicate that GenAI could improve the well-being of learners in SRL by providing personalized guidance, reducing feelings of stress, and increasing motivation and self-efficacy. At the same time, companies and employees face several challenges that need to be overcome.


Citations: 0

Toward Human-Like Representation Learning for Cognitive Architectures

Proceedings of the AAAI Symposium Series

Pub Date : 2024-05-20 DOI: 10.1609/aaaiss.v3i1.31274

Steven Jones, Peter Lindes

Human-like learning includes an ability to learn concepts from a stream of embodiment sensor data. Echoing previous thoughts, such as Barsalou's, that cognition and perception share a common representation system, we suggest an addendum to the common model of cognition. This addendum proposes simultaneous semantic-memory and perception learning that bypasses working memory and uses parallel processing to learn concepts apart from deliberate reasoning. The goal is to provide a general outline for how to extend a class of cognitive architectures to implement a more human-like interface between an agent's cognition and its embodiment, where a critical aspect of that interface is that it is dynamic because of learning.


Citations: 0

A Dataset for Estimating Participant Inspiration in Meetings toward AI-Based Meeting Support System to Improve Worker Wellbeing

Proceedings of the AAAI Symposium Series

Pub Date : 2024-05-20 DOI: 10.1609/aaaiss.v3i1.31231

Soki Arai, Yuki Yamamoto, Yuji Nozaki, Haruka Matsukura, Maki Sakamoto

Many meetings take place in intellectual production activities, and workers spend much time generating ideas. In creative meetings, it is sometimes difficult for moderators and facilitators to run the meeting efficiently, because participants are required to come up with new ideas one after another and some hesitate to express unconventional ones. We therefore propose to develop an AI-based meeting support system that estimates participants’ inspiration and helps create a comfortable meeting environment to improve worker wellbeing. Participants’ inspiration is assumed to be estimable from their speech and micro-behaviors, including smiles and nods. This paper reports a dataset we collected for developing the proposed system. The dataset consists of participants’ brain blood flow measured with near-infrared spectrometers, micro-behaviors annotated from video recordings, and moments of inspiration that participants reported by pressing buttons. A total of 1020 min of data was collected by conducting simulated meetings. In future work, we plan to train an LSTM (long short-term memory) based neural network model to realize the proposed system.



Citations: 0

Resource-aware Federated Data Analytics in Edge-Enabled IoT Systems

Proceedings of the AAAI Symposium Series

Pub Date : 2024-05-20 DOI: 10.1609/aaaiss.v3i1.31219

Hana Khamfroush

In a resource-constrained environment such as an Internet-of-Things (IoT) system, it is critical to make optimal decisions about how many resources to allocate to pre-processing, how many to allocate to model training, and which specific combination of pre-processing and learning to select. This talk first gives an overview of some initial steps we have taken toward developing federated data pre-processing in IoT environments, and then provides a visionary overview of potential research problems in developing an integrated resource-aware and Quality-of-Service (QoS)-aware data pre-processing and model training system.


Citations: 0

Semantic Verification in Large Language Model-based Retrieval Augmented Generation

Proceedings of the AAAI Symposium Series

Pub Date : 2024-05-20 DOI: 10.1609/aaaiss.v3i1.31199

Andreas Martin, Hans Friedrich Witschel, Maximilian Mandl, Mona Stockhecke

This position paper presents a novel approach to semantic verification in Large Language Model-based Retrieval Augmented Generation (LLM-RAG) systems, focusing on the critical need for factually accurate information dissemination during public debates, especially prior to plebiscites in direct democracies such as Switzerland. Recognizing the unique challenges posed by the current generation of Large Language Models (LLMs) in maintaining factual integrity, this research proposes an innovative solution that integrates retrieval mechanisms with enhanced semantic verification processes. The paper outlines a comprehensive methodology following a Design Science Research approach, which includes defining user personas, designing conversational interfaces, and iteratively developing a hybrid dialogue system. Central to this system is a robust semantic verification framework that leverages a knowledge graph for fact-checking and validation, ensuring the correctness and consistency of information generated by LLMs. The paper discusses the significance of this research in the context of Swiss direct democracy, where informed decision-making is pivotal. By improving the accuracy and reliability of information provided to the public, the proposed system aims to support the democratic process, enabling citizens to make well-informed decisions on complex issues. The research contributes to advancing the field of natural language processing and information retrieval, demonstrating the potential of AI and LLMs in enhancing civic engagement and democratic participation.
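
Checking generated statements against a knowledge graph can be sketched in a few lines. The sketch below is only an illustration of the general idea and not the system described in the paper: it assumes claims have already been extracted from the LLM output as subject-predicate-object triples, assumes every predicate is single-valued, and uses hypothetical placeholder facts and names (`Triple`, `verify_claim`, `GRAPH`).

```python
from typing import NamedTuple, Set


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def verify_claim(claim: Triple, graph: Set[Triple]) -> str:
    """Classify a claim as supported, contradicted, or unverified."""
    if claim in graph:
        return "supported"
    # Assumption: predicates are single-valued, so a stored fact with the
    # same subject and predicate but a different object is a conflict.
    for fact in graph:
        if (fact.subject, fact.predicate) == (claim.subject, claim.predicate):
            return "contradicted"
    return "unverified"


# Hypothetical placeholder facts; a real graph would hold curated knowledge.
GRAPH = {
    Triple("initiative_x", "vote_date", "date_a"),
    Triple("initiative_x", "submitted_by", "committee_y"),
}

claims = [
    Triple("initiative_x", "vote_date", "date_a"),
    Triple("initiative_x", "vote_date", "date_b"),
    Triple("initiative_x", "estimated_cost", "amount_c"),
]

for claim in claims:
    print(claim, "->", verify_claim(claim, GRAPH))
```

A production system would also need entity linking, multi-valued predicates, and a policy for what to do with unverified claims; those parts are outside this sketch.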



Citations: 0

The Arithmetic of Machine Decision: How to Find the Symmetries of Complete Chaos

Proceedings of the AAAI Symposium Series

Pub Date : 2024-05-20 DOI: 10.1609/aaaiss.v3i1.31171

O. Bartheye, L. Chaudron

The present work is deliberately placed in a context capable of defining the requirements expressed by machine decision-making calculations. The informational nature of a decision requires abandoning any structure-preserving invariant and instead switching into total chaos, a necessary and sufficient condition for exploiting the symmetries that allow the calculation to converge. Decision arithmetic is the best way to define the nature of these symmetries precisely.


Citations: 0

Centering Humans in Artificial Intelligence

Proceedings of the AAAI Symposium Series

Pub Date : 2024-05-20 DOI: 10.1609/aaaiss.v3i1.31170

Cecilia O. Alm

AI systems are breaking into new domains and applications, and it is pivotal to center humans in contemporary AI systems and contemplate what this means. To illustrate this notion, this discussion considers three perspectives on human roles in AI: users, contributors, and researchers-in-training.


Citations: 0

Multi-Modal Instruction-Tuning Small-Scale Language-and-Vision Assistant for Semiconductor Electron Micrograph Analysis

Proceedings of the AAAI Symposium Series

Pub Date : 2024-05-20 DOI: 10.1609/aaaiss.v3i1.31205

Sagar Srinivas Sakhinana, Geethan Sannidhi, Venkataramana Runkana

We present a novel framework for analyzing and interpreting electron microscopy images in semiconductor manufacturing using vision-language instruction tuning. The framework employs a teacher-student approach: pretrained multimodal large language models such as GPT-4 generate instruction-following data for zero-shot visual question answering (VQA) and classification tasks, and this data is used to customize smaller multimodal models (SMMs) for microscopy image analysis, resulting in an instruction-tuned language-and-vision assistant. Our framework merges knowledge engineering with machine learning to transfer domain-specific expertise from larger to smaller multimodal models within this specialized field, greatly reducing the need for extensive human labeling. Our study presents a secure, cost-effective, and customizable approach for analyzing microscopy images, addressing the challenges of adopting proprietary models in semiconductor manufacturing.


Citations: 0
