
Latest ArXiv Publications

System-level Impact of Non-Ideal Program-Time of Charge Trap Flash (CTF) on Deep Neural Network
Pub Date : 2024-02-15 DOI: 10.48550/arXiv.2402.09792
S. Shrivastava, A. Biswas, S. Chakrabarty, G. Dash, V. Saraswat, U. Ganguly
Learning of deep neural networks (DNNs) using the Resistive Processing Unit (RPU) architecture is energy-efficient, as it utilizes dedicated neuromorphic hardware and stochastic computation of weight updates for in-memory computing. Charge Trap Flash (CTF) devices can implement RPU-based weight updates in DNNs. However, prior work has shown that the weight updates (V_T) in CTF-based RPUs are impacted by the non-ideal program time of CTF, which is governed by two factors: first, the number of input pulses (N) or pulse width (pw), and second, the gap between successive update pulses (t_gap) used for the stochastic computation of weight updates. The impact of this non-ideal program time must therefore be studied in neural network training simulations. In this study, we first propose a pulse-train design compensation technique to reduce the total error caused by the non-ideal program time of CTF and the stochastic variance of a network. Second, we simulate RPU-based DNNs with the non-ideal program time of CTF on the MNIST and Fashion-MNIST datasets. We find that for larger N (~1000), learning performance approaches the ideal (software-level) training level and is therefore not much impacted by the choice of t_gap used to implement RPU-based weight updates. For lower N (<500), however, learning performance depends on the t_gap of the pulses. Finally, we perform an ablation study to isolate the causal factor behind the improved learning performance and conclude that the lower noise level in the weight updates is most likely the dominant factor. Our study thus compensates for the error caused by non-ideal program time and standardizes the pulse count (N) and pulse gap (t_gap) specifications for CTF-based RPUs for accurate system-level on-chip training.
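The stochastic weight update the abstract refers to is the standard RPU scheme in which the product of an activation x and an error d is approximated by counting coincidences between two random pulse trains of length N. A minimal sketch of that scheme follows; the `gamma` term standing in for the CTF program-time non-ideality, and all names and parameters, are illustrative assumptions, not the authors' simulator.

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_update(x, d, N=1000, eta=1.0, gamma=0.0):
    """Toy RPU-style stochastic weight update.

    x, d  : activation and error signal, scaled to [0, 1]
    N     : number of pulses in each update train
    gamma : assumed per-coincidence deviation standing in for the
            non-ideal program time of the CTF device (0 = ideal)
    The accumulated change approximates eta * x * d as N grows.
    """
    px = rng.random(N) < x            # Bernoulli pulse train encoding x
    pd = rng.random(N) < d            # Bernoulli pulse train encoding d
    hits = np.logical_and(px, pd)     # device programmed only on coincidences
    return hits.sum() * (eta / N) * (1.0 + gamma)

# Larger N -> lower stochastic variance of the update:
for N in (100, 1000):
    est = np.mean([stochastic_update(0.6, 0.5, N=N) for _ in range(200)])
    print(N, est, "target:", 0.6 * 0.5)
```

At N ~ 1000 the variance of the coincidence count is already small, consistent with the abstract's finding that the choice of t_gap matters little at large N.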
Citations: 0
TOAD: Task-Oriented Automatic Dialogs with Diverse Response Styles
Pub Date : 2024-02-15 DOI: 10.48550/arXiv.2402.10137
Yinhong Liu, Yimai Fang, David Vandyke, Nigel Collier
In light of recent advances in large language models (LLMs), the expectations for the next generation of virtual assistants include enhanced naturalness and adaptability across diverse usage scenarios. However, creating high-quality annotated data for Task-Oriented Dialog (TOD) is recognized to be slow and costly. To address these challenges, we introduce Task-Oriented Automatic Dialogs (TOAD), a novel and scalable TOD dataset along with its automatic generation pipeline. The TOAD dataset simulates realistic app-context interaction and provides a variety of system response style options. Two aspects of system response style are considered: verbosity level and mirroring of the user's expression. We benchmark TOAD on two response generation tasks, and the results show that modelling responses that are more verbose or that do not mirror the user's expression is more challenging.
Citations: 0
User Privacy Harms and Risks in Conversational AI: A Proposed Framework
Pub Date : 2024-02-15 DOI: 10.48550/arXiv.2402.09716
Ece Gumusel, Kyrie Zhixuan Zhou, M. Sanfilippo
This study presents a unique framework that applies and extends Solove's (2006) taxonomy to address privacy concerns in interactions with text-based AI chatbots. As chatbot prevalence grows, concerns about user privacy have heightened. While existing literature highlights design elements that compromise privacy, a comprehensive framework is lacking. Through semi-structured interviews with 13 participants interacting with two AI chatbots, this study identifies 9 privacy harms and 9 privacy risks in text-based interactions. Using a grounded theory approach for interview and chatlog analysis, the framework examines privacy implications at various interaction stages. The aim is to offer developers, policymakers, and researchers a tool for responsible and secure implementation of conversational AI, filling the existing gap in addressing privacy issues associated with text-based AI chatbots.
Citations: 0
A Systematic Evaluation of Evolving Highly Nonlinear Boolean Functions in Odd Sizes
Pub Date : 2024-02-15 DOI: 10.48550/arXiv.2402.09937
C. Carlet, Marko Ðurasevic, D. Jakobović, S. Picek, L. Mariot
Boolean functions are mathematical objects used in diverse applications. Different applications also have different requirements, making research on Boolean functions very active. Over the last 30 years, evolutionary algorithms have been shown to be a strong option for evolving Boolean functions of different sizes and with different properties. Still, most of those works consider similar settings and provide results that are mostly interesting from the evolutionary algorithm's perspective. This work considers the problem of evolving highly nonlinear Boolean functions in odd sizes. While the problem formulation sounds simple, the problem is remarkably difficult, and the related work is extremely scarce. We consider three solution encodings and four Boolean function sizes and run a detailed experimental analysis. Our results show that the problem is challenging and that finding optimal solutions is impossible except for the smallest tested size. However, once we added local search to the evolutionary algorithm, we managed to find a Boolean function in nine inputs with nonlinearity 241, which, to our knowledge, had never before been accomplished with evolutionary algorithms.
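Nonlinearity, the fitness measure in this work, is conventionally computed from the Walsh-Hadamard spectrum: nl(f) = 2^(n-1) - max_a |W_f(a)| / 2; for n = 9 the reported value 241 exceeds the bent-concatenation bound of 2^(n-1) - 2^((n-1)/2) = 240. A small self-contained routine for that standard computation (not the authors' implementation) is sketched below.

```python
import numpy as np
from itertools import product

def nonlinearity(truth_table):
    """Nonlinearity of an n-variable Boolean function given as a 0/1
    truth table of length 2**n, via the fast Walsh-Hadamard transform:
    nl = 2**(n-1) - max|W_f| / 2."""
    tt = np.asarray(truth_table)
    n = int(np.log2(tt.size))
    w = (1 - 2 * tt).astype(np.int64)   # {0,1} -> {+1,-1}
    h = 1
    while h < w.size:                    # in-place FWHT
        for i in range(0, w.size, 2 * h):
            a = w[i:i + h].copy()
            b = w[i + h:i + 2 * h].copy()
            w[i:i + h] = a + b
            w[i + h:i + 2 * h] = a - b
        h *= 2
    return 2 ** (n - 1) - np.abs(w).max() // 2

# Example: the 3-input majority function has nonlinearity 2.
maj = [(a + b + c >= 2) * 1 for a, b, c in product([0, 1], repeat=3)]
print(nonlinearity(maj))
```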
Citations: 0
Exploiting Alpha Transparency In Language And Vision-Based AI Systems
Pub Date : 2024-02-15 DOI: 10.48550/arXiv.2402.09671
David A. Noever, Forrest McKee
This investigation reveals a novel exploit derived from PNG image file formats, specifically their alpha transparency layer, and its potential to fool multiple AI vision systems. Our method uses this alpha layer as a clandestine channel invisible to human observers but fully actionable by AI image processors. The scope tested for the vulnerability spans representative vision systems from Apple, Microsoft, Google, Salesforce, Nvidia, and Facebook, highlighting the attack's potential breadth. This vulnerability challenges the security protocols of existing and fielded vision systems, from medical imaging to autonomous driving technologies. Our experiments demonstrate that the affected systems, which rely on convolutional neural networks or the latest multimodal language models, cannot quickly mitigate these vulnerabilities through simple patches or updates. Instead, they require retraining and architectural changes, indicating a persistent hole in multimodal technologies without some future adversarial hardening against such vision-language exploits.
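The general mechanism is easy to reproduce: content placed in the RGB channels can be masked from human view by zero alpha, yet survives any preprocessing that drops or ignores the alpha channel. The sketch below (using Pillow and NumPy) is a minimal illustration of that channel, not one of the paper's actual payloads; the file name and pattern are assumptions.

```python
import numpy as np
from PIL import Image

# 64x64 RGBA image: white everywhere, plus a payload square whose RGB
# is black but whose alpha is 0 (fully transparent).
rgb = np.full((64, 64, 3), 255, dtype=np.uint8)
alpha = np.full((64, 64, 1), 255, dtype=np.uint8)
rgb[16:48, 16:48] = 0       # payload written into the color channels...
alpha[16:48, 16:48] = 0     # ...and hidden behind full transparency

Image.fromarray(np.dstack([rgb, alpha]), mode="RGBA").save("payload.png")

# A human viewing the image composited on a white page sees plain white:
composited = (rgb * (alpha / 255.0) + 255 * (1 - alpha / 255.0)).astype(np.uint8)
assert composited.min() == 255

# A vision pipeline that simply discards alpha sees the hidden square:
no_alpha = np.asarray(Image.open("payload.png").convert("RGB"))
# no_alpha[16:48, 16:48] is black even though a human saw nothing.
```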
Citations: 0
User Modeling and User Profiling: A Comprehensive Survey
Pub Date : 2024-02-15 DOI: 10.48550/arXiv.2402.09660
Erasmo Purificato, Ludovico Boratto, Ernesto William De Luca
The integration of artificial intelligence (AI) into daily life, particularly through information retrieval and recommender systems, has necessitated advanced user modeling and profiling techniques to deliver personalized experiences. These techniques aim to construct accurate user representations based on the rich amounts of data generated through interactions with these systems. This paper presents a comprehensive survey of the current state, evolution, and future directions of user modeling and profiling research. We provide a historical overview, tracing the development from early stereotype models to the latest deep learning techniques, and propose a novel taxonomy that encompasses all active topics in this research area, including recent trends. Our survey highlights the paradigm shifts towards more sophisticated user profiling methods, emphasizing implicit data collection, multi-behavior modeling, and the integration of graph data structures. We also address the critical need for privacy-preserving techniques and the push towards explainability and fairness in user modeling approaches. By examining the definitions of core terminology, we aim to clarify ambiguities and foster a clearer understanding of the field by proposing two novel encyclopedic definitions of the main terms. Furthermore, we explore the application of user modeling in various domains, such as fake news detection, cybersecurity, and personalized education. This survey serves as a comprehensive resource for researchers and practitioners, offering insights into the evolution of user modeling and profiling and guiding the development of more personalized, ethical, and effective AI systems.
Citations: 0
Is Continual Learning Ready for Real-world Challenges?
Pub Date : 2024-02-15 DOI: 10.48550/arXiv.2402.10130
Theodora Kontogianni, Yuanwen Yue, Siyu Tang, Konrad Schindler
Despite continual learning's long and well-established academic history, its application in real-world scenarios remains rather limited. This paper contends that this gap is attributable to a misalignment between the actual challenges of continual learning and the evaluation protocols in use, rendering proposed solutions ineffective for addressing the complexities of real-world setups. We validate our hypothesis and assess progress to date using a new 3D semantic segmentation benchmark, OCL-3DSS. We investigate various continual learning schemes from the literature by utilizing more realistic protocols that necessitate online and continual learning for dynamic, real-world scenarios (e.g., in robotics and 3D vision applications). The outcomes are sobering: all considered methods perform poorly, significantly deviating from the upper bound of joint offline training. This raises questions about the applicability of existing methods in realistic settings. Our paper aims to initiate a paradigm shift, advocating for the adoption of continual learning methods through new experimental protocols that better emulate real-world conditions to facilitate breakthroughs in the field.
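The kind of protocol the authors argue for can be sketched as a test-then-train loop over a single-pass, non-stationary stream, in contrast to multi-epoch offline training. The drifting synthetic stream and the scikit-learn classifier below are illustrative assumptions, not the OCL-3DSS benchmark.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)

# Synthetic non-stationary stream: class means drift over time, standing
# in for the dynamic scenes of a robotics / 3D-vision feed.
def stream(n=2000, d=16):
    for t in range(n):
        y = rng.integers(0, 2)
        drift = t / n                            # slow distribution shift
        x = rng.normal(loc=y + drift, scale=1.0, size=d)
        yield x.reshape(1, -1), np.array([y])

model = SGDClassifier(loss="log_loss")
seen, correct = 0, 0
for x, y in stream():
    if seen > 0:                                 # test-then-train protocol:
        correct += int(model.predict(x)[0] == y[0])   # evaluate first...
    model.partial_fit(x, y, classes=[0, 1])           # ...then one update
    seen += 1
print("online accuracy:", correct / (seen - 1))
```

Each sample is seen exactly once and evaluated before it is learned from, so the score reflects genuinely online adaptation rather than offline joint training.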
Citations: 0
Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent
Pub Date : 2024-02-15 DOI: 10.48550/arXiv.2402.09844
Quentin Gallouédec, Edward Beeching, Clément Romac, Emmanuel Dellandréa
The search for a general model that can operate seamlessly across multiple domains remains a key goal in machine learning research. The prevailing methodology in Reinforcement Learning (RL) typically limits models to a single task within a unimodal framework, a limitation that contrasts with the broader vision of a versatile, multi-domain model. In this paper, we present Jack of All Trades (JAT), a transformer-based model with a unique design optimized for handling sequential decision-making tasks and multimodal data types. The JAT model demonstrates its robust capabilities and versatility by achieving strong performance on very different RL benchmarks, along with promising results on Computer Vision (CV) and Natural Language Processing (NLP) tasks, all using a single set of weights. The JAT model marks a significant step towards more general, cross-domain AI model design, and notably, it is the first model of its kind to be fully open-sourced (see https://huggingface.co/jat-project/jat), including a pioneering general-purpose dataset.
Citations: 0
Towards Reducing Diagnostic Errors with Interpretable Risk Prediction
Pub Date : 2024-02-15 DOI: 10.48550/arXiv.2402.10109
Denis Jered McInerney, William Dickinson, Lucy Flynn, Andrea Young, Geoffrey Young, J.-W. van de Meent, Byron C. Wallace
Many diagnostic errors occur because clinicians cannot easily access relevant information in patient Electronic Health Records (EHRs). In this work we propose a method that uses LLMs to identify pieces of evidence in patient EHR data that indicate increased or decreased risk of specific diagnoses; our ultimate aim is to increase access to evidence and reduce diagnostic errors. In particular, we propose a Neural Additive Model to make predictions backed by evidence with individualized risk estimates at time-points where clinicians are still uncertain, aiming specifically to mitigate delays in diagnosis and errors stemming from an incomplete differential. To train such a model, it is necessary to infer temporally fine-grained retrospective labels of eventual "true" diagnoses. We do so with LLMs, to ensure that the input text is from before a confident diagnosis can be made. We use an LLM to retrieve an initial pool of evidence, but then refine this set of evidence according to correlations learned by the model. We conduct an in-depth evaluation of the usefulness of our approach by simulating how it might be used by a clinician to decide between a pre-defined list of differential diagnoses.
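The interpretability claim rests on the additive structure of a Neural Additive Model: the logit is a sum of per-feature networks, so each piece of evidence contributes a separately inspectable risk term. A generic PyTorch sketch of that structure is below; the layer sizes and names are assumptions, and the authors' actual architecture and inputs (time-points, LLM-retrieved evidence) differ.

```python
import torch
import torch.nn as nn

class NeuralAdditiveModel(nn.Module):
    """Minimal NAM: one small MLP per input feature, whose scalar outputs
    are summed into a logit. Each per-feature network g_i(x_i) can be
    inspected on its own, which is what makes the contribution of each
    piece of evidence interpretable."""
    def __init__(self, n_features, hidden=32):
        super().__init__()
        self.feature_nets = nn.ModuleList(
            nn.Sequential(nn.Linear(1, hidden), nn.ReLU(), nn.Linear(hidden, 1))
            for _ in range(n_features)
        )
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, x):                      # x: (batch, n_features)
        contribs = torch.cat(
            [net(x[:, i:i + 1]) for i, net in enumerate(self.feature_nets)],
            dim=1)                             # per-evidence risk terms
        logit = contribs.sum(dim=1) + self.bias
        return logit, contribs

model = NeuralAdditiveModel(n_features=5)
logit, contribs = model(torch.randn(4, 5))
risk = torch.sigmoid(logit)                    # individualized risk estimate
```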
Citations: 0
Classification Diffusion Models
Pub Date : 2024-02-15 DOI: 10.48550/arXiv.2402.10095
Shahar Yadin, Noam Elata, T. Michaeli
A prominent family of methods for learning data distributions relies on density ratio estimation (DRE), where a model is trained to $\textit{classify}$ between data samples and samples from some reference distribution. These techniques are successful in simple low-dimensional settings but fail to achieve good results on complex high-dimensional data, like images. A different family of methods for learning distributions is that of denoising diffusion models (DDMs), in which a model is trained to $\textit{denoise}$ data samples. These approaches achieve state-of-the-art results in image, video, and audio generation. In this work, we present $\textit{Classification Diffusion Models}$ (CDMs), a generative technique that adopts the denoising-based formalism of DDMs while making use of a classifier that predicts the amount of noise added to a clean signal, similarly to DRE methods. Our approach is based on the observation that an MSE-optimal denoiser for white Gaussian noise can be expressed in terms of the gradient of a cross-entropy-optimal classifier for predicting the noise level. As we illustrate, CDM achieves better denoising results compared to DDM and leads to at least comparable FID in image generation. CDM is also capable of highly efficient one-step exact likelihood estimation, achieving state-of-the-art results among methods that use a single step. Code is available on the project's webpage at https://shaharYadin.github.io/CDM/.
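The observation the abstract builds on connects to the Tweedie/Miyasawa identity for Gaussian noise, which ties the MSE-optimal denoiser to the score of the noisy marginal; CDM's step is to obtain that score from a cross-entropy-optimal noise-level classifier. In loose notation (the exact parameterization is the paper's, not reproduced here):

$$ y = x + \sigma\,\varepsilon,\quad \varepsilon \sim \mathcal{N}(0, I) \;\;\Longrightarrow\;\; \hat{x}_{\mathrm{MSE}}(y) = \mathbb{E}[x \mid y] = y + \sigma^{2}\,\nabla_{y} \log p_{\sigma}(y), $$

so any model that exposes $\nabla_{y} \log p_{\sigma}(y)$ — here, through the gradient of a classifier trained to predict the noise level added to $y$ — acts as an MSE-optimal denoiser.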
Citations: 0