arXiv - CS - Artificial Intelligence: Latest Papers

Autonomous Vehicle Controllers From End-to-End Differentiable Simulation
Pub Date : 2024-09-12 DOI: arxiv-2409.07965
Asen Nachkov, Danda Pani Paudel, Luc Van Gool
Current methods to learn controllers for autonomous vehicles (AVs) focus on behavioural cloning. Being trained only on exact historic data, the resulting agents often generalize poorly to novel scenarios. Simulators provide the opportunity to go beyond offline datasets, but they are still treated as complicated black boxes, only used to update the global simulation state. As a result, these RL algorithms are slow, sample-inefficient, and prior-agnostic. In this work, we leverage a differentiable simulator and design an analytic policy gradients (APG) approach to training AV controllers on the large-scale Waymo Open Motion Dataset. Our proposed framework brings the differentiable simulator into an end-to-end training loop, where gradients of the environment dynamics serve as a useful prior to help the agent learn a more grounded policy. We combine this setup with a recurrent architecture that can efficiently propagate temporal information across long simulated trajectories. This APG method allows us to learn robust, accurate, and fast policies, while only requiring widely-available expert trajectories, instead of scarce expert actions. We compare to behavioural cloning and find significant improvements in performance and robustness to noise in the dynamics, as well as overall more intuitive human-like handling.
Citations: 0
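The idea of training a controller by differentiating through the simulator, using only expert states and no expert actions, can be illustrated with a toy sketch. Everything here is an assumption made for illustration and not the authors' setup: the one-dimensional simulator step `x + dt * action`, the single-gain policy `theta * (target - x)`, the squared-error loss, and the names `rollout` and `train`. The gradient is carried through the dynamics by hand (forward-mode sensitivity) instead of using an autodiff library or a recurrent network.

```python
def rollout(theta, expert, dt=1.0):
    """Simulate one trajectory; return (loss, d loss / d theta).

    Toy assumptions: state is a scalar position, the policy is a single
    proportional gain `theta`, and only expert *states* are available.
    """
    x = expert[0]   # start at the expert's initial state
    dx = 0.0        # sensitivity dx/dtheta, propagated through the dynamics
    loss, grad = 0.0, 0.0
    for target in expert[1:]:
        err = target - x
        action = theta * err          # policy output
        daction = err - theta * dx    # d(action)/d(theta), chain rule through x
        x = x + dt * action           # differentiable simulator step
        dx = dx + dt * daction        # same step applied to the sensitivity
        loss += (x - target) ** 2
        grad += 2.0 * (x - target) * dx
    return loss, grad


def train(expert, theta=0.2, lr=0.005, steps=300):
    """Plain gradient descent on the trajectory loss."""
    for _ in range(steps):
        _, grad = rollout(theta, expert)
        theta -= lr * grad
    return theta


expert_states = [0.1 * t for t in range(11)]  # hypothetical expert trajectory
theta_star = train(expert_states)
print(theta_star, rollout(theta_star, expert_states)[0])
```

With `dt = 1.0`, a gain of 1 would track this trajectory exactly, so the learned gain should move from 0.2 toward 1 while the loss shrinks.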
In-Situ Fine-Tuning of Wildlife Models in IoT-Enabled Camera Traps for Efficient Adaptation
Pub Date : 2024-09-12 DOI: arxiv-2409.07796
Mohammad Mehdi Rastikerdar, Jin Huang, Hui Guan, Deepak Ganesan
Wildlife monitoring via camera traps has become an essential tool in ecology, but the deployment of machine learning models for on-device animal classification faces significant challenges due to domain shifts and resource constraints. This paper introduces WildFit, a novel approach that reconciles the conflicting goals of achieving high domain generalization performance and ensuring efficient inference for camera trap applications. WildFit leverages continuous background-aware model fine-tuning to deploy ML models tailored to the current location and time window, allowing it to maintain robust classification accuracy in the new environment without requiring significant computational resources. This is achieved by background-aware data synthesis, which generates training images representing the new domain by blending background images with animal images from the source domain. We further enhance fine-tuning effectiveness through background drift detection and class distribution drift detection, which optimize the quality of synthesized data and improve generalization performance. Our extensive evaluation across multiple camera trap datasets demonstrates that WildFit achieves significant improvements in classification accuracy and computational efficiency compared to traditional approaches.
Citations: 0
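The abstract says training images for the new domain are produced by blending background images with animal images, but it does not give the blending operator. A minimal sketch, assuming plain per-pixel alpha compositing on grayscale images stored as nested lists, could look like this; the function name `blend` and the mask convention (1.0 means animal pixel, 0.0 means background pixel) are hypothetical.

```python
def blend(background, animal, mask):
    """Alpha-composite `animal` onto `background`, pixel by pixel.

    All three arguments are equally sized 2D lists of floats in [0, 1].
    mask == 1.0 keeps the animal pixel, mask == 0.0 keeps the background.
    """
    return [
        [m * a + (1.0 - m) * b for b, a, m in zip(b_row, a_row, m_row)]
        for b_row, a_row, m_row in zip(background, animal, mask)
    ]


# Hypothetical 2x2 example: a new-site background and a source-domain animal crop.
background = [[0.0, 0.0], [1.0, 1.0]]
animal = [[0.5, 0.5], [0.5, 0.5]]
mask = [[1.0, 0.0], [0.5, 0.0]]
print(blend(background, animal, mask))  # [[0.5, 0.0], [0.75, 1.0]]
```

A soft mask value such as 0.5 mixes the two sources, which is one simple way to avoid hard edges around the pasted animal.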
DSBench: How Far Are Data Science Agents to Becoming Data Science Experts?
Pub Date : 2024-09-12 DOI: arxiv-2409.07703
Liqiang Jing, Zhehui Huang, Xiaoyang Wang, Wenlin Yao, Wenhao Yu, Kaixin Ma, Hongming Zhang, Xinya Du, Dong Yu
Large Language Models (LLMs) and Large Vision-Language Models (LVLMs) have demonstrated impressive language/vision reasoning abilities, igniting the recent trend of building agents for targeted applications such as shopping assistants or AI software engineers. Recently, many data science benchmarks have been proposed to investigate their performance in the data science domain. However, existing data science benchmarks still fall short when compared to real-world data science applications due to their simplified settings. To bridge this gap, we introduce DSBench, a comprehensive benchmark designed to evaluate data science agents with realistic tasks. This benchmark includes 466 data analysis tasks and 74 data modeling tasks, sourced from Eloquence and Kaggle competitions. DSBench offers a realistic setting by encompassing long contexts, multimodal task backgrounds, reasoning with large data files and multi-table structures, and performing end-to-end data modeling tasks. Our evaluation of state-of-the-art LLMs, LVLMs, and agents shows that they struggle with most tasks, with the best agent solving only 34.12% of data analysis tasks and achieving a 34.74% Relative Performance Gap (RPG). These findings underscore the need for further advancements in developing more practical, intelligent, and autonomous data science agents.
Citations: 0
Lagrange Duality and Compound Multi-Attention Transformer for Semi-Supervised Medical Image Segmentation
Pub Date : 2024-09-12 DOI: arxiv-2409.07793
Fuchen Zheng, Quanjun Li, Weixuan Li, Xuhang Chen, Yihang Dong, Guoheng Huang, Chi-Man Pun, Shoujun Zhou
Medical image segmentation, a critical application of semantic segmentation in healthcare, has seen significant advancements through specialized computer vision techniques. While deep learning-based medical image segmentation is essential for assisting in medical diagnosis, the lack of diverse training data causes the long-tail problem. Moreover, most previous hybrid CNN-ViT architectures have limited ability to combine various attentions in different layers of the Convolutional Neural Network. To address these issues, we propose a Lagrange Duality Consistency (LDC) Loss, integrated with Boundary-Aware Contrastive Loss, as the overall training objective for semi-supervised learning to mitigate the long-tail problem. Additionally, we introduce CMAformer, a novel network that synergizes the strengths of ResUNet and Transformer. The cross-attention block in CMAformer effectively integrates spatial attention and channel attention for multi-scale feature fusion. Overall, our results indicate that CMAformer, combined with the feature fusion framework and the new consistency loss, demonstrates strong complementarity in semi-supervised learning ensembles. We achieve state-of-the-art results on multiple public medical image datasets. Example code is available at: https://github.com/lzeeorno/Lagrange-Duality-and-CMAformer
Citations: 0
Games for AI Control: Models of Safety Evaluations of AI Deployment Protocols
Pub Date : 2024-09-12 DOI: arxiv-2409.07985
Charlie Griffin, Louis Thomson, Buck Shlegeris, Alessandro Abate
To evaluate the safety and usefulness of deployment protocols for untrusted AIs, AI Control uses a red-teaming exercise played between a protocol designer and an adversary. This paper introduces AI-Control Games, a formal decision-making model of the red-teaming exercise as a multi-objective, partially observable, stochastic game. We also introduce methods for finding optimal protocols in AI-Control Games, by reducing them to a set of zero-sum partially observable stochastic games. We apply our formalism to model, evaluate and synthesise protocols for deploying untrusted language models as programming assistants, focusing on Trusted Monitoring protocols, which use weaker language models and limited human assistance. Finally, we demonstrate the utility of our formalism by showcasing improvements over empirical studies in existing settings, evaluating protocols in new settings, and analysing how modelling assumptions affect the safety and usefulness of protocols.
Citations: 0
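One common way to turn a two-objective game (safety and usefulness) into a family of zero-sum games is a weighted sum of the objectives, with one game per weight. The abstract does not say which scalarization the paper uses, so the sketch below is only an assumption. It is also far simpler than a partially observable stochastic game: the protocol designer picks a single pure protocol from a small table, and the adversary picks the strategy that minimizes the weighted value. The protocol names and all payoff numbers are hypothetical and purely illustrative.

```python
# Hypothetical payoff table (illustrative numbers, not from the paper):
# protocol name -> one (safety, usefulness) pair per red-team strategy.
PROTOCOLS = {
    "never_audit": [(0.2, 1.0), (0.1, 1.0)],
    "audit_some": [(0.7, 0.8), (0.6, 0.8)],
    "always_defer": [(1.0, 0.3), (1.0, 0.3)],
}


def worst_case_value(outcomes, weight):
    """Weighted value of a protocol against the red strategy that hurts it most."""
    return min(weight * s + (1.0 - weight) * u for s, u in outcomes)


def best_protocol(weight):
    """Pure-strategy maximin choice for the zero-sum game indexed by `weight`."""
    return max(PROTOCOLS, key=lambda name: worst_case_value(PROTOCOLS[name], weight))


# Sweeping the weight traces out different safety/usefulness trade-offs.
for w in (0.0, 0.5, 1.0):
    print(w, best_protocol(w))
```

With these made-up numbers, a weight of 0 (usefulness only) selects `never_audit`, a weight of 1 (safety only) selects `always_defer`, and an even weighting selects `audit_some`.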
A Spatiotemporal Stealthy Backdoor Attack against Cooperative Multi-Agent Deep Reinforcement Learning
Pub Date : 2024-09-12 DOI: arxiv-2409.07775
Yinbo Yu, Saihao Yan, Jiajia Liu
Recent studies have shown that cooperative multi-agent deep reinforcement learning (c-MADRL) is under the threat of backdoor attacks. Once a backdoor trigger is observed, the backdoored agent will perform abnormal actions leading to failures or malicious goals. However, existing proposed backdoors suffer from several issues, e.g., fixed visual trigger patterns lack stealthiness, the backdoor is trained or activated by an additional network, or all agents are backdoored. To this end, in this paper, we propose a novel backdoor attack against c-MADRL, which attacks the entire multi-agent team by embedding the backdoor only in a single agent. Firstly, we introduce adversary spatiotemporal behavior patterns as the backdoor trigger rather than manual-injected fixed visual patterns or instant status, and control the attack duration. This method can guarantee the stealthiness and practicality of injected backdoors. Secondly, we hack the original reward function of the backdoored agent via reward reverse and unilateral guidance during training to ensure its adverse influence on the entire team. We evaluate our backdoor attacks on two classic c-MADRL algorithms, VDN and QMIX, in a popular c-MADRL environment, SMAC. The experimental results demonstrate that our backdoor attacks are able to reach a high attack success rate (91.6%) while maintaining a low clean performance variance rate (3.7%).
Citations: 0
Bayesian Inverse Graphics for Few-Shot Concept Learning
Pub Date : 2024-09-12 DOI: arxiv-2409.08351
Octavio Arriaga, Jichen Guo, Rebecca Adam, Sebastian Houben, Frank Kirchner
Humans excel at building generalizations of new concepts from just one single example. Contrary to this, current computer vision models typically require large numbers of training samples to achieve a comparable accuracy. In this work we present a Bayesian model of perception that learns using only minimal data, a prototypical probabilistic program of an object. Specifically, we propose a generative inverse graphics model of primitive shapes, to infer posterior distributions over physically consistent parameters from one or several images. We show how this representation can be used for downstream tasks such as few-shot classification and pose estimation. Our model outperforms existing few-shot neural-only classification algorithms and demonstrates generalization across varying lighting conditions, backgrounds, and out-of-distribution shapes. By design, our model is uncertainty-aware and uses our new differentiable renderer for optimizing global scene parameters through gradient descent, sampling posterior distributions over object parameters with Markov Chain Monte Carlo (MCMC), and using a neural based likelihood function.
Citations: 0
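Sampling a posterior over an object parameter with MCMC can be sketched with a random-walk Metropolis sampler. This is a generic illustration and not the paper's model: the single parameter (a hypothetical shape radius), the Gaussian likelihood with a flat positive prior, and the observation values are all assumptions, standing in for the neural likelihood and differentiable renderer the abstract mentions.

```python
import math
import random


def log_posterior(radius, observations, sigma=0.2):
    """Unnormalized log posterior: flat prior on radius > 0, Gaussian likelihood."""
    if radius <= 0.0:
        return -math.inf
    return -sum((obs - radius) ** 2 for obs in observations) / (2.0 * sigma ** 2)


def metropolis(observations, start=1.0, step=0.1, n_samples=5000, seed=0):
    """Random-walk Metropolis sampler over a single scalar parameter."""
    rng = random.Random(seed)
    current = start
    current_lp = log_posterior(current, observations)
    samples = []
    for _ in range(n_samples):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal, observations)
        # Accept with probability min(1, exp(proposal_lp - current_lp)).
        # 1.0 - rng.random() lies in (0, 1], so the log is always defined.
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples


# Hypothetical noisy measurements of the radius.
observed = [2.1, 1.9, 2.0, 2.2, 1.8]
kept = metropolis(observed)[1000:]  # discard burn-in
print(sum(kept) / len(kept))
```

The retained samples approximate the posterior, so their spread gives an uncertainty estimate for the parameter and not just a point value.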
A Framework for Predicting the Impact of Game Balance Changes through Meta Discovery
Pub Date : 2024-09-11 DOI: arxiv-2409.07340
Akash Saravanan, Matthew Guzdial
A metagame is a collection of knowledge that goes beyond the rules of a game. In competitive, team-based games like Pokémon or League of Legends, it refers to the set of current dominant characters and/or strategies within the player base. Developer changes to the balance of the game can have drastic and unforeseen consequences on these sets of meta characters. A framework for predicting the impact of balance changes could aid developers in making more informed balance decisions. In this paper we present such a Meta Discovery framework, leveraging Reinforcement Learning for automated testing of balance changes. Our results demonstrate the ability to predict the outcome of balance changes in Pokémon Showdown, a collection of competitive Pokémon tiers, with high accuracy.
Citations: 0
Understanding Foundation Models: Are We Back in 1924?
Pub Date : 2024-09-11 DOI: arxiv-2409.07618
Alan F. Smeaton
This position paper explores the rapid development of Foundation Models (FMs) in AI and their implications for intelligence and reasoning. It examines the characteristics of FMs, including their training on vast datasets and use of embedding spaces to capture semantic relationships. The paper discusses recent advancements in FMs' reasoning abilities which we argue cannot be attributed to increased model size but to novel training techniques which yield learning phenomena like grokking. It also addresses the challenges in benchmarking FMs and compares their structure to the human brain. We argue that while FMs show promising developments in reasoning and knowledge representation, understanding their inner workings remains a significant challenge, similar to ongoing efforts in neuroscience to comprehend human brain function. Despite having some similarities, fundamental differences between FMs and the structure of the human brain warn us against making direct comparisons or expecting neuroscience to provide immediate insights into FM function.
Citations: 0
Machine Learning and Constraint Programming for Efficient Healthcare Scheduling
Pub Date : 2024-09-11 DOI: arxiv-2409.07547
Aymen Ben Said, Malek Mouhoub
Solving combinatorial optimization problems involves satisfying a set of hard constraints while optimizing some objectives. In this context, exact or approximate methods can be used. While exact methods guarantee the optimal solution, they often come with an exponential running time, as opposed to approximate methods that trade solution quality for a better running time. In this context, we tackle the Nurse Scheduling Problem (NSP). The NSP consists in assigning nurses to daily shifts within a planning horizon such that workload constraints are satisfied while hospital costs and nurse preferences are optimized. To solve the NSP, we propose implicit and explicit approaches. In the implicit solving approach, we rely on Machine Learning methods using historical data to learn and generate new solutions through the constraints and objectives that may be embedded in the learned patterns. To quantify the quality of using our implicit approach in capturing the embedded constraints and objectives, we rely on the Frobenius Norm, a quality measure used to compute the average error between the generated solutions and historical data. To compensate for the uncertainty related to the implicit approach given that the constraints and objectives may not be concretely visible in the produced solutions, we propose an alternative explicit approach where we first model the NSP using the Constraint Satisfaction Problem (CSP) framework. Then we develop Stochastic Local Search methods and a new Branch and Bound algorithm enhanced with constraint propagation techniques and variables/values ordering heuristics. Since our implicit approach may not guarantee the feasibility or optimality of the generated solution, we propose a data-driven approach to passively learn the NSP as a constraint network. The learned constraint network, formulated as a CSP, will then be solved using the methods we listed earlier.
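The abstract names the Frobenius Norm as the error measure between generated and historical schedules but does not give the exact formulation. As a rough illustration only, the sketch below assumes each schedule is a binary nurse-by-shift matrix and computes the Frobenius norm of their difference. The function name `frobenius_distance`, the matrix encoding, and the toy data are all hypothetical and are not taken from the paper, which may normalize or average the value differently.

```python
import math


def frobenius_distance(generated, historical):
    """Frobenius norm of the element-wise difference of two equal-shaped matrices."""
    same_rows = len(generated) == len(historical)
    same_cols = all(len(g) == len(h) for g, h in zip(generated, historical))
    if not (same_rows and same_cols):
        raise ValueError("schedules must have the same shape")
    squared_error = sum(
        (g - h) ** 2
        for g_row, h_row in zip(generated, historical)
        for g, h in zip(g_row, h_row)
    )
    return math.sqrt(squared_error)


# Hypothetical toy data: rows are nurses, columns are shifts, 1 = assigned.
historical = [[1, 0, 1], [0, 1, 0]]
generated = [[1, 1, 1], [0, 0, 0]]

# The two schedules differ in exactly two cells, so the distance is sqrt(2).
distance = frobenius_distance(generated, historical)
```

A distance of zero means the generated schedule reproduces the historical one exactly. Larger values indicate more disagreeing assignments, though the measure alone says nothing about whether the hard constraints are satisfied.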
Citations: 0