
Machine Learning Science and Technology: Latest Articles

ArtiSAN: navigating the complexity of material structures with deep reinforcement learning
IF 6.8 Zone 2 (Physics and Astrophysics) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-08-13 DOI: 10.1088/2632-2153/ad69ff
Jonas Elsborg, Arghya Bhowmik
Finding low-energy atomic ordering in compositionally complex materials is one of the hardest problems in materials discovery, the solution of which can lead to breakthroughs in functional materials—from alloys to ceramics. In this work, we present the Artificial Structure Arranging Net (ArtiSAN)—a reinforcement learning agent utilizing graph representation that is trained to find low-energy atomic configurations of multicomponent systems through a series of atomic switch operations. ArtiSAN is trained on small alloy supercells ranging from binary to septenary. Strikingly, ArtiSAN generalizes to much larger systems of more than a thousand atoms, which are inaccessible with state-of-the-art methods due to the combinatorially larger search space. The performance of the current ArtiSAN agent is tested and deployed on several compositions that can be correlated with known experimental and high-fidelity computational structures. ArtiSAN demonstrates transfer across size and composition and finds physically meaningful structures using no energy evaluation calls once fully trained. While ArtiSAN will require further modifications to capture all variability in structure search, it is a remarkable step towards solving the structural part of the problem of disordered materials discovery.
Citations: 0
Concept graph embedding models for enhanced accuracy and interpretability
IF 6.8 Zone 2 (Physics and Astrophysics) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-08-13 DOI: 10.1088/2632-2153/ad6ad2
Sangwon Kim, Byoung Chul Ko
In fields requiring high accountability, it is necessary to understand how deep learning models make decisions when analyzing the causes of image classification. Concept-based interpretation methods have recently been introduced to reveal the internal mechanisms of deep learning models using high-level concepts. However, such methods are constrained by a trade-off between accuracy and interpretability. For instance, in real-world environments, unlike in well-curated training data, the accurate prediction of expected concepts becomes a challenge owing to the various distortions and complexities introduced by different objects. To overcome this trade-off, we propose concept graph embedding models (CGEM), reflecting the complex dependencies and structures among concepts through the learning of mutual directionalities. The concept graph convolutional neural network (Concept GCN), a downstream task of CGEM, differs from previous methods that solely determine the presence of concepts because it performs a final classification based on the relationships between concepts learned through graph embedding. This process endows the model with high resilience even in the presence of incorrect concepts. In addition, we utilize a deformable bipartite GCN for object-centric concept encoding in the earlier stages, which enhances the homogeneity of the concepts. The experimental results show that, based on deformable concept encoding, the CGEM mitigates the trade-off between task accuracy and interpretability. Moreover, it was confirmed that this approach allows the model to increase the resilience and interpretability while maintaining robustness against various real-world concept distortions and incorrect concept interventions. Our code is available at https://github.com/jumpsnack/cgem.
Citations: 0
Virtual reality for understanding artificial-intelligence-driven scientific discovery with an application in quantum optics
IF 6.8 Zone 2 (Physics and Astrophysics) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-08-13 DOI: 10.1088/2632-2153/ad5fdb
Philipp Schmidt, Sören Arlt, Carlos Ruiz-Gonzalez, Xuemei Gu, Carla Rodríguez, Mario Krenn
Generative Artificial Intelligence (AI) models can propose solutions to scientific problems beyond human capability. To truly make conceptual contributions, researchers need to be capable of understanding the AI-generated structures and extracting the underlying concepts and ideas. When algorithms provide little explanatory reasoning alongside the output, scientists have to reverse-engineer the fundamental insights behind proposals based solely on examples. This task can be challenging as the output is often highly complex and thus not immediately accessible to humans. In this work we show how transferring part of the analysis process into an immersive virtual reality (VR) environment can assist researchers in developing an understanding of AI-generated solutions. We demonstrate the usefulness of VR in finding interpretable configurations of abstract graphs, representing Quantum Optics experiments. Thereby, we can manually discover new generalizations of AI-discoveries as well as new understanding in experimental quantum optics. Furthermore, it allows us to customize the search space in an informed way—as a human-in-the-loop—to achieve significantly faster subsequent discovery iterations. As concrete examples, with this technology, we discover a new resource-efficient 3-dimensional entanglement swapping scheme, as well as a 3-dimensional 4-particle Greenberger–Horne–Zeilinger-state analyzer. Our results show the potential of VR to enhance a researcher’s ability to derive knowledge from graph-based generative AI. This type of AI is a widely used abstract data representation in various scientific fields.
Citations: 0
Transfer learning with generative models for object detection on limited datasets
IF 6.8 Zone 2 (Physics and Astrophysics) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-08-11 DOI: 10.1088/2632-2153/ad65b5
M Paiano, S Martina, C Giannelli and F Caruso
The availability of data is limited in some fields, especially for object detection tasks, where it is necessary to have correctly labeled bounding boxes around each object. A notable example of such data scarcity is found in the domain of marine biology, where it is useful to develop methods to automatically detect submarine species for environmental monitoring. To address this data limitation, state-of-the-art machine learning strategies employ two main approaches. The first involves pretraining models on existing datasets before generalizing to the specific domain of interest. The second is to create synthetic datasets specifically tailored to the target domain using methods like copy-paste techniques or ad hoc simulators. The first strategy often faces a significant domain shift, while the second demands custom solutions crafted for the specific task. In response to these challenges, here we propose a transfer learning framework that is valid for a generic scenario. In this framework, generated images help to improve the performance of an object detector in a regime with few real data. This is achieved through a diffusion-based generative model that was pretrained on large generic datasets. Compared with the state of the art, we find that it is not necessary to fine-tune the generative model on the specific domain of interest. We believe that this is an important advance because it mitigates the labor-intensive task of manually labeling the images in object detection tasks. We validate our approach focusing on fish in an underwater environment, and on the more common domain of cars in an urban setting. Our method achieves detection performance comparable to models trained on thousands of images, using only a few hundred input data. Our results pave the way for new generative AI-based protocols for machine learning applications in various domains, ranging for instance from geophysics to biology and medicine.
Citations: 0
Trainability issues in quantum policy gradients
IF 6.8 Zone 2 (Physics and Astrophysics) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-08-05 DOI: 10.1088/2632-2153/ad6830
André Sequeira, Luis Paulo Santos and Luis Soares Barbosa
This research explores the trainability of Parameterized Quantum Circuit-based policies in Reinforcement Learning, an area that has recently seen a surge in empirical exploration. While some studies suggest improved sample complexity using quantum gradient estimation, the efficient trainability of these policies remains an open question. Our findings reveal significant challenges, including standard Barren Plateaus with exponentially small gradients and gradient explosion. These phenomena depend on the type of basis-state partitioning and the mapping of these partitions onto actions. For a polynomial number of actions, a trainable window can be ensured with a polynomial number of measurements if a contiguous-like partitioning of basis-states is employed. These results are empirically validated in a multi-armed bandit environment.
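The role of basis-state partitioning can be illustrated with a small classical sketch. It assumes the policy gives each action the total measurement probability of the basis states in that action's partition, which is a common construction for circuit-based policies and not necessarily the authors' exact one. The helper names, the parity-style alternative, and the example distribution are all hypothetical.

```python
def contiguous_policy(probs, num_actions):
    """Action a receives the probability mass of the a-th contiguous block.

    Assumes len(probs) is divisible by num_actions.
    """
    block = len(probs) // num_actions
    return [sum(probs[a * block:(a + 1) * block]) for a in range(num_actions)]


def parity_policy(probs):
    """Two actions, split by whether the basis-state index has an even
    or odd number of 1 bits (an illustrative non-contiguous partition)."""
    policy = [0.0, 0.0]
    for index, p in enumerate(probs):
        policy[bin(index).count("1") % 2] += p
    return policy


# Hypothetical measurement distribution over the 8 basis states of 3 qubits.
probs = [0.30, 0.20, 0.10, 0.10, 0.10, 0.10, 0.05, 0.05]

pi_contiguous = contiguous_policy(probs, num_actions=2)
pi_parity = parity_policy(probs)
```

Both functions return a valid probability distribution over actions. They differ only in which basis states are grouped together, and that grouping is the design choice the abstract ties to trainability.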
Citations: 0
Molecular relaxation by reverse diffusion with time step prediction
IF 6.8 Zone 2 (Physics and Astrophysics) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-08-05 DOI: 10.1088/2632-2153/ad652c
Khaled Kahouli, Stefaan Simon Pierre Hessmann, Klaus-Robert Müller, Shinichi Nakajima, Stefan Gugler and Niklas Wolf Andreas Gebauer
Molecular relaxation, finding the equilibrium state of a non-equilibrium structure, is an essential component of computational chemistry to understand reactivity. Classical force field (FF) methods often rely on insufficient local energy minimization, while neural network FF models require large labeled datasets encompassing both equilibrium and non-equilibrium structures. As a remedy, we propose MoreRed, molecular relaxation by reverse diffusion, a conceptually novel and purely statistical approach where non-equilibrium structures are treated as noisy instances of their corresponding equilibrium states. To enable the denoising of arbitrarily noisy inputs via a generative diffusion model, we further introduce a novel diffusion time step predictor. Notably, MoreRed learns a simpler pseudo potential energy surface (PES) instead of the complex physical PES. It is trained on a significantly smaller, and thus computationally cheaper, dataset consisting of solely unlabeled equilibrium structures, avoiding the computation of non-equilibrium structures altogether. We compare MoreRed to classical FFs, equivariant neural network FFs trained on a large dataset of equilibrium and non-equilibrium data, as well as a semi-empirical tight-binding model. To assess this quantitatively, we evaluate the root-mean-square deviation between the found equilibrium structures and the reference equilibrium structures as well as their energies.
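The root-mean-square deviation used for this comparison can be sketched as follows. The abstract does not say whether structures are aligned before the deviation is computed, so this sketch assumes the standard Kabsch alignment (optimal translation and rotation) and identical atom ordering in both structures. The function name and the toy coordinates are hypothetical.

```python
import numpy as np


def kabsch_rmsd(coords, reference):
    """RMSD between two (N, 3) coordinate arrays after optimal rigid alignment."""
    # Remove the translation by centering both structures on their centroids.
    p = coords - coords.mean(axis=0)
    q = reference - reference.mean(axis=0)
    # Kabsch algorithm: the SVD of the covariance matrix gives the rotation.
    u, _, vt = np.linalg.svd(p.T @ q)
    # Flip the last axis if needed so the result is a proper rotation.
    d = np.sign(np.linalg.det(u @ vt))
    rotation = u @ np.diag([1.0, 1.0, d]) @ vt
    diff = p @ rotation - q
    return float(np.sqrt((diff ** 2).sum() / len(q)))


# Toy three-atom reference geometry (illustrative values, in angstroms).
reference = np.array([[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]])

# The same geometry rotated by 90 degrees about z and shifted.
rot_z = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
moved = reference @ rot_z.T + np.array([1.0, -2.0, 0.5])

# A genuinely different geometry: one atom displaced.
distorted = reference.copy()
distorted[1, 0] += 0.2

rmsd_rigid = kabsch_rmsd(moved, reference)
rmsd_distorted = kabsch_rmsd(distorted, reference)
```

A rigid motion of the reference yields a deviation of zero up to floating-point error, while the distorted geometry yields a strictly positive value.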
Citations: 0
Multi-perspective feedback-attention coupling model for continuous-time dynamic graphs
IF 6.8 Zone 2 (Physics and Astrophysics) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-08-04 DOI: 10.1088/2632-2153/ad66af
Xiaobo Zhu, Yan Wu, Jin Che, Chao Wang, Liying Wang and Zhanheng Chen
Representation learning over graph networks has recently gained popularity, with many models showing promising results. However, several challenges remain: (1) most methods are designed for static or discrete-time dynamic graphs; (2) existing continuous-time dynamic graph algorithms focus on a single evolving perspective; and (3) many continuous-time dynamic graph approaches necessitate numerous temporal neighbors to capture long-term dependencies. In response, this paper introduces a Multi-Perspective Feedback-Attention Coupling (MPFA) model. MPFA incorporates information from both evolving and original perspectives to effectively learn the complex dynamics of dynamic graph evolution processes. The evolving perspective considers the current state of historical interaction events of nodes and uses a temporal attention module to aggregate current state information. This perspective also makes it possible to capture long-term dependencies of nodes using a small number of temporal neighbors. Meanwhile, the original perspective utilizes a feedback attention module with growth characteristic coefficients to aggregate the original state information of node interactions. Experimental results on one dataset that we compiled ourselves and seven public datasets validate the effectiveness and competitiveness of our proposed model.
Citations: 0
Towards robust data-driven automated recovery of symbolic conservation laws from limited data
IF 6.8 Zone 2 (Physics and Astrophysics) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-08-04 DOI: 10.1088/2632-2153/ad6390
Tracey Oellerich and Maria Emelianenko
Conservation laws are an inherent feature in many systems modeling real world phenomena, in particular, those modeling biological and chemical systems. If the form of the underlying dynamical system is known, linear algebra and algebraic geometry methods can be used to identify the conservation laws. Our work focuses on using data-driven methods to identify the conservation law(s) in the absence of the knowledge of system dynamics. We develop a robust data-driven computational framework that automates the process of identifying the number and type of the conservation law(s) while keeping the amount of required data to a minimum. We demonstrate that due to relative stability of singular vectors to noise we are able to reconstruct correct conservation laws without the need for excessive parameter tuning. While we focus primarily on biological examples, the framework proposed herein is suitable for a variety of data science applications and can be coupled with other machine learning approaches.
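The abstract's remark about singular vectors can be illustrated with a minimal sketch. It assumes the conservation law is linear in the state variables, which the abstract does not state, and it is not the authors' framework. The function name `find_linear_conservation_laws`, the tolerance `rel_tol`, and the toy reaction A <-> B are all hypothetical choices made for illustration. If c · x(t) is constant along a trajectory, then c is orthogonal to the mean-centered data, so it shows up as a right singular vector with a near-zero singular value.

```python
import numpy as np

def find_linear_conservation_laws(X, rel_tol=1e-2):
    """Return unit vectors c for which c @ x(t) is approximately constant.

    X has shape (n_samples, n_species). rel_tol is a hypothetical threshold:
    singular values below rel_tol * (largest singular value) count as zero.
    """
    Xc = X - X.mean(axis=0)                      # remove the conserved constant
    _, s, vt = np.linalg.svd(Xc, full_matrices=False)
    mask = s < rel_tol * s[0]                    # near-null directions
    return vt[mask], s

# Toy data: reversible reaction A <-> B with rates 2 (A->B) and 1 (B->A),
# starting from A = 1, B = 0, so A + B = 1 is conserved.
t = np.linspace(0.0, 5.0, 200)
a = 1.0 / 3.0 + (2.0 / 3.0) * np.exp(-3.0 * t)  # closed-form solution for A(t)
b = 1.0 - a
rng = np.random.default_rng(0)
X = np.column_stack([a, b]) + 1e-4 * rng.standard_normal((200, 2))

laws, singular_values = find_linear_conservation_laws(X)
print(laws)             # one row, proportional to (1, 1) up to sign
print(singular_values)  # a large value followed by a noise-level value
```

The number of rows returned is an estimate of the number of independent linear conservation laws. In practice the threshold would need to be chosen relative to the noise level of the measurements.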
Citations: 0
Coincident learning for unsupervised anomaly detection of scientific instruments
IF 6.8 Zone 2 (Physics and Astrophysics) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-08-04 DOI: 10.1088/2632-2153/ad64a6
Ryan Humble, Zhe Zhang, Finn O’Shea, Eric Darve and Daniel Ratner
Anomaly detection is an important task for complex scientific experiments and other complex systems (e.g. industrial facilities, manufacturing), where failures in a sub-system can lead to lost data, poor performance, or even damage to components. While scientific facilities generate a wealth of data, labeled anomalies may be rare (or even nonexistent) and expensive to acquire. Unsupervised approaches are therefore common and typically search for anomalies either by distance or density of examples in the input feature space (or some associated low-dimensional representation). This paper presents a novel approach called coincident learning for anomaly detection (CoAD), which is specifically designed for multi-modal tasks and identifies anomalies based on coincident behavior across two different slices of the feature space. We define an unsupervised metric by analogy with the supervised classification Fβ statistic. CoAD uses this metric to train an anomaly detection algorithm on unlabeled data, based on the expectation that anomalous behavior in one feature slice is coincident with anomalous behavior in the other. The method is illustrated using a synthetic outlier data set and an MNIST-based image data set, and is compared to the prior state of the art on two real-world tasks: a metal milling data set and our motivating task of identifying RF station anomalies in a particle accelerator.
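For reference, the supervised Fβ statistic mentioned in the abstract can be sketched as follows. This is only the standard supervised quantity, computed from labeled counts. The abstract does not give the definition of CoAD's unsupervised counterpart, so it is not reproduced here. The function name `f_beta` and the example counts are hypothetical.

```python
def f_beta(tp, fp, fn, beta=1.0):
    """Supervised F-beta score from true-positive, false-positive and
    false-negative counts. beta > 1 weights recall more than precision."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0  # convention chosen here to avoid dividing by zero
    b2 = beta ** 2
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)

# Hypothetical detector: 8 anomalies caught, 2 false alarms, 8 anomalies missed.
print(f_beta(8, 2, 8, beta=1.0))
print(f_beta(8, 2, 8, beta=2.0))  # lower: the missed anomalies count for more
```

Computing this score requires ground-truth labels, which is the limitation the abstract's unsupervised metric is meant to work around.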
Citations: 0
OmniJet-α: the first cross-task foundation model for particle physics
IF 6.8 Zone 2 (Physics and Astrophysics) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-08-01 DOI: 10.1088/2632-2153/ad66ad
Joschka Birk, Anna Hallin and Gregor Kasieczka
Foundation models are multi-dataset and multi-task machine learning methods that once pre-trained can be fine-tuned for a large variety of downstream applications. The successful development of such general-purpose models for physics data would be a major breakthrough as they could improve the achievable physics performance while at the same time drastically reduce the required amount of training time and data. We report significant progress on this challenge on several fronts. First, a comprehensive set of evaluation methods is introduced to judge the quality of an encoding from physics data into a representation suitable for the autoregressive generation of particle jets with transformer architectures (the common backbone of foundation models). These measures motivate the choice of a higher-fidelity tokenization compared to previous works. Finally, we demonstrate transfer learning between an unsupervised problem (jet generation) and a classic supervised task (jet tagging) with our new OmniJet-α model. This is the first successful transfer between two different and actively studied classes of tasks and constitutes a major step in the building of foundation models for particle physics.
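The idea of judging how faithfully physics data is encoded into tokens can be sketched with a toy example. The abstract does not specify the tokenizer or the evaluation measures, so everything below is an assumption: uniform binning is swapped in purely for illustration, and the three features, their ranges, and the function names `tokenize` and `detokenize` are hypothetical. The sketch maps each particle's feature vector to a single integer token, decodes it back to the bin center, and compares the reconstruction error for a coarse and a fine vocabulary.

```python
import numpy as np

def tokenize(x, lo, hi, n_bins):
    """Map each row of x (n_particles, n_features) to one integer token
    by uniform binning of every feature into n_bins bins."""
    idx = np.floor((x - lo) / (hi - lo) * n_bins).astype(int)
    idx = np.clip(idx, 0, n_bins - 1)
    # Combine the per-feature bin indices into a single token id.
    return np.ravel_multi_index(tuple(idx.T), (n_bins,) * x.shape[1])

def detokenize(tokens, lo, hi, n_bins, n_features):
    """Decode tokens back to feature vectors located at the bin centers."""
    idx = np.stack(np.unravel_index(tokens, (n_bins,) * n_features), axis=1)
    return lo + (idx + 0.5) * (hi - lo) / n_bins

# Hypothetical per-particle features with assumed ranges.
lo = np.array([0.0, -1.0, -1.0])
hi = np.array([1.0, 1.0, 1.0])
rng = np.random.default_rng(0)
x = rng.uniform(lo, hi, size=(1000, 3))

errors = {}
for n_bins in (4, 32):  # vocabulary sizes 4**3 = 64 and 32**3 = 32768
    tokens = tokenize(x, lo, hi, n_bins)
    x_rec = detokenize(tokens, lo, hi, n_bins, x.shape[1])
    errors[n_bins] = np.abs(x - x_rec).mean()
    print(n_bins, errors[n_bins])
```

A finer vocabulary lowers the reconstruction error at the cost of a larger token space for the downstream generative model, which is the kind of trade-off an encoding-quality evaluation is meant to expose.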
Citations: 0