Machine Learning Science and Technology: Latest Articles (Page 3)

ArtiSAN: navigating the complexity of material structures with deep reinforcement learning

IF 6.8 | Zone 2, Physics and Astrophysics | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Machine Learning Science and Technology

Pub Date : 2024-08-13 DOI: 10.1088/2632-2153/ad69ff

Jonas Elsborg, Arghya Bhowmik

Finding low-energy atomic ordering in compositionally complex materials is one of the hardest problems in materials discovery, the solution of which can lead to breakthroughs in functional materials—from alloys to ceramics. In this work, we present the Artificial Structure Arranging Net (ArtiSAN)—a reinforcement learning agent utilizing graph representation that is trained to find low-energy atomic configurations of multicomponent systems through a series of atomic switch operations. ArtiSAN is trained on small alloy supercells ranging from binary to septenary. Strikingly, ArtiSAN generalizes to much larger systems of more than a thousand atoms, which are inaccessible with state-of-the-art methods due to the combinatorially larger search space. The performance of the current ArtiSAN agent is tested and deployed on several compositions that can be correlated with known experimental and high-fidelity computational structures. ArtiSAN demonstrates transfer across size and composition and finds physically meaningful structures using no energy evaluation calls once fully trained. While ArtiSAN will require further modifications to capture all variability in structure search, it is a remarkable step towards solving the structural part of the problem of disordered materials discovery.
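The "atomic switch operations" mentioned in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the ring lattice, the `toy_energy` function, and the greedy acceptance rule are hypothetical stand-ins. ArtiSAN itself selects swaps with a trained reinforcement learning agent over a graph representation and, per the abstract, needs no energy evaluations once trained, so this is not the authors' method, only a picture of the search space it moves through.

```python
import random

def toy_energy(sites):
    """Hypothetical energy on a ring: every unlike nearest-neighbour pair contributes -1."""
    n = len(sites)
    return -sum(1 for i in range(n) if sites[i] != sites[(i + 1) % n])

def swap(sites, i, j):
    """Return a new configuration with the species on sites i and j exchanged."""
    new = list(sites)
    new[i], new[j] = new[j], new[i]
    return new

def greedy_swap_search(sites, steps, seed=0):
    """Propose random swaps and keep those that do not raise the toy energy.

    A greedy rule replaces the learned policy here; composition is conserved
    because a swap only exchanges the positions of two atoms.
    """
    rng = random.Random(seed)
    current = list(sites)
    energy = toy_energy(current)
    for _ in range(steps):
        i, j = rng.sample(range(len(current)), 2)
        candidate = swap(current, i, j)
        candidate_energy = toy_energy(candidate)
        if candidate_energy <= energy:
            current, energy = candidate, candidate_energy
    return current, energy

# A fully segregated binary "alloy" of four A and four B atoms as the starting point.
start = ["A"] * 4 + ["B"] * 4
final, e_final = greedy_swap_search(start, steps=200)
```

Because each action is a swap, the number of atoms of each species never changes, which is why the search can be framed as a reordering problem rather than a composition search.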

Citations: 0

Virtual reality for understanding artificial-intelligence-driven scientific discovery with an application in quantum optics

IF 6.8 | Zone 2, Physics and Astrophysics | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Machine Learning Science and Technology

Pub Date : 2024-08-13 DOI: 10.1088/2632-2153/ad5fdb

Philipp Schmidt, Sören Arlt, Carlos Ruiz-Gonzalez, Xuemei Gu, Carla Rodríguez, Mario Krenn

Generative Artificial Intelligence (AI) models can propose solutions to scientific problems beyond human capability. To truly make conceptual contributions, researchers need to be capable of understanding the AI-generated structures and extracting the underlying concepts and ideas. When algorithms provide little explanatory reasoning alongside the output, scientists have to reverse-engineer the fundamental insights behind proposals based solely on examples. This task can be challenging as the output is often highly complex and thus not immediately accessible to humans. In this work we show how transferring part of the analysis process into an immersive virtual reality (VR) environment can assist researchers in developing an understanding of AI-generated solutions. We demonstrate the usefulness of VR in finding interpretable configurations of abstract graphs, representing Quantum Optics experiments. Thereby, we can manually discover new generalizations of AI-discoveries as well as new understanding in experimental quantum optics. Furthermore, it allows us to customize the search space in an informed way—as a human-in-the-loop—to achieve significantly faster subsequent discovery iterations. As concrete examples, with this technology, we discover a new resource-efficient 3-dimensional entanglement swapping scheme, as well as a 3-dimensional 4-particle Greenberger–Horne–Zeilinger-state analyzer. Our results show the potential of VR to enhance a researcher’s ability to derive knowledge from graph-based generative AI. This type of AI is a widely used abstract data representation in various scientific fields.

Citations: 0

Transfer learning with generative models for object detection on limited datasets

IF 6.8 | Zone 2, Physics and Astrophysics | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Machine Learning Science and Technology

Pub Date : 2024-08-11 DOI: 10.1088/2632-2153/ad65b5

M Paiano, S Martina, C Giannelli and F Caruso

The availability of data is limited in some fields, especially for object detection tasks, where it is necessary to have correctly labeled bounding boxes around each object. A notable example of such data scarcity is found in the domain of marine biology, where it is useful to develop methods to automatically detect submarine species for environmental monitoring. To address this data limitation, state-of-the-art machine learning strategies employ two main approaches. The first involves pretraining models on existing datasets before generalizing to the specific domain of interest. The second strategy is to create synthetic datasets specifically tailored to the target domain using methods like copy-paste techniques or ad-hoc simulators. The first strategy often faces a significant domain shift, while the second demands custom solutions crafted for the specific task. In response to these challenges, here we propose a transfer learning framework that is valid for a generic scenario. In this framework, generated images help to improve the performance of an object detector in a regime with few real data. This is achieved through a diffusion-based generative model that was pretrained on large generic datasets. In contrast to the state of the art, we find that it is not necessary to fine-tune the generative model on the specific domain of interest. We believe that this is an important advance because it mitigates the labor-intensive task of manually labeling images in object detection tasks. We validate our approach focusing on fish in an underwater environment, and on the more common domain of cars in an urban setting. Our method achieves detection performance comparable to models trained on thousands of images, using only a few hundred input samples. Our results pave the way for new generative AI-based protocols for machine learning applications in various domains, ranging, for instance, from geophysics to biology and medicine.

Citations: 0

Trainability issues in quantum policy gradients

IF 6.8 | Zone 2, Physics and Astrophysics | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Machine Learning Science and Technology

Pub Date : 2024-08-05 DOI: 10.1088/2632-2153/ad6830

André Sequeira, Luis Paulo Santos and Luis Soares Barbosa

This research explores the trainability of Parameterized Quantum Circuit-based policies in Reinforcement Learning, an area that has recently seen a surge in empirical exploration. While some studies suggest improved sample complexity using quantum gradient estimation, the efficient trainability of these policies remains an open question. Our findings reveal significant challenges, including standard Barren Plateaus with exponentially small gradients and gradient explosion. These phenomena depend on the type of basis-state partitioning and the mapping of these partitions onto actions. For a polynomial number of actions, a trainable window can be ensured with a polynomial number of measurements if a contiguous-like partitioning of basis-states is employed. These results are empirically validated in a multi-armed bandit environment.
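To make "basis-state partitioning and the mapping of these partitions onto actions" concrete, here is a minimal sketch. It assumes one plausible reading of a contiguous partitioning: the 2^n computational basis states are split into equal consecutive blocks, and the probability of an action is the total measurement probability of its block. The function names and the example distribution are hypothetical, and the paper's exact "contiguous-like" definition may differ.

```python
def contiguous_action(basis_index, n_qubits, n_actions):
    """Map a basis-state index to an action by splitting the index range into consecutive blocks."""
    return basis_index * n_actions // (2 ** n_qubits)

def policy_from_probs(probs, n_qubits, n_actions):
    """Sum the measurement probabilities of the basis states in each action's block."""
    policy = [0.0] * n_actions
    for basis_index, p in enumerate(probs):
        policy[contiguous_action(basis_index, n_qubits, n_actions)] += p
    return policy

# Hypothetical measurement distribution of a 3-qubit circuit (8 basis states, sums to 1).
probs = [0.1, 0.2, 0.05, 0.15, 0.1, 0.1, 0.2, 0.1]

pi_two = policy_from_probs(probs, n_qubits=3, n_actions=2)   # blocks {0..3}, {4..7}
pi_four = policy_from_probs(probs, n_qubits=3, n_actions=4)  # blocks of two states each
```

In practice the probabilities would be estimated from a finite number of circuit measurements rather than given exactly, which is where the abstract's polynomial measurement count enters.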

Citations: 0

OmniJet-α: the first cross-task foundation model for particle physics

IF 6.8 | Zone 2, Physics and Astrophysics | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Machine Learning Science and Technology

Pub Date : 2024-08-01 DOI: 10.1088/2632-2153/ad66ad

Joschka Birk, Anna Hallin and Gregor Kasieczka

Foundation models are multi-dataset and multi-task machine learning methods that once pre-trained can be fine-tuned for a large variety of downstream applications. The successful development of such general-purpose models for physics data would be a major breakthrough as they could improve the achievable physics performance while at the same time drastically reduce the required amount of training time and data. We report significant progress on this challenge on several fronts. First, a comprehensive set of evaluation methods is introduced to judge the quality of an encoding from physics data into a representation suitable for the autoregressive generation of particle jets with transformer architectures (the common backbone of foundation models). These measures motivate the choice of a higher-fidelity tokenization compared to previous works. Finally, we demonstrate transfer learning between an unsupervised problem (jet generation) and a classic supervised task (jet tagging) with our new OmniJet-α model. This is the first successful transfer between two different and actively studied classes of tasks and constitutes a major step in the building of foundation models for particle physics.

Citations: 0

Efficient Bayesian inference using physics-informed invertible neural networks for inverse problems

IF 6.8 | Zone 2, Physics and Astrophysics | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Machine Learning Science and Technology

Pub Date : 2024-07-22 DOI: 10.1088/2632-2153/ad5f74

Xiaofei Guan, Xintong Wang, Hao Wu, Zihao Yang and Peng Yu

This paper presents an innovative approach to tackle Bayesian inverse problems using physics-informed invertible neural networks (PI-INN). Serving as a neural operator model, PI-INN employs an invertible neural network (INN) to elucidate the relationship between the parameter field and the solution function in latent variable spaces. Specifically, the INN decomposes the latent variable of the parameter field into two distinct components: the expansion coefficients that represent the solution to the forward problem, and the noise that captures the inherent uncertainty associated with the inverse problem. Through precise estimation of the forward mapping and preservation of statistical independence between expansion coefficients and latent noise, PI-INN offers an accurate and efficient generative model for resolving Bayesian inverse problems, even in the absence of labeled data. For a given solution function, PI-INN can provide tractable and accurate estimates of the posterior distribution of the underlying parameter field. Moreover, capitalizing on the INN’s characteristics, we propose a novel independent loss function to effectively ensure the independence of the INN’s decomposition results. The efficacy and precision of the proposed PI-INN are demonstrated through a series of numerical experiments.
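The key property an invertible neural network relies on is an exact, cheap inverse. A common building block with that property is the affine coupling layer, sketched below in scalar form. The abstract does not say which invertible architecture PI-INN uses, so the layer type, the scale function `s`, and the shift function `t` are all assumptions chosen for illustration.

```python
import math

def s(x):
    """Hypothetical scale function; in a real INN this would be a small neural network."""
    return math.tanh(x)

def t(x):
    """Hypothetical shift function; likewise a stand-in for a learned network."""
    return 0.5 * x

def forward(x1, x2):
    """Affine coupling: x1 passes through unchanged and parameterizes an affine map of x2."""
    y1 = x1
    y2 = x2 * math.exp(s(x1)) + t(x1)
    return y1, y2

def inverse(y1, y2):
    """Exact inverse: since y1 == x1, the same s and t can be recomputed and undone."""
    x1 = y1
    x2 = (y2 - t(y1)) * math.exp(-s(y1))
    return x1, x2

x = (0.3, -1.2)
y = forward(*x)
x_back = inverse(*y)
```

Neither `s` nor `t` has to be invertible itself, which is why such layers can be made expressive while the overall map stays bijective.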

Citations: 0

Datacube segmentation via deep spectral clustering

IF 6.8 | Zone 2, Physics and Astrophysics | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Machine Learning Science and Technology

Pub Date : 2024-07-21 DOI: 10.1088/2632-2153/ad622f

Alessandro Bombini, Fernando García-Avello Bofías, Caterina Bracci, Michele Ginolfi and Chiara Ruberto

Extended vision techniques are ubiquitous in physics. However, the data cubes stemming from such analysis often pose a challenge in their interpretation, due to the intrinsic difficulty in discerning the relevant information from the spectra composing the data cube. Furthermore, the huge dimensionality of data cube spectra makes their statistical interpretation a complex task; nevertheless, this complexity contains a massive amount of statistical information that can be exploited in an unsupervised manner to outline some essential properties of the case study at hand, e.g. it is possible to obtain an image segmentation via (deep) clustering of the data cube's spectra, performed in a suitably defined low-dimensional embedding space. To tackle this topic, we explore the possibility of applying unsupervised clustering methods in encoded space, i.e. performing deep clustering on the spectral properties of data cube pixels. A statistical dimensional reduction is performed by an ad hoc trained (variational) AutoEncoder, in charge of mapping spectra into lower dimensional metric spaces, while the clustering process is performed by a (learnable) iterative K-means clustering algorithm. We apply this technique to two different use cases, of different physical origins: a set of macro mapping X-ray fluorescence (MA-XRF) synthetic data on pictorial artworks, and a dataset of simulated astrophysical observations.

Citations: 0

Causal hybrid modeling with double machine learning—applications in carbon flux modeling

IF 6.8 | Zone 2, Physics and Astrophysics | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Machine Learning Science and Technology

Pub Date : 2024-07-18 DOI: 10.1088/2632-2153/ad5a60

Kai-Hendrik Cohrs, Gherardo Varando, Nuno Carvalhais, Markus Reichstein and Gustau Camps-Valls

Hybrid modeling integrates machine learning with scientific knowledge to enhance interpretability, generalization, and adherence to natural laws. Nevertheless, equifinality and regularization biases pose challenges in hybrid modeling to achieve these purposes. This paper introduces a novel approach to estimating hybrid models via a causal inference framework, specifically employing double machine learning (DML) to estimate causal effects. We showcase its use for the Earth sciences on two problems related to carbon dioxide fluxes. In the Q10 model, we demonstrate that DML-based hybrid modeling is superior in estimating causal parameters over end-to-end deep neural network approaches, proving efficiency, robustness to bias from regularization methods, and circumventing equifinality. Our approach, applied to carbon flux partitioning, exhibits flexibility in accommodating heterogeneous causal effects. The study emphasizes the necessity of explicitly defining causal graphs and relationships, advocating for this as a general best practice. We encourage the continued exploration of causality in hybrid models for more interpretable and trustworthy results in knowledge-guided machine learning.
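The core of double machine learning can be sketched with the standard partialling-out estimator for a partially linear model Y = θ·T + g(X) + ε. The example below is an illustration under stated assumptions, not the paper's Q10 or flux-partitioning setup: the data are synthetic, the nuisance learners are plain least-squares lines (`fit_line`, a hypothetical helper standing in for flexible ML models), and two-fold cross-fitting is used.

```python
import random

def fit_line(xs, ys):
    """Least-squares line y = slope * x + intercept (stand-in for an ML nuisance model)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def dml_theta(X, T, Y, n_folds=2):
    """Partialling-out DML: residualize T and Y on X with cross-fitting, then regress residuals."""
    n = len(X)
    res_t, res_y = [0.0] * n, [0.0] * n
    for k in range(n_folds):
        test = [i for i in range(n) if i % n_folds == k]
        train = [i for i in range(n) if i % n_folds != k]
        # Nuisance models are fitted on the other fold(s) only.
        slope_t, icpt_t = fit_line([X[i] for i in train], [T[i] for i in train])
        slope_y, icpt_y = fit_line([X[i] for i in train], [Y[i] for i in train])
        for i in test:
            res_t[i] = T[i] - (slope_t * X[i] + icpt_t)
            res_y[i] = Y[i] - (slope_y * X[i] + icpt_y)
    # Final stage: slope of the outcome residuals on the treatment residuals.
    return sum(a * b for a, b in zip(res_t, res_y)) / sum(a * a for a in res_t)

# Synthetic data with a known causal parameter (an assumption made for this demo).
rng = random.Random(0)
theta_true = 1.5
X = [rng.uniform(-1, 1) for _ in range(2000)]
T = [0.5 * x + rng.gauss(0, 1) for x in X]
Y = [theta_true * t + 2.0 * x + rng.gauss(0, 1) for x, t in zip(X, T)]
theta_hat = dml_theta(X, T, Y)
```

Fitting the nuisance models on held-out folds is what keeps regularization bias in those models from leaking into the estimate of θ, which is the robustness property the abstract emphasizes.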

Citations: 0

Retrieving past quantum features with deep hybrid classical-quantum reservoir computing

IF 6.8 · Zone 2, Physics and Astrophysics · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Machine Learning Science and Technology

Pub Date : 2024-07-18 DOI: 10.1088/2632-2153/ad5f12

Johannes Nokkala, Gian Luca Giorgi and Roberta Zambrini

Machine learning techniques have achieved impressive results in recent years and the possibility of harnessing the power of quantum physics opens new promising avenues to speed up classical learning methods. Rather than viewing classical and quantum approaches as exclusive alternatives, their integration into hybrid designs has gathered increasing interest, as seen in variational quantum algorithms, quantum circuit learning, and kernel methods. Here we introduce deep hybrid classical-quantum reservoir computing for temporal processing of quantum states where information about, for instance, the entanglement or the purity of past input states can be extracted via a single-step measurement. We find that the hybrid setup cascading two reservoirs not only inherits the strengths of both of its constituents but is even more than just the sum of its parts, outperforming comparable non-hybrid alternatives. The quantum layer is within reach of state-of-the-art multimode quantum optical platforms while the classical layer can be implemented in silico.
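
For readers unfamiliar with reservoir computing, the sketch below shows only the classical ingredient: a fixed random recurrent network (an echo state network) whose linear readout is trained to recall an input from several steps in the past. It does not model the quantum layer or the authors' cascaded architecture. The network size, spectral radius, delay, and all names are illustrative assumptions.

```python
import numpy as np

def run_reservoir(inputs, n_nodes=100, spectral_radius=0.9, input_scale=0.5, seed=0):
    """Drive a fixed random tanh reservoir and return the state at every time step."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=(n_nodes, n_nodes))
    # Rescale recurrent weights so the largest eigenvalue magnitude is spectral_radius.
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    w_in = rng.uniform(-input_scale, input_scale, n_nodes)
    state = np.zeros(n_nodes)
    states = np.empty((len(inputs), n_nodes))
    for t, u_t in enumerate(inputs):
        state = np.tanh(w @ state + w_in * u_t)
        states[t] = state
    return states

def train_readout(states, targets, ridge=1e-6):
    """Ridge regression: the readout is the only trained part of the model."""
    gram = states.T @ states + ridge * np.eye(states.shape[1])
    return np.linalg.solve(gram, states.T @ targets)

rng = np.random.default_rng(42)
u = rng.uniform(-1.0, 1.0, 3000)        # random input sequence
delay, washout, split = 3, 100, 2000    # recall the input from 3 steps earlier

states = run_reservoir(u)
x_all = states[washout:]                          # discard the initial transient
y_all = u[washout - delay: len(u) - delay]        # target: u[t - delay]

w_out = train_readout(x_all[:split], y_all[:split])
pred = x_all[split:] @ w_out
corr = float(np.corrcoef(pred, y_all[split:])[0, 1])
print(f"correlation between recalled and true past input: {corr:.3f}")
```

The readout sees only the current reservoir state, so any successful recall of a past input relies on the memory carried by the reservoir dynamics. This is the classical analogue of extracting features of past input states from a single-step measurement.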

Citations: 0

Ultrafast jet classification at the HL-LHC

IF 6.8 · Zone 2, Physics and Astrophysics · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Machine Learning Science and Technology

Pub Date : 2024-07-17 DOI: 10.1088/2632-2153/ad5f10

Patrick Odagiu, Zhiqiang Que, Javier Duarte, Johannes Haller, Gregor Kasieczka, Artur Lobanov, Vladimir Loncar, Wayne Luk, Jennifer Ngadiuba, Maurizio Pierini, Philipp Rincke, Arpita Seksaria, Sioni Summers, Andre Sznajder, Alexander Tapper and Thea K Årrestad

Three machine learning models are used to perform jet origin classification. These models are optimized for deployment on a field-programmable gate array (FPGA) device. In this context, we demonstrate how latency and resource consumption scale with the input size and choice of algorithm. Moreover, the models proposed here are designed to work on the type of data and under the foreseen conditions at the CERN Large Hadron Collider during its high-luminosity phase. Through quantization-aware training and efficient synthesis for a specific FPGA, we show that inference on the nanosecond (ns) scale with complex architectures such as Deep Sets and Interaction Networks is feasible at a relatively low computational resource cost.
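
As a rough illustration of two ideas named in the abstract, the sketch below runs a tiny Deep Sets forward pass with weights rounded to a signed fixed-point grid. It is not the authors' model. The layer sizes, the 8-bit format with 6 fractional bits, the mean pooling, and all names are hypothetical. Quantization-aware training proper would apply such a quantizer during training; only the forward pass is shown here.

```python
import numpy as np

def quantize_fixed_point(x, total_bits=8, frac_bits=6):
    """Round to a signed fixed-point grid with saturation at the representable range."""
    scale = 2.0 ** frac_bits
    q_min = -(2 ** (total_bits - 1))
    q_max = 2 ** (total_bits - 1) - 1
    return np.clip(np.round(x * scale), q_min, q_max) / scale

def deep_sets_forward(constituents, params):
    """Deep Sets: per-constituent map phi, order-independent pooling, then rho."""
    hidden = np.maximum(constituents @ params["phi_w"] + params["phi_b"], 0.0)  # ReLU
    pooled = hidden.mean(axis=0)            # pooling removes any dependence on ordering
    return pooled @ params["rho_w"] + params["rho_b"]

rng = np.random.default_rng(0)
n_constituents, n_features, n_hidden, n_classes = 8, 3, 16, 5   # hypothetical sizes

params = {
    "phi_w": rng.normal(0.0, 0.5, (n_features, n_hidden)),
    "phi_b": rng.normal(0.0, 0.1, n_hidden),
    "rho_w": rng.normal(0.0, 0.5, (n_hidden, n_classes)),
    "rho_b": rng.normal(0.0, 0.1, n_classes),
}
# Every weight and bias is snapped to the fixed-point grid before inference.
q_params = {name: quantize_fixed_point(value) for name, value in params.items()}

jet = rng.normal(size=(n_constituents, n_features))     # one jet = a set of constituents
logits = deep_sets_forward(jet, q_params)
shuffled_logits = deep_sets_forward(jet[rng.permutation(n_constituents)], q_params)

print("logits:", np.round(logits, 4))
print("same output after shuffling constituents:", np.allclose(logits, shuffled_logits))
```

Narrow fixed-point weights are what keep FPGA resource use low, and the pooling step is what lets the model treat a jet as an unordered set of constituents.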

Citations: 0