
Latest articles from Machine Learning Science and Technology

DiffLense: a conditional diffusion model for super-resolution of gravitational lensing data
IF 6.8 | CAS Zone 2 (Physics and Astronomy) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-09-18 | DOI: 10.1088/2632-2153/ad76f8
Pranath Reddy, Michael W Toomey, Hanna Parul and Sergei Gleyzer
Gravitational lensing data is frequently collected at low resolution due to instrumental limitations and observing conditions. Machine learning-based super-resolution techniques offer a method to enhance the resolution of these images, enabling more precise measurements of lensing effects and a better understanding of the matter distribution in the lensing system. This enhancement can significantly improve our knowledge of the distribution of mass within the lensing galaxy and its environment, as well as the properties of the background source being lensed. Traditional super-resolution techniques typically learn a mapping function from lower-resolution to higher-resolution samples. However, these methods are often constrained by their dependence on optimizing a fixed distance function, which can result in the loss of intricate details crucial for astrophysical analysis. In this work, we introduce DiffLense, a novel super-resolution pipeline based on a conditional diffusion model specifically designed to enhance the resolution of gravitational lensing images obtained from the Hyper Suprime-Cam Subaru Strategic Program (HSC-SSP). Our approach adopts a generative model, leveraging the detailed structural information present in Hubble Space Telescope (HST) counterparts. The diffusion model, trained to generate HST data, is conditioned on HSC data pre-processed with denoising techniques and thresholding to significantly reduce noise and background interference. This process leads to a more distinct and less overlapping conditional distribution during the model’s training phase. We demonstrate that DiffLense outperforms existing state-of-the-art single-image super-resolution techniques, particularly in retaining the fine details necessary for astrophysical analyses.
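The core mechanism described here, conditioning a diffusion model's noise predictor on the low-resolution counterpart, can be sketched in a few lines. The following toy is not the DiffLense code: the array shapes, the noise schedule values, and the `eps_model` placeholder (standing in for a trained U-Net) are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for a paired sample: a "HST" high-res target and its
# degraded "HSC" counterpart used as the conditioning input.
hst = rng.normal(size=(8, 8))
hsc = hst[::2, ::2] + 0.1 * rng.normal(size=(4, 4))

# Linear variance schedule for T diffusion steps.
T = 100
betas = np.linspace(1e-4, 0.02, T)
alphas_bar = np.cumprod(1.0 - betas)

def q_sample(x0, t, eps):
    """Forward process: noise the clean image x0 to diffusion step t."""
    return np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * eps

def eps_model(x_t, t, cond):
    """Placeholder noise predictor (a real model is a conditional U-Net):
    crudely upsample the condition and invert the forward process with it."""
    up = np.kron(cond, np.ones((2, 2)))  # naive 2x nearest-neighbour upsampling
    return (x_t - np.sqrt(alphas_bar[t]) * up) / np.sqrt(1.0 - alphas_bar[t])

# One training sample of the denoising objective: predict the injected noise.
t = 50
eps = rng.normal(size=hst.shape)
x_t = q_sample(hst, t, eps)
loss = np.mean((eps_model(x_t, t, hsc) - eps) ** 2)
print(float(loss))
```

Training would minimise this loss over many (HSC, HST) pairs; sampling then runs the learned reverse process conditioned on a new HSC image.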
Citations: 0
Masked particle modeling on sets: towards self-supervised high energy physics foundation models
IF 6.8 | CAS Zone 2 (Physics and Astronomy) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-09-16 | DOI: 10.1088/2632-2153/ad64a8
Tobias Golling, Lukas Heinrich, Michael Kagan, Samuel Klein, Matthew Leigh, Margarita Osadchy and John Andrew Raine
We propose masked particle modeling (MPM) as a self-supervised method for learning generic, transferable, and reusable representations on unordered sets of inputs for use in high energy physics (HEP) scientific data. This work provides a novel scheme to perform masked modeling based pre-training to learn permutation invariant functions on sets. More generally, this work provides a step towards building large foundation models for HEP that can be generically pre-trained with self-supervised learning and later fine-tuned for a variety of down-stream tasks. In MPM, particles in a set are masked and the training objective is to recover their identity, as defined by a discretized token representation of a pre-trained vector quantized variational autoencoder. We study the efficacy of the method in samples of high energy jets at collider physics experiments, including studies on the impact of discretization, permutation invariance, and ordering. We also study the fine-tuning capability of the model, showing that it can be adapted to tasks such as supervised and weakly supervised jet classification, and that the model can transfer efficiently with small fine-tuning data sets to new classes and new data domains.
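The training objective, mask elements of a set and recover their discrete VQ-VAE tokens via cross-entropy, can be illustrated with a toy sketch. Nothing below is the authors' implementation: the data is random, and a nearest-codebook lookup stands in for both the pre-trained tokenizer and the (untrained) permutation-equivariant backbone.

```python
import numpy as np

rng = np.random.default_rng(1)

# A "jet" as an unordered set of 16 particles, each a 3-vector of features.
particles = rng.normal(size=(16, 3))

# Stand-in for a pre-trained VQ-VAE tokenizer: each particle's discrete
# identity is the index of its nearest codebook vector.
codebook = rng.normal(size=(32, 3))
tokens = np.argmin(((particles[:, None, :] - codebook[None, :, :]) ** 2).sum(-1),
                   axis=1)

# Mask a fixed subset of the set (a boolean mask, not positions: order-free).
mask = np.zeros(16, dtype=bool)
mask[rng.choice(16, size=5, replace=False)] = True
masked_input = np.where(mask[:, None], 0.0, particles)

# Dummy backbone: per-particle logits over codebook entries.
logits = -((masked_input[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)

# Cross-entropy on masked positions only: recover each masked token.
log_probs = logits - np.log(np.exp(logits).sum(-1, keepdims=True))
loss = -log_probs[mask, tokens[mask]].mean()
print(float(loss))
```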
Citations: 0
Transforming the bootstrap: using transformers to compute scattering amplitudes in planar N =...
IF 6.8 | CAS Zone 2 (Physics and Astronomy) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-09-15 | DOI: 10.1088/2632-2153/ad743e
Tianji Cai, Garrett W Merz, François Charton, Niklas Nolte, Matthias Wilhelm, Kyle Cranmer and Lance J Dixon
We pursue the use of deep learning methods to improve state-of-the-art computations in theoretical high-energy physics. Planar Super Yang–Mills theory is a close cousin to the theory that describes Higgs boson production at the Large Hadron Collider; its scattering amplitudes are large mathematical expressions containing integer coefficients. In this paper, we apply transformers to predict these coefficients. The problem can be formulated in a language-like representation amenable to standard cross-entropy training objectives. We design two related experiments and show that the model achieves high accuracy on both tasks. Our work shows that transformers can be applied successfully to problems in theoretical physics that require exact solutions.
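A minimal illustration of what a "language-like representation" of integer coefficients might look like: a hypothetical digit-level tokenization (the paper's actual encoding may differ) that turns each signed coefficient into a token sequence suitable for cross-entropy training.

```python
# Hypothetical digit-level encoding of a signed integer coefficient into a
# token sequence a transformer can be trained on with cross-entropy.
def tokenize(coef: int) -> list[str]:
    sign = "-" if coef < 0 else "+"
    return [sign] + list(str(abs(coef))) + ["<eos>"]

def detokenize(tokens: list[str]) -> int:
    """Inverse mapping: recover the exact integer from its tokens."""
    magnitude = int("".join(tokens[1:-1]))
    return -magnitude if tokens[0] == "-" else magnitude

print(tokenize(-3720))  # → ['-', '3', '7', '2', '0', '<eos>']
```

Because the encoding is exactly invertible, a model that predicts every token correctly reproduces the coefficient exactly, which is what "exact solutions" demands.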
Citations: 0
Learning on the correctness class for domain inverse problems of gravimetry
IF 6.8 | CAS Zone 2 (Physics and Astronomy) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-09-11 | DOI: 10.1088/2632-2153/ad72cc
Yihang Chen and Wenbin Li
We consider end-to-end learning approaches for inverse problems of gravimetry. Due to the ill-posedness of inverse gravimetry, the reliability of learning approaches is questionable. To deal with this problem, we propose the strategy of learning on the correctness class. Well-posedness theorems are employed when designing the neural-network architecture and constructing the training set. Given the density-contrast function as a priori information, the domain of mass can be uniquely determined under certain constraints, and the domain inverse problem is a correctness class of inverse gravimetry. Under this correctness class, we design the neural network for learning by mimicking the level-set formulation of inverse gravimetry. Numerical examples illustrate that the method is able to recover mass models with non-constant density contrast.
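The level-set formulation referenced here can be sketched as follows: the unknown mass domain D is represented as the positive region of a level-set function, and the known (a priori) density contrast is applied only on that region. The grid, the disc-shaped domain, and the linear contrast below are invented toy choices, not the paper's setup.

```python
import numpy as np

# Level-set representation of an unknown mass domain D on a grid:
# D = {x : phi(x) > 0}, so the density model is rho(x) * 1[phi(x) > 0],
# with the density contrast rho given as a priori information.
n = 64
xs = np.linspace(-1, 1, n)
x, y = np.meshgrid(xs, xs)

phi = 0.4 - np.sqrt(x**2 + y**2)   # toy level-set: a disc of radius 0.4
rho = 1.0 + 0.5 * x                # toy non-constant density contrast
density = rho * (phi > 0)          # what a level-set network would output

# Discretised checks: area fraction of D and total mass of the model.
area_fraction = (phi > 0).mean()            # ≈ pi * 0.4**2 / 4 on [-1,1]^2
mass = density.sum() * (2 / (n - 1)) ** 2   # ≈ integral of rho over D
print(float(area_fraction), float(mass))
```

A learning approach in this spirit would parameterise `phi` by a network and fit the forward-modelled gravity field to data, with uniqueness supplied by the correctness-class constraints rather than by the sketch above.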
Citations: 0
A combined modeling method for complex multi-fidelity data fusion
IF 6.8 | CAS Zone 2 (Physics and Astronomy) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-09-10 | DOI: 10.1088/2632-2153/ad718f
Lei Tang, Feng Liu, Anping Wu, Yubo Li, Wanqiu Jiang, Qingfeng Wang and Jun Huang
Currently, mainstream methods for multi-fidelity data fusion have achieved great success in many fields, but they generally suffer from poor scalability. This paper therefore proposes a combination modeling method for complex multi-fidelity data fusion, aimed at solving modeling problems involving three fidelity levels, and explores a general solution for any n fidelity levels. Unlike the traditional direct modeling method, the Multi-Fidelity Deep Neural Network (MFDNN), the proposed method is an indirect modeling method. Experimental results on three representative benchmark functions and on the prediction of SG6043 airfoil aerodynamic performance show that combination modeling has the following advantages: (1) it can quickly establish the mapping relationship between high-, medium-, and low-fidelity data; (2) it can effectively solve the data imbalance problem in multi-fidelity modeling; (3) compared with MFDNN, it has stronger noise resistance and higher prediction accuracy. Additionally, this paper discusses the scalability of the method when n = 4 and n = 5, providing a reference for further research on the combined modeling method.
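The indirect, correction-based structure that distinguishes combination modeling from fitting the high-fidelity data directly can be shown with a 1-D toy: fit the abundant low-fidelity data first, then fit small corrections between successive fidelity levels. The fidelity functions, sample counts, and polynomial fits (standing in for neural networks) are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy fidelity hierarchy: many low-, some medium-, few high-fidelity samples.
def f_lo(x):  return np.sin(8 * x)
def f_mid(x): return np.sin(8 * x) + 0.3 * x
def f_hi(x):  return np.sin(8 * x) + 0.3 * x + 0.1 * x**2

x_lo, x_mid, x_hi = rng.random(200), rng.random(50), rng.random(10)

# Indirect modeling: fit the low-fidelity trend once, then fit *corrections*
# between successive fidelity levels instead of fitting f_hi directly.
lo_fit = np.polyfit(x_lo, f_lo(x_lo), 7)
mid_corr = np.polyfit(x_mid, f_mid(x_mid) - np.polyval(lo_fit, x_mid), 2)
hi_corr = np.polyfit(
    x_hi,
    f_hi(x_hi) - np.polyval(lo_fit, x_hi) - np.polyval(mid_corr, x_hi),
    2,
)

def predict(x):
    """High-fidelity prediction = base model + stacked corrections."""
    return (np.polyval(lo_fit, x) + np.polyval(mid_corr, x)
            + np.polyval(hi_corr, x))

x_test = np.linspace(0, 1, 101)
err = float(np.max(np.abs(predict(x_test) - f_hi(x_test))))
print(err)
```

Because each correction is smooth and small, the scarce high-fidelity set only has to pin down a low-complexity residual, which is how the scheme sidesteps the data-imbalance problem.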
Citations: 0
Towards a comprehensive visualisation of structure in large scale data sets
IF 6.8 | CAS Zone 2 (Physics and Astronomy) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-09-09 | DOI: 10.1088/2632-2153/ad6fea
Joan Garriga and Frederic Bartumeus
Dimensionality reduction methods are fundamental to the exploration and visualisation of large data sets. Basic requirements for unsupervised data exploration are flexibility and scalability. However, current methods have computational limitations that restrict our ability to explore data structures to the lower range of scales. We focus on t-SNE and propose a chunk-and-mix protocol that enables the parallel implementation of this algorithm, as well as a self-adaptive parametric scheme that facilitates its parametric configuration. As a proof of concept, we present the pt-SNE algorithm, a parallel version of Barnes-Hut-SNE (an implementation of t-SNE). In pt-SNE, a single free parameter for the size of the neighbourhood, namely the perplexity, modulates the visualisation of the data structure at different scales, from local to global. Thanks to parallelisation, the runtime of the algorithm remains almost independent of the perplexity, which extends the range of scales to be analysed. The pt-SNE converges to a good global embedding comparable to current solutions, although it adds a little noise at the local scale. This noise illustrates an unavoidable trade-off between computational speed and accuracy. We expect the same approach to be applicable to faster embedding algorithms than Barnes-Hut-SNE, such as Fast-Fourier Interpolation-based t-SNE or Uniform Manifold Approximation and Projection, thus extending the state of the art and allowing a more comprehensive visualisation and analysis of data structures.
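The chunk-and-mix idea can be sketched structurally: each round reshuffles the points into chunks, each chunk is embedded independently (so chunks can run in parallel), and the reshuffling ("mix") lets points interact with different neighbours in the next round. The per-chunk update below is a trivial placeholder for a Barnes-Hut t-SNE worker, and the chunk counts and data are invented.

```python
import numpy as np

rng = np.random.default_rng(3)

def chunk_and_mix(data, n_chunks, n_rounds, embed_chunk):
    """Sketch of a chunk-and-mix protocol: shuffle membership each round,
    update each chunk's embedding independently, repeat."""
    n = len(data)
    emb = rng.normal(scale=1e-4, size=(n, 2))   # shared 2-D embedding state
    for _ in range(n_rounds):
        order = rng.permutation(n)              # "mix": reshuffle membership
        for idx in np.array_split(order, n_chunks):
            # Each iteration here could be dispatched to a parallel worker.
            emb[idx] = embed_chunk(data[idx], emb[idx])
    return emb

def embed_chunk(x, e):
    """Placeholder per-chunk update: pull each point's embedding slightly
    toward that of its nearest chunk-mate in data space."""
    d = ((x[:, None, :] - x[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d, np.inf)
    return 0.9 * e + 0.1 * e[d.argmin(axis=1)]

data = rng.normal(size=(120, 5))
emb = chunk_and_mix(data, n_chunks=4, n_rounds=3, embed_chunk=embed_chunk)
print(emb.shape)
```

The point of the structure is that the expensive inner call touches only one chunk at a time, so wall-clock time is governed by chunk size rather than by the neighbourhood (perplexity) setting.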
Citations: 0
Designing quantum multi-category classifier from the perspective of brain processing information
IF 6.8 | CAS Zone 2 (Physics and Astronomy) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-09-06 | DOI: 10.1088/2632-2153/ad7570
Xiaodong Ding, Jinchen Xu, Zhihui Song, Yifan Hou, Zheng Shan
In the field of machine learning, the multi-category classification problem plays a crucial role. Solving the problem has a profound impact on driving the innovation and development of machine learning techniques and addressing complex problems in the real world. In recent years, researchers have begun to focus on utilizing quantum computing to solve the multi-category classification problem. Some studies have shown that the process of processing information in the brain may be related to quantum phenomena, with different brain regions having neurons with different structures. Inspired by this, we design a quantum multi-category classifier model from this perspective for the first time. The model employs a heterogeneous population of quantum neural networks (QNNs) to simulate the cooperative work of multiple different brain regions. When processing information, these heterogeneous clusters of QNNs allow for simultaneous execution on different quantum computers, thus simulating the brain’s ability to utilize multiple brain regions working in concert to maintain the robustness of the model. By setting the number of heterogeneous QNN clusters and parameterizing the number of stacks of unit layers in the quantum circuit, the model demonstrates excellent scalability in dealing with different types of data and different numbers of classes in the classification problem. Based on the attention mechanism of the brain, we integrate the processing results of heterogeneous QNN clusters to achieve high accuracy in classification. Finally, we conducted classification simulation experiments on different datasets. The results show that our method exhibits strong robustness and scalability. In particular, on different subsets of the MNIST dataset, its classification accuracy improves by up to about 5% compared to other quantum multiclassification algorithms. This result is the state of the art among simulated quantum classification models and exceeds the performance of classical classifiers with a considerable number of trainable parameters on some subsets of the MNIST dataset.
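The integration step, combining a heterogeneous pool of classifiers with attention-style confidence weights, can be sketched with classical stand-ins for the QNN clusters. Everything here (the random "experts", their skill levels, the confidence weighting) is an invented illustration, not the paper's attention mechanism.

```python
import numpy as np

rng = np.random.default_rng(4)

def make_expert(skill):
    """A stand-in for one QNN cluster: a noisy classifier whose logit for the
    true class is boosted by `skill`, returning a class-probability vector."""
    def expert(true_class, n_classes=10):
        logits = rng.normal(size=n_classes)
        logits[true_class] += skill
        p = np.exp(logits - logits.max())
        return p / p.sum()
    return expert

# A heterogeneous pool: members of differing architecture/quality.
experts = [make_expert(s) for s in (1.0, 2.0, 4.0)]

def integrate(preds):
    """Attention-style integration: weight each member by the confidence
    (max probability) of its own prediction, then average."""
    conf = np.array([p.max() for p in preds])
    w = np.exp(conf) / np.exp(conf).sum()
    return sum(wi * p for wi, p in zip(w, preds))

correct = 0
n_trials = 200
for _ in range(n_trials):
    c = int(rng.integers(10))
    correct += int(integrate([e(c) for e in experts]).argmax() == c)
print(correct / n_trials)
```

The combined prediction is typically better than the weaker members alone because confident (usually more skilled) members dominate the weighted average.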
Citations: 0
Photonic modes prediction via multi-modal diffusion model
IF 6.8 | CAS Zone 2 (Physics and Astronomy) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-09-05 | DOI: 10.1088/2632-2153/ad743f
Jinyang Sun, Xi Chen, Xiumei Wang, Dandan Zhu, Xingping Zhou
The concept of photonic modes is a cornerstone of optics and photonics, describing how light propagates. Maxwell’s equations are used to calculate the mode field from structural information, but this process requires a great deal of computation, especially when handling a three-dimensional model. To overcome this obstacle, we introduce a multi-modal diffusion model to predict the photonic modes of a given structure. The Contrastive Language–Image Pre-training (CLIP) model is used to build the connections between photonic structures and the corresponding modes. We then employ the Stable Diffusion (SD) model to generate optical fields from structure information. Our work introduces multi-modal deep learning to construct a complex mapping between structural information and optical fields as high-dimensional vectors, and generates optical-field images based on this mapping.
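A CLIP-style connection between structure embeddings and mode-field embeddings reduces to a symmetric contrastive (InfoNCE) loss over paired embeddings. The sketch below uses random correlated vectors instead of real encoders; the batch size, dimension, and temperature are invented.

```python
import numpy as np

rng = np.random.default_rng(5)

def normalize(v):
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

# Toy paired embeddings: each "mode" embedding is a noisy copy of its
# paired "structure" embedding, standing in for two trained encoders.
n, d = 8, 16
struct_emb = normalize(rng.normal(size=(n, d)))
mode_emb = normalize(struct_emb + 0.3 * rng.normal(size=(n, d)))

# Symmetric InfoNCE over the n x n cosine-similarity matrix, temperature tau:
# each structure should be most similar to its own mode, and vice versa.
tau = 0.1
sim = struct_emb @ mode_emb.T / tau

def xent(logits):
    """Cross-entropy with the matching pair (the diagonal) as the target."""
    m = logits.max(axis=1, keepdims=True)
    logp = logits - (m + np.log(np.exp(logits - m).sum(axis=1, keepdims=True)))
    return -np.mean(np.diag(logp))

loss = 0.5 * (xent(sim) + xent(sim.T))
print(float(loss))
```

Once such an alignment is trained, the structure embedding can serve as the conditioning signal for a diffusion model that generates the corresponding optical-field image.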
Citations: 0
An exponential reduction in training data sizes for machine learning derived entanglement witnesses 机器学习推导纠缠见证的训练数据量指数级减少
IF 6.8 2区 物理与天体物理 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2024-09-05 DOI: 10.1088/2632-2153/ad7457
Aiden R Rosebush, Alexander C B Greenwood, Brian T Kirby, Li Qian
We propose a support vector machine (SVM) based approach for generating an entanglement witness that requires exponentially less training data than previously proposed methods. SVMs generate hyperplanes represented by a weighted sum of expectation values of local observables whose coefficients are optimized to sum to a positive number for all separable states and a negative number for as many entangled states as possible near a specific target state. Previous SVM-based approaches for entanglement witness generation used large amounts of randomly generated separable states to perform training, a task with considerable computational overhead. Here, we propose a method for orienting the witness hyperplane using only the significantly smaller set of states consisting of the eigenstates of the generalized Pauli matrices and a set of entangled states near the target entangled states. With the orientation of the witness hyperplane set by the SVM, we tune the plane’s placement using a differential program that ensures perfect classification accuracy on a limited test set as well as maximal noise tolerance. For N qubits, the SVM portion of this approach requires only O(6^N) training states, whereas an existing method needs O(2^(4^N)). We use this method to construct witnesses of 4 and 5 qubit GHZ states with coefficients agreeing with stabilizer formalism witnesses to within 3.7 percent and 1 percent, respectively. We also use the same training states to generate novel 4 and 5 qubit W state witnesses. Finally, we computationally verify these witnesses on small test sets and propose methods for further verification.
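The O(6^N) training set the abstract refers to is the set of N-fold tensor products of the six single-qubit Pauli eigenstates. A minimal sketch of enumerating that set (a generic construction for illustration, not the authors' code):

```python
import numpy as np
from itertools import product

# Eigenstates of the single-qubit Pauli matrices:
# |0>, |1> (Z), |+>, |-> (X), |+i>, |-i> (Y) -- six states in total.
s2 = 1 / np.sqrt(2)
PAULI_EIGENSTATES = [
    np.array([1, 0], dtype=complex),   # Z, eigenvalue +1
    np.array([0, 1], dtype=complex),   # Z, eigenvalue -1
    np.array([s2, s2], dtype=complex),       # X, +1
    np.array([s2, -s2], dtype=complex),      # X, -1
    np.array([s2, 1j * s2]),                 # Y, +1
    np.array([s2, -1j * s2]),                # Y, -1
]

def separable_training_states(n_qubits):
    """All N-fold tensor products of single-qubit Pauli eigenstates:
    6**N separable product states."""
    states = []
    for combo in product(PAULI_EIGENSTATES, repeat=n_qubits):
        psi = combo[0]
        for factor in combo[1:]:
            psi = np.kron(psi, factor)   # build the product state qubit by qubit
        states.append(psi)
    return states

states = separable_training_states(2)
print(len(states))   # 6**2 = 36 product states for N = 2
print(np.allclose([np.vdot(p, p) for p in states], 1.0))   # all normalized
```

Feature vectors of Pauli expectation values computed on these states would then feed the SVM that orients the witness hyperplane.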
Citations: 0
Regulating the development of accurate data-driven physics-informed deformation models 规范准确的数据驱动型物理信息变形模型的开发
IF 6.8 2区 物理与天体物理 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2024-09-04 DOI: 10.1088/2632-2153/ad7192
Will Newman, Jamshid Ghaboussi, Michael Insana
The challenge posed by the inverse problem associated with ultrasonic elasticity imaging is well matched to the capabilities of data-driven solutions. This report describes how data properties and the time sequence by which the data are introduced during training influence deformation-model accuracy and training times. Our goal is to image the elastic modulus of soft linear-elastic media as accurately as possible within a limited volume. To monitor progress during training, we introduce metrics describing convergence rate and stress entropy to guide data acquisition and other timing features. For example, a regularization term in the loss function may be introduced and later removed to speed and stabilize developing deformation models as well as to establish stopping rules for neural-network convergence. Images of a 14.4 cm³ volume within a 3D software phantom visually indicate the quality of the modulus images produced over a range of training variables. The results show that a data-driven method constrained by the physics of a deformed solid will lead to quantitatively accurate 3D elastic modulus images with minimal artifacts.
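The training schedule the abstract sketches — a regularization term that is introduced and then phased out, plus a convergence-based stopping rule — can be illustrated on a toy linear inverse problem. The decay rate and tolerance below are arbitrary assumptions for illustration, not the authors' pipeline:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy inverse problem: recover a stiffness-like parameter vector w_true
# from noisy linear measurements y = X @ w_true + noise.
X = rng.normal(size=(200, 10))
w_true = rng.normal(size=10)
y = X @ w_true + 0.01 * rng.normal(size=200)

w = np.zeros(10)
lr, lam = 0.01, 1.0   # lam: L2 regularization weight, phased out below
prev_loss = np.inf
for step in range(5000):
    resid = X @ w - y
    loss = 0.5 * np.mean(resid**2) + 0.5 * lam * np.sum(w**2)
    grad = X.T @ resid / len(y) + lam * w
    w -= lr * grad
    lam *= 0.995   # gradually remove the regularization term
    # stopping rule: relative loss change below tolerance
    if abs(prev_loss - loss) < 1e-10 * max(prev_loss, 1.0):
        break
    prev_loss = loss

print(np.linalg.norm(w - w_true) < 0.05)   # recovered parameters near truth
```

Early on the penalty keeps the iterates stable; as it decays toward zero the fit converges to the unregularized solution, and the stopping rule ends training once the loss plateaus.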
Citations: 0