Pub Date: 2024-09-18 | DOI: 10.1088/2632-2153/ad76f8
Pranath Reddy, Michael W Toomey, Hanna Parul and Sergei Gleyzer
Gravitational lensing data is frequently collected at low resolution due to instrumental limitations and observing conditions. Machine learning-based super-resolution techniques offer a method to enhance the resolution of these images, enabling more precise measurements of lensing effects and a better understanding of the matter distribution in the lensing system. This enhancement can significantly improve our knowledge of the distribution of mass within the lensing galaxy and its environment, as well as the properties of the background source being lensed. Traditional super-resolution techniques typically learn a mapping function from lower-resolution to higher-resolution samples. However, these methods are often constrained by their dependence on optimizing a fixed distance function, which can result in the loss of intricate details crucial for astrophysical analysis. In this work, we introduce DiffLense, a novel super-resolution pipeline based on a conditional diffusion model specifically designed to enhance the resolution of gravitational lensing images obtained from the Hyper Suprime-Cam Subaru Strategic Program (HSC-SSP). Our approach adopts a generative model, leveraging the detailed structural information present in Hubble Space Telescope (HST) counterparts. The diffusion model, trained to generate HST data, is conditioned on HSC data pre-processed with denoising techniques and thresholding to significantly reduce noise and background interference. This process leads to a more distinct and less overlapping conditional distribution during the model’s training phase. We demonstrate that DiffLense outperforms existing state-of-the-art single-image super-resolution techniques, particularly in retaining the fine details necessary for astrophysical analyses.
Title: "DiffLense: a conditional diffusion model for super-resolution of gravitational lensing data" (Machine Learning: Science and Technology)
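The conditional-diffusion idea in the DiffLense abstract can be illustrated with a minimal numpy sketch: a closed-form forward noising step and the denoising training objective, conditioned on a low-resolution image. The shapes, the linear noise schedule, and the `toy_denoiser` placeholder are all assumptions for illustration, not the authors' pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins: x0 plays the role of a high-resolution (HST-like) cutout and
# y its low-resolution (HSC-like) counterpart used for conditioning.
x0 = rng.normal(size=(8, 8))
y = x0[::2, ::2]                      # crude 2x-downsampled conditioning image

# Standard DDPM bookkeeping: linear noise schedule and its cumulative product.
T = 100
betas = np.linspace(1e-4, 0.02, T)
alpha_bar = np.cumprod(1.0 - betas)

def q_sample(x0, t, eps):
    """Forward process: noise x0 directly to step t in closed form."""
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps

def toy_denoiser(x_t, y, t):
    """Placeholder for the conditional network eps_theta(x_t, y, t)."""
    y_up = np.repeat(np.repeat(y, 2, axis=0), 2, axis=1)  # inject conditioning
    return x_t - 0.1 * y_up           # arbitrary toy computation, not trained

t = 50
eps = rng.normal(size=x0.shape)
x_t = q_sample(x0, t, eps)
loss = np.mean((toy_denoiser(x_t, y, t) - eps) ** 2)  # denoising objective
```

In a real pipeline the placeholder would be a trained U-Net and the loss would be backpropagated; here it only shows where the HSC conditioning enters the objective.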
Pub Date: 2024-09-16 | DOI: 10.1088/2632-2153/ad64a8
Tobias Golling, Lukas Heinrich, Michael Kagan, Samuel Klein, Matthew Leigh, Margarita Osadchy and John Andrew Raine
We propose masked particle modeling (MPM) as a self-supervised method for learning generic, transferable, and reusable representations on unordered sets of inputs for use in high energy physics (HEP) scientific data. This work provides a novel scheme to perform masked-modeling-based pre-training to learn permutation invariant functions on sets. More generally, this work provides a step towards building large foundation models for HEP that can be generically pre-trained with self-supervised learning and later fine-tuned for a variety of downstream tasks. In MPM, particles in a set are masked and the training objective is to recover their identity, as defined by a discretized token representation of a pre-trained vector quantized variational autoencoder. We study the efficacy of the method in samples of high energy jets at collider physics experiments, including studies on the impact of discretization, permutation invariance, and ordering. We also study the fine-tuning capability of the model, showing that it can be adapted to tasks such as supervised and weakly supervised jet classification, and that the model can transfer efficiently with small fine-tuning data sets to new classes and new data domains.
Title: "Masked particle modeling on sets: towards self-supervised high energy physics foundation models"
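A minimal sketch of the masked-set objective described above, assuming the particles have already been discretized to codebook tokens by a pre-trained VQ-VAE; the mean-pooling "model" is an invented stand-in, chosen only because it is trivially permutation invariant.

```python
import numpy as np

# A "jet" as an unordered set of six particles, each already discretized to
# one of V codebook tokens by a (hypothetical) pre-trained VQ-VAE.
V = 16
tokens = np.array([3, 7, 7, 1, 12, 5])
mask = np.array([True, False, False, True, False, False])  # positions to hide

# A deliberately simple permutation-invariant "model": predict every masked
# token from the mean-pooled one-hot encoding of the visible particles.
onehot = np.eye(V)[tokens]
context = onehot[~mask].mean(axis=0)            # order-independent summary
logits = np.tile(context, (int(mask.sum()), 1))

# MPM objective: cross-entropy on the masked positions only.
logp = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
loss = -logp[np.arange(int(mask.sum())), tokens[mask]].mean()

# Shuffling the visible particles leaves the summary (hence the loss) unchanged.
shuffled = onehot[~mask][::-1].mean(axis=0)
```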
Pub Date: 2024-09-15 | DOI: 10.1088/2632-2153/ad743e
Tianji Cai, Garrett W Merz, François Charton, Niklas Nolte, Matthias Wilhelm, Kyle Cranmer and Lance J Dixon
We pursue the use of deep learning methods to improve state-of-the-art computations in theoretical high-energy physics. Planar Super Yang–Mills theory is a close cousin to the theory that describes Higgs boson production at the Large Hadron Collider; its scattering amplitudes are large mathematical expressions containing integer coefficients. In this paper, we apply transformers to predict these coefficients. The problem can be formulated in a language-like representation amenable to standard cross-entropy training objectives. We design two related experiments and show that the model achieves high accuracy on both tasks. Our work shows that transformers can be applied successfully to problems in theoretical physics that require exact solutions.
Title: "Transforming the bootstrap: using transformers to compute scattering amplitudes in planar N =..."
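The "language-like representation" of integer coefficients can be as simple as digit-level tokenization, which makes next-token cross-entropy directly applicable. The vocabulary below is a hypothetical example, not the paper's actual encoding.

```python
# Hypothetical digit-level tokenization: each signed integer coefficient
# becomes a short token sequence bracketed by start/end markers.
VOCAB = ["<s>", "</s>", "-"] + [str(d) for d in range(10)]
TOK = {t: i for i, t in enumerate(VOCAB)}

def encode(coeff):
    """Integer -> token ids, e.g. -512 -> <s> - 5 1 2 </s>."""
    return [TOK["<s>"]] + [TOK[c] for c in str(coeff)] + [TOK["</s>"]]

def decode(ids):
    """Token ids -> integer (inverse of encode)."""
    return int("".join(VOCAB[i] for i in ids[1:-1]))
```

Exact round-tripping is the point: unlike regression, a correct token sequence reproduces the coefficient with no numerical error.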
Pub Date: 2024-09-11 | DOI: 10.1088/2632-2153/ad72cc
Yihang Chen and Wenbin Li
We consider end-to-end learning approaches for inverse problems of gravimetry. Due to the ill-posedness of inverse gravimetry, the reliability of learning approaches is questionable. To deal with this problem, we propose the strategy of learning on the correctness class. The well-posedness theorems are employed when designing the neural-network architecture and constructing the training set. Given the density-contrast function as a priori information, the domain of mass can be uniquely determined under certain constraints, and the domain inverse problem is a correctness class of inverse gravimetry. Under this correctness class, we design the neural network for learning by mimicking the level-set formulation of inverse gravimetry. Numerical examples illustrate that the method is able to recover mass models with non-constant density contrast.
Title: "Learning on the correctness class for domain inverse problems of gravimetry"
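The level-set idea referenced above can be sketched in a few lines: the mass domain is the region where a level-set function is negative, and the a priori density contrast fills exactly that region. All numbers (grid, disc radius, contrast value) are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Level-set parameterisation of the mass domain: phi < 0 marks the interior.
x, z = np.meshgrid(np.linspace(-1, 1, 64), np.linspace(0, 1, 32))
phi = x**2 + (z - 0.5)**2 - 0.3**2      # a buried disc of radius 0.3

rho = 250.0                              # assumed known density contrast
density = rho * (phi < 0)                # density = contrast on {phi < 0}, else 0

cell_area = (2 / 63) * (1 / 31)          # grid spacing in x times spacing in z
mass = density.sum() * cell_area         # approximate total anomalous mass
```

Because the contrast is fixed a priori, the unknown reduces to the domain geometry encoded by `phi`, which is what the network in the paper learns to recover.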
Pub Date: 2024-09-10 | DOI: 10.1088/2632-2153/ad718f
Lei Tang, Feng Liu, Anping Wu, Yubo Li, Wanqiu Jiang, Qingfeng Wang and Jun Huang
Currently, mainstream methods for multi-fidelity data fusion have achieved great success in many fields, but they generally suffer from poor scalability. Therefore, this paper proposes a combination modeling method for complex multi-fidelity data fusion, devoted to solving the modeling problems with three types of multi-fidelity data fusion, and explores a general solution for any n types of multi-fidelity data fusion. Unlike the traditional direct modeling method, the Multi-Fidelity Deep Neural Network (MFDNN), the proposed method is an indirect modeling method. The experimental results on three representative benchmark functions and the prediction tasks of SG6043 airfoil aerodynamic performance show that combination modeling has the following advantages: (1) It can quickly establish the mapping relationship between high-, medium-, and low-fidelity data. (2) It can effectively solve the data imbalance problem in multi-fidelity modeling. (3) Compared with MFDNN, it has stronger noise resistance and higher prediction accuracy. Additionally, this paper discusses the scalability problem of the method when n = 4 and n = 5, providing a reference for further research on the combined modeling method.
Title: "A combined modeling method for complex multi-fidelity data fusion"
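One way to read "indirect combination modeling" is to fit the cheap low-fidelity trend first and then model only the discrepancy from the scarce higher-fidelity data; chaining such steps extends naturally from two to n fidelity levels. The sketch below illustrates that two-step chain on made-up synthetic functions and polynomial surrogates, not the paper's MFDNN-style architecture.

```python
import numpy as np

# Synthetic stand-ins: plentiful cheap low-fidelity data, scarce high-fidelity
# data that differs by a smooth discrepancy (both functions are invented here).
f_lo = lambda x: np.sin(2 * x)
f_hi = lambda x: np.sin(2 * x) + 0.3 * x + 0.1

x_lo = np.linspace(0, 3, 200)            # 200 low-fidelity samples
x_hi = np.linspace(0, 3, 8)              # only 8 high-fidelity samples

# Step 1: surrogate for the low-fidelity trend.
lo_fit = np.polynomial.Polynomial.fit(x_lo, f_lo(x_lo), deg=9)

# Step 2: model only the low->high discrepancy from the few expensive points;
# repeating this step per fidelity level gives the combined, indirect model.
delta_fit = np.polynomial.Polynomial.fit(x_hi, f_hi(x_hi) - lo_fit(x_hi), deg=1)

predict = lambda x: lo_fit(x) + delta_fit(x)

x_test = np.linspace(0, 3, 50)
err = float(np.max(np.abs(predict(x_test) - f_hi(x_test))))
```

Because step 2 fits only a simple correction, eight high-fidelity points suffice, which is also how the approach sidesteps data imbalance between fidelity levels.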
Pub Date: 2024-09-09 | DOI: 10.1088/2632-2153/ad6fea
Joan Garriga and Frederic Bartumeus
Dimensionality reduction methods are fundamental to the exploration and visualisation of large data sets. Basic requirements for unsupervised data exploration are flexibility and scalability. However, current methods have computational limitations that restrict our ability to explore data structures to the lower range of scales. We focus on t-SNE and propose a chunk-and-mix protocol that enables the parallel implementation of this algorithm, as well as a self-adaptive parametric scheme that facilitates its parametric configuration. As a proof of concept, we present the pt-SNE algorithm, a parallel version of Barnes-Hut-SNE (an implementation of t-SNE). In pt-SNE, a single free parameter for the size of the neighbourhood, namely the perplexity, modulates the visualisation of the data structure at different scales, from local to global. Thanks to parallelisation, the runtime of the algorithm remains almost independent of the perplexity, which extends the range of scales to be analysed. pt-SNE converges to a good global embedding comparable to current solutions, although it adds little noise at the local scale. This noise illustrates an unavoidable trade-off between computational speed and accuracy. We expect the same approach to be applicable to faster embedding algorithms than Barnes-Hut-SNE, such as Fast-Fourier Interpolation-based t-SNE or Uniform Manifold Approximation and Projection, thus extending the state of the art and allowing a more comprehensive visualisation and analysis of data structures.
Title: "Towards a comprehensive visualisation of structure in large scale data sets"
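The role of perplexity as a neighbourhood-size knob comes from standard t-SNE preprocessing: for each point, a Gaussian bandwidth is binary-searched so that the entropy of the conditional neighbour distribution matches the target perplexity. A single-point numpy sketch (random data; the target value 30 is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 5))
d2 = ((X[0] - X[1:]) ** 2).sum(axis=1)   # squared distances from point 0

def perplexity(sigma):
    """Effective neighbourhood size exp(H(p_{j|0})) for bandwidth sigma."""
    p = np.exp(-d2 / (2 * sigma**2))
    p = p / p.sum()
    H = -(p * np.log(p + 1e-300)).sum()  # epsilon guards log(0) underflow
    return np.exp(H)

# Standard t-SNE preprocessing step: binary-search sigma so that the
# perplexity (a monotone increasing function of sigma) hits the target.
target, lo, hi = 30.0, 1e-3, 1e3
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if perplexity(mid) < target:
        lo = mid
    else:
        hi = mid
sigma = 0.5 * (lo + hi)
```

Small perplexity thus probes local structure and large perplexity global structure, which is why decoupling runtime from perplexity widens the range of scales that can be explored.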
In the field of machine learning, the multi-category classification problem plays a crucial role. Solving the problem has a profound impact on driving the innovation and development of machine learning techniques and addressing complex problems in the real world. In recent years, researchers have begun to focus on utilizing quantum computing to solve the multi-category classification problem. Some studies have shown that the process of processing information in the brain may be related to quantum phenomena, with different brain regions having neurons with different structures. Inspired by this, we design a quantum multi-category classifier model from this perspective for the first time. The model employs a heterogeneous population of quantum neural networks (QNNs) to simulate the cooperative work of multiple different brain regions. When processing information, these heterogeneous clusters of QNNs allow for simultaneous execution on different quantum computers, thus simulating the brain’s ability to utilize multiple brain regions working in concert to maintain the robustness of the model. By setting the number of heterogeneous QNN clusters and parameterizing the number of stacks of unit layers in the quantum circuit, the model demonstrates excellent scalability in dealing with different types of data and different numbers of classes in the classification problem. Based on the attention mechanism of the brain, we integrate the processing results of heterogeneous QNN clusters to achieve high accuracy in classification. Finally, we conducted classification simulation experiments on different datasets. The results show that our method exhibits strong robustness and scalability. On different subsets of the MNIST dataset, its classification accuracy improves by up to about 5% compared to other quantum multiclassification algorithms.
This result becomes the state-of-the-art simulation result for quantum classification models and exceeds the performance of classical classifiers with a considerable number of trainable parameters on some subsets of the MNIST dataset.
Title: "Designing quantum multi-category classifier from the perspective of brain processing information"
Authors: Xiaodong Ding, Jinchen Xu, Zhihui Song, Yifan Hou, Zheng Shan
Pub Date: 2024-09-06 | DOI: 10.1088/2632-2153/ad7570
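The integration step, combining the outputs of several heterogeneous clusters through an attention mechanism, can be sketched classically. The entropy-based scoring rule below is an invented stand-in for the paper's brain-inspired attention; the cluster outputs are random placeholders rather than real QNN results.

```python
import numpy as np

rng = np.random.default_rng(3)
n_clusters, n_classes = 3, 4

# Stand-ins for the class-probability outputs of three heterogeneous QNN
# clusters evaluated on one sample (each row sums to 1).
raw = rng.random((n_clusters, n_classes))
probs = raw / raw.sum(axis=1, keepdims=True)

# Attention over clusters: weight confident (low-entropy) clusters more.
# This particular scoring rule is an illustrative assumption.
entropy = -(probs * np.log(probs)).sum(axis=1)
att = np.exp(-entropy) / np.exp(-entropy).sum()

combined = att @ probs                  # attention-weighted class distribution
pred = int(np.argmax(combined))
```

Adding clusters or classes only grows the `probs` array, which mirrors the scalability claim: the integration rule itself is unchanged.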
Pub Date: 2024-09-05 | DOI: 10.1088/2632-2153/ad743f
Jinyang Sun, Xi Chen, Xiumei Wang, Dandan Zhu, Xingping Zhou
The concept of photonic modes, which describe the propagation of light, is a cornerstone of optics and photonics. Maxwell’s equations determine the mode field from the structure information, but solving them demands substantial computation, especially for three-dimensional models. To overcome this obstacle, we introduce a multi-modal diffusion model to predict the photonic modes of a given structure. The Contrastive Language–Image Pre-training (CLIP) model is used to build connections between photonic structures and their corresponding modes, and the Stable Diffusion (SD) model then generates optical fields from structure information. Our work introduces multi-modal deep learning to construct a complex mapping between structural information and optical fields as high-dimensional vectors, and generates optical-field images based on this mapping.
Title: "Photonic modes prediction via multi-modal diffusion model"
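A CLIP-style pairing of structures and modes reduces to a symmetric contrastive (InfoNCE) loss over matched embedding pairs. In the sketch below the unit-norm random vectors stand in for the outputs of the two encoders (structure encoder and mode-field encoder); the batch size, dimension, and temperature are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(4)
B, D = 5, 16   # batch of 5 structure/mode pairs, 16-dim toy embeddings

# Toy unit-norm embeddings standing in for the two encoders' outputs.
s = rng.normal(size=(B, D))
s /= np.linalg.norm(s, axis=1, keepdims=True)
m = rng.normal(size=(B, D))
m /= np.linalg.norm(m, axis=1, keepdims=True)

logits = (s @ m.T) / 0.07               # cosine similarity / temperature

def xent_diag(lg):
    """Row-wise cross-entropy with the matching pair on the diagonal."""
    lg = lg - lg.max(axis=1, keepdims=True)          # numerical stability
    logp = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
    return -np.diag(logp).mean()

# Symmetric InfoNCE loss, as in CLIP: align structure i with mode i.
loss = 0.5 * (xent_diag(logits) + xent_diag(logits.T))
```

Minimizing this loss pulls each structure embedding toward its own mode embedding and away from the others, which is the mapping the SD generator is then conditioned on.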
Pub Date: 2024-09-05 | DOI: 10.1088/2632-2153/ad7457
Aiden R Rosebush, Alexander C B Greenwood, Brian T Kirby, Li Qian
We propose a support vector machine (SVM) based approach for generating an entanglement witness that requires exponentially less training data than previously proposed methods. SVMs generate hyperplanes represented by a weighted sum of expectation values of local observables whose coefficients are optimized to sum to a positive number for all separable states and a negative number for as many entangled states as possible near a specific target state. Previous SVM-based approaches for entanglement witness generation used large amounts of randomly generated separable states to perform training, a task with considerable computational overhead. Here, we propose a method for orienting the witness hyperplane using only the significantly smaller set of states consisting of the eigenstates of the generalized Pauli matrices and a set of entangled states near the target entangled states. With the orientation of the witness hyperplane set by the SVM, we tune the plane’s placement using a differential program that ensures perfect classification accuracy on a limited test set as well as maximal noise tolerance. For N qubits, the SVM portion of this approach requires only