Pub Date: 2024-07-05, DOI: 10.1088/2632-2153/ad5fde
Mert Onur Çakıroğlu, H. Kurban, Parichit Sharma, Oguzhan Kulekci, Elham Khorasani Buxton, Maryam Raeeszadeh-Sarmazdeh, Mehmet Dalkilic
In this study, we introduce a novel de Bruijn graph (dBG) based framework for feature engineering in biological sequential data such as proteins. This framework simplifies feature extraction by dynamically generating high-quality, interpretable features for traditional AI (TAI) algorithms. Our framework accounts for amino acid substitutions by efficiently adjusting the edge weights in the dBG using a secondary trie structure. We extract motifs from the dBG by traversing the heavy edges, and then incorporate alignment algorithms like BLAST and Smith-Waterman to generate features for TAI algorithms. Empirical validation on TIMP (tissue inhibitors of matrix metalloproteinase) data demonstrates significant accuracy improvements over a robust baseline, state-of-the-art (SOTA) PLMs, and the popular GLAM2 tool. Furthermore, our framework successfully identified glycine- and arginine-rich (GAR) motifs with high coverage, highlighting its potential in general pattern discovery. The software code is accessible at: https://github.com/parichit/TIMP_Classification
{"title":"An extended de Bruijn graph for feature engineering over biological sequential data","journal":"Machine Learning: Science and Technology","DOIUrl":"https://doi.org/10.1088/2632-2153/ad5fde"}
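The motif-extraction idea in the abstract above (weight the dBG edges, then walk the heavy ones) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the substitution-aware trie is omitted, and the function names `build_dbg` and `heavy_path`, the `min_weight` threshold, and the toy sequences are all assumptions made for illustration.

```python
from collections import defaultdict


def build_dbg(sequences, k):
    """Build a weighted de Bruijn graph: nodes are (k-1)-mers, and each
    edge weight counts how often the corresponding k-mer occurs."""
    graph = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            graph[kmer[:-1]][kmer[1:]] += 1
    return graph


def heavy_path(graph, start, min_weight=2):
    """Greedily follow the heaviest outgoing edge until no edge reaches
    min_weight or a node repeats; return the spelled-out motif."""
    motif, node, seen = start, start, {start}
    while True:
        successors = graph.get(node)
        if not successors:
            break
        nxt, weight = max(successors.items(), key=lambda item: item[1])
        if weight < min_weight or nxt in seen:
            break
        motif += nxt[-1]
        seen.add(nxt)
        node = nxt
    return motif


# Toy sequences (made up for illustration) that share the fragment "GRGG".
toy = ["MKGRGGAL", "TTGRGGSW", "AGRGGQ"]
g = build_dbg(toy, k=3)
print(heavy_path(g, "GR"))  # prints GRGG
```

In a pipeline like the one described, motifs found this way would then be aligned against each sequence (for example with Smith-Waterman) and the alignment scores used as features; that step is not shown here.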
Pub Date: 2024-06-07, DOI: 10.1088/2632-2153/ad55a5
Rui Liu, Lisheng Wei, Pinggai Zhang
The Particle Swarm Optimization (PSO) algorithm is easy to implement owing to its simple framework and has been successfully applied to many optimization problems. However, the standard PSO easily falls into local optima and has weak search ability. To enhance the optimization ability of the algorithm, this paper proposes an adaptive particle swarm optimization with information interaction mechanism (APSOIIM). First, a chaotic sequence strategy is used at the initialization stage to produce uniformly distributed particles and speed up their convergence. Then, an information interaction mechanism is introduced to enhance population diversity as the search progresses; it exchanges the best information of neighboring particles to maintain the balance between exploration and exploitation. In addition, convergence is proven to verify the robustness and efficiency of the proposed APSOIIM algorithm. Finally, APSOIIM is applied to the CEC2014 and CEC2017 benchmark functions as well as well-known engineering optimization problems. The experimental results show that the proposed APSOIIM has significant advantages over the compared algorithms.
{"title":"An Adaptive Particle Swarm Optimization with Information Interaction Mechanism","journal":"Machine Learning: Science and Technology","DOIUrl":"https://doi.org/10.1088/2632-2153/ad55a5"}
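To make the chaotic-initialization step in the abstract above concrete, here is a minimal Python sketch of a plain global-best PSO whose swarm is seeded from a chaotic sequence. It is not APSOIIM: the abstract does not say which chaotic map is used, so the logistic map is an assumption, the information interaction mechanism is not reproduced, and all names and parameter values (`logistic_map_points`, `w`, `c1`, `c2`, and so on) are illustrative choices.

```python
import random


def logistic_map_points(n, dim, lo, hi, x0=0.7, r=4.0):
    """Generate n points in [lo, hi]^dim from the logistic map
    x <- r * x * (1 - x), an assumed stand-in for the chaotic sequence."""
    x, points = x0, []
    for _ in range(n):
        point = []
        for _ in range(dim):
            x = r * x * (1.0 - x)
            point.append(lo + (hi - lo) * x)
        points.append(point)
    return points


def sphere(p):
    """Simple test objective with its minimum of 0 at the origin."""
    return sum(v * v for v in p)


def pso(f, dim=2, n=20, iters=200, lo=-5.0, hi=5.0,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Standard global-best PSO with chaotic initial positions."""
    rng = random.Random(seed)
    pos = logistic_map_points(n, dim, lo, hi)
    vel = [[0.0] * dim for _ in range(n)]
    pbest = [p[:] for p in pos]
    pbest_val = [f(p) for p in pos]
    g = min(range(n), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(iters):
        for i in range(n):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the new position to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


best, best_val = pso(sphere)
print(best, best_val)
```

A neighborhood-based variant would replace `gbest` in the velocity update with the best position among each particle's neighbors, which is one common way to trade exploitation for diversity.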
Pub Date: 2024-06-07, DOI: 10.1088/2632-2153/ad55a4
Kritesh Kumar Gupta, Subrata Barman, S. Dey, T. Mukhopadhyay
Design of high entropy alloys (HEA) presents a significant challenge due to the large compositional space and composition-specific variation in their functional behavior. The traditional alloy design would include trial-and-error prototyping and high-throughput experimentation, which again is challenging due to large-scale fabrication and experimentation. To address these challenges, this article presents a computational strategy for HEA design based on the seamless integration of quasi-random sampling, molecular dynamics (MD) simulations and machine learning (ML). A limited number of algorithmically chosen molecular-level simulations are performed to create a Gaussian process-based computational mapping between the varying concentrations of constituent elements of the HEA and effective properties like Young’s modulus and density. The computationally efficient ML models are subsequently exploited for large-scale predictions and multi-objective functionality attainment with non-aligned goals. The study reveals that there exists a strong negative correlation between Al concentration and the desired effective properties of AlCoCrFeNi HEA, whereas the Ni concentration exhibits a strong positive correlation. The deformation mechanism further shows that excessive increase of Al concentration leads to a higher percentage of FCC to BCC phase transformation which is found to be relatively lower in the HEA with reduced Al concentration. Such physical insights during the deformation process would be crucial in the alloy design process along with the data-driven predictions. As an integral part of this investigation, the developed ML models are interpreted based on Shapley Additive exPlanations, which are essential to explain and understand the model’s mechanism along with meaningful deployment. 
The data-driven strategy presented here will lead to devising an efficient explainable machine learning-based bottom-up approach to alloy design for multi-objective non-aligned functionality attainment.
{"title":"Explainable machine learning assisted molecular-level insights for enhanced specific stiffness exploiting the large compositional space of AlCoCrFeNi high entropy alloys","journal":"Machine Learning: Science and Technology","DOIUrl":"https://doi.org/10.1088/2632-2153/ad55a4"}
Pub Date: 2024-06-04, DOI: 10.1088/2632-2153/ad5414
M. Sharifi Ghazijahani, C. Cierpka
This study aims to predict the turbulent flow behind cylinder arrays using Echo State Networks (ESN). Three different arrangements of arrays of seven cylinders are chosen for the current study. These represent different flow regimes: single bluff body flow, transient flow, and co-shedding flow. This allows the investigation of turbulent flows that fundamentally originate from wake flows yet exhibit highly diverse dynamics. The data is reduced by Proper Orthogonal Decomposition (POD), which is optimal in terms of kinetic energy. The Time Coefficients of the POD Modes (TCPM) are predicted by the ESN. The network architecture is optimized with respect to its three main hyperparameters, Input Scaling (INS), Spectral Radius (SR), and Leaking Rate (LR), in order to produce the best predictions in terms of Weighted Prediction Score (WPS), a metric that balances statistical and deterministic prediction quality. In general, the ESN is capable of imitating the complex dynamics of turbulent flows even for longer periods of several vortex shedding cycles. Furthermore, the mutual interdependencies of the TCPM are well preserved. However, optimal hyperparameters depend strongly on the flow characteristics. Generally, as flow dynamics become faster and more intermittent, larger LR and INS values result in better predictions, whereas less clear trends for SR are observable.
{"title":"On the prediction of the turbulent flow behind cylinder arrays via Echo State Networks","journal":"Machine Learning: Science and Technology","DOIUrl":"https://doi.org/10.1088/2632-2153/ad5414"}
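The three hyperparameters named in the abstract above each enter the standard leaky-integrator ESN state update in a specific place, which the following minimal Python sketch tries to show. It is not the authors' code: the readout training and the POD step are omitted, the reservoir is rescaled by its infinity norm (which only bounds the true spectral radius from above, a conservative stand-in chosen to keep the example dependency-free), and the names `make_reservoir` and `esn_step` are assumptions.

```python
import math
import random


def make_reservoir(n, spectral_radius, seed=0):
    """Random n x n matrix rescaled so its infinity norm equals
    spectral_radius; this bounds the true spectral radius from above."""
    rng = random.Random(seed)
    w = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    norm = max(sum(abs(v) for v in row) for row in w)
    return [[spectral_radius * v / norm for v in row] for row in w]


def esn_step(state, u, w, w_in, input_scaling, leaking_rate):
    """One leaky-integrator update:
    x <- (1 - LR) * x + LR * tanh(INS * W_in u + W x)."""
    new_state = []
    for i, row in enumerate(w):
        pre = sum(wij * xj for wij, xj in zip(row, state))
        pre += input_scaling * sum(a * b for a, b in zip(w_in[i], u))
        new_state.append((1.0 - leaking_rate) * state[i]
                         + leaking_rate * math.tanh(pre))
    return new_state


# Drive a 5-unit reservoir with a sine wave as a stand-in input signal.
reservoir = make_reservoir(5, spectral_radius=0.9)
input_weights = [[0.5] for _ in range(5)]
x = [0.0] * 5
for t in range(50):
    x = esn_step(x, [math.sin(0.1 * t)], reservoir, input_weights,
                 input_scaling=1.0, leaking_rate=0.3)
print(x)
```

A larger leaking rate lets the state follow the new activation more closely at each step, which is consistent with the abstract's observation that faster, more intermittent dynamics favor larger LR values.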
Pub Date: 2024-06-04, DOI: 10.1088/2632-2153/ad5412
Kyungmin Lee, Hyungjun Jeon, Dongkyu Lee, Bongsang Kim, Jeongho Bang, Taehyun Kim
Variational quantum machine learning (VQML) models based on parameterized quantum circuits (PQC) are expected to offer a potential quantum advantage for machine learning applications. However, comparing VQML models with their classical counterparts is hard because VQML models lack interpretability. In this study, we introduce a graphical approach to analyzing the PQC and the corresponding operation of VQML models to address this problem. In particular, we utilize the Stokes representation of quantum states to treat VQML models as network models based on the corresponding representations of basic gates. From this approach, we propose the notion of active paths in the networks and relate the expressivity of VQML models to it. We investigate the growth of active paths in VQML models and observe that the expressivity of VQML models can be significantly limited in certain cases. We then construct classical models inspired by our graphical interpretation of VQML models and show that they can emulate or outperform the outputs of VQML models in these cases. Our result provides a new way to interpret the operation of VQML models and facilitates the interconnection between quantum and classical machine learning.
{"title":"Interpreting Variational Quantum Models with Active Paths in Parameterized Quantum Circuits","journal":"Machine Learning: Science and Technology","DOIUrl":"https://doi.org/10.1088/2632-2153/ad5412"}
Pub Date: 2024-06-04, DOI: 10.1088/2632-2153/ad5413
F. Zipoli, Zeineb Ayadi, P. Schwaller, Teodoro Laino, A. Vaucher
Inferring missing molecules in chemical equations is an important task in chemistry and drug discovery. In fact, the completion of chemical equations with necessary reagents is important for improving existing datasets by detecting missing compounds, making them compatible with deep learning models that require complete information about reactants, products, and reagents in a chemical equation for increased performance. Here, we present a deep learning model to predict missing molecules using a multi-task approach, which can ultimately be viewed as a generalization of the forward reaction prediction and retrosynthesis models, since both can be expressed in terms of incomplete chemical equations. We illustrate that a single trained model, based on the transformer architecture and acting on reaction SMILES strings, can address the prediction of products (forward), precursors (retro) or any other molecule in arbitrary positions such as solvents, catalysts or reagents (completion). Our aim is to assess whether a unified model trained simultaneously on different tasks can effectively leverage diverse knowledge from various prediction tasks within the chemical domain, compared to models trained individually on each application. The multi-task models demonstrate top-1 performance of 72.4%, 16.1%, and 30.5% for the forward, retro, and completion tasks, respectively. For the same model we computed a round-trip accuracy of 83.4%. The completion task exhibits improvements due to the multi-task approach.
{"title":"Completion of Partial Chemical Equations","journal":"Machine Learning: Science and Technology","DOIUrl":"https://doi.org/10.1088/2632-2153/ad5413"}
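The statement in the abstract above that forward, retro, and completion tasks can all be expressed as incomplete chemical equations can be illustrated with a small Python sketch that hides one molecule at a time from a reaction SMILES string of the form `reactants>reagents>products`. This is only a sketch of the data framing, not the authors' preprocessing: the `?` placeholder, the function name `make_completion_examples`, and the example reaction are assumptions made for illustration.

```python
def make_completion_examples(rxn_smiles, mask="?"):
    """Return (masked_equation, missing_molecule) pairs by hiding one
    molecule at a time from a 'reactants>reagents>products' string."""
    parts = [side.split(".") if side else [] for side in rxn_smiles.split(">")]
    examples = []
    for s, side in enumerate(parts):
        for m, molecule in enumerate(side):
            masked = [list(p) for p in parts]
            masked[s][m] = mask
            text = ">".join(".".join(p) for p in masked)
            examples.append((text, molecule))
    return examples


# Acetic acid + ethanol, with sulfuric acid as reagent, giving ethyl acetate + water.
rxn = "CC(=O)O.OCC>OS(=O)(=O)O>CC(=O)OCC.O"
for masked_equation, target in make_completion_examples(rxn):
    print(masked_equation, "->", target)
```

Hiding a product gives a forward-prediction-style example and hiding a reactant gives a retro-style one, while hiding the reagent gives a completion example; a sequence-to-sequence model would be trained to emit the target given the masked equation.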
Pub Date: 2024-05-24, DOI: 10.1088/2632-2153/ad5074
Thomas Penfold, Luke Watson, Clelia Middleton, Tudur David, Sneha Verma, Thomas Pope, Julia Kaczmarek, Conor Douglas Rankine
Computational spectroscopy has emerged as a critical tool for researchers looking to achieve both qualitative and quantitative interpretations of experimental spectra. Over the past decade, increased interactions between experiment and theory have created a positive feedback loop that has stimulated developments in both domains. In particular, the increased accuracy of calculations has led to them becoming an indispensable tool for the analysis of spectroscopies across the electromagnetic spectrum. This progress is especially well demonstrated for short-wavelength techniques, e.g. core-hole (X-ray) spectroscopies, whose prevalence has increased following the advent of modern X-ray facilities including third-generation synchrotrons and X-ray free-electron lasers (XFELs). While calculations based on well-established wavefunction or density-functional methods continue to dominate the greater part of spectral analyses in the literature, emerging developments in machine-learning algorithms are beginning to open up new opportunities to complement these traditional techniques with fast, accurate, and affordable 'black-box' approaches. This Topical Review recounts recent progress in data-driven/machine-learning approaches for computational X-ray spectroscopy. We discuss the achievements and limitations of the presently-available approaches and review the potential that these techniques have to expand the scope and reach of computational and experimental X-ray spectroscopic studies.
{"title":"Machine-Learning Strategies for the Accurate and Efficient Analysis of X-ray Spectroscopy","journal":"Machine Learning: Science and Technology","DOIUrl":"https://doi.org/10.1088/2632-2153/ad5074"}
Pub Date: 2024-05-20, DOI: 10.1088/2632-2153/ad4e05
Cameron J LaMack, Eric M. Schearer
This paper explores the use of Gaussian Process Regression (GPR) for system identification in control engineering. It introduces two novel approaches that utilize the data from a measured global system error. The paper demonstrates these approaches by identifying a simulated system with three subsystems, a one degree of freedom mass with two antagonist muscles. The first approach uses this whole-system error data alone, achieving accuracy on the same order of magnitude as subsystem-specific data (9.28 ± 0.87 N vs. 6.96 ± 0.32 N of total model errors). This is significant, as it shows that the same data set can be used to identify unique subsystems, as opposed to requiring a set of data descriptive of only a single subsystem. The second approach demonstrated in this paper mixes traditional subsystem-specific data with the whole system error data, achieving up to 98.71% model improvement.
{"title":"Global System Errors to Simultaneously Improve the Identification of Subsystems with Mixed Data Gaussian Process Regression","journal":"Machine Learning: Science and Technology","DOIUrl":"https://doi.org/10.1088/2632-2153/ad4e05"}
Pub Date: 2024-05-20, DOI: 10.1088/2632-2153/ad4e06
Agnese Marcato, E. Guiltinan, Hari S. Viswanathan, Dan O’Malley, Nicholas Lubbers, Javier E. Santos
Reconstructing complex, high-dimensional global fields from limited data points is a challenge across various scientific and industrial domains. This is particularly important for recovering spatio-temporal fields using sensor data from, for example, laboratory-based scientific experiments, weather forecasting, or drone surveys. Given the prohibitive costs of specialized sensors and the inaccessibility of certain regions of the domain, achieving full field coverage is typically not feasible. Therefore, the development of machine learning algorithms trained to reconstruct fields given a limited dataset is of critical importance. In this study, we introduce a general approach that employs moving sensors to enhance data exploitation during the training of an attention-based neural network, thereby improving field reconstruction. The training of sensor locations is accomplished using an end-to-end workflow that ensures differentiability in the interpolation of field values associated with the sensors, and is simple to implement using differentiable programming. Additionally, we have incorporated a correction mechanism to prevent sensors from entering invalid regions within the domain. We evaluated our method using two distinct datasets; the results show that our approach enhances learning, as evidenced by improved test scores.
{"title":"Journey over Destination: Dynamic Sensor Placement Enhances Generalization","authors":"Agnese Marcato, E. Guiltinan, Hari S. Viswanathan, Dan O’Malley, Nicholas Lubbers, Javier E. Santos","doi":"10.1088/2632-2153/ad4e06","DOIUrl":"https://doi.org/10.1088/2632-2153/ad4e06","journal":{"name":"Machine Learning: Science and Technology","volume":"79 7","pages":""},"publicationDate":"2024-05-20","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"141121371"}
Pub Date : 2024-05-17 DOI: 10.1088/2632-2153/ad4d3e
Max Klein, Niklas Dormagen, Christopher Dietz, Markus Thoma, Mike Schwarz
Under different plasma conditions and electric fields in a complex plasma, the plasma particles organize themselves in a string-like or chain-like manner. A phase transition from a string-like to an isotropic particle distribution is observed at different electrical conditions. The streaming of charged ions around plasma particles with the surrounding electric field gives the plasma its electrorheological properties. The visibility of individual particles in a complex plasma opens up the opportunity to examine properties and phase transitions of such electrorheological fluids in detail. Because of the limited one-dimensional symmetry, determining the configuration of a particle and recognizing strings in particle distributions is not always straightforward. Several approaches have already been used to analyse particle clouds while either considering each particle locally or considering the particle cloud as a whole without providing information about single particle configurations. This paper presents a new machine learning approach that takes advantage of particle distributions over the entire particle cloud and detects all string-like particles at once, using a convolutional neural network in the form of an encoder-decoder network with asymmetric kernel convolutions. This not only enhances the result quality but also accelerates the evaluation process, possibly enabling real-time analyses of electrorheological phase transitions, while achieving an accuracy of over 95% on manually labelled data.
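To illustrate why an asymmetric (non-square) kernel suits string-like structures, here is a small sketch of a single zero-padded cross-correlation, the basic operation inside a convolutional layer. It is not the paper's encoder-decoder network: the kernel sizes, the toy occupancy grid, and the helper name `correlate2d_same` are assumptions chosen for illustration. A 1x3 kernel sums neighbours along one axis only, so it responds more strongly to a particle embedded in a chain along that axis than to an isolated particle, whereas a 3x1 kernel does not separate the two cases here.

```python
def correlate2d_same(image, kernel):
    """Zero-padded 'same' cross-correlation of a 2D list with a kh x kw kernel.

    kh and kw are assumed odd; they need not be equal, which is what makes
    the kernel asymmetric.
    """
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for a in range(kh):
                for b in range(kw):
                    ii, jj = i + a - ph, j + b - pw
                    if 0 <= ii < h and 0 <= jj < w:  # outside the grid counts as zero
                        acc += image[ii][jj] * kernel[a][b]
            out[i][j] = acc
    return out


# Toy occupancy grid: a three-particle chain in row 1 and an isolated particle in row 3.
grid = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0],
]
wide = [[1.0, 1.0, 1.0]]      # 1x3 kernel: looks along the chain direction
tall = [[1.0], [1.0], [1.0]]  # 3x1 kernel: looks across the chain direction

resp_wide = correlate2d_same(grid, wide)
resp_tall = correlate2d_same(grid, tall)
# Chain centre (1, 2): wide response 3.0, tall response 1.0.
# Isolated particle (3, 3): both responses 1.0.
```

In a trained network the kernel weights are learned rather than fixed to ones, and many such layers are stacked, but the anisotropic receptive field shown here is the property an asymmetric kernel contributes.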
{"title":"Enhancing Particle String Detection in Electrorheological Plasmas Using Asymmetrical Kernel Convolutional Networks","authors":"Max Klein, Niklas Dormagen, Christopher Dietz, Markus Thoma, Mike Schwarz","doi":"10.1088/2632-2153/ad4d3e","DOIUrl":"https://doi.org/10.1088/2632-2153/ad4d3e","journal":{"name":"Machine Learning: Science and Technology","volume":"2 2","pages":""},"publicationDate":"2024-05-17","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"140963636"}