Machine Learning Science and Technology: latest articles

Synergizing human expertise and AI efficiency with language model for microscopy operation and automated experiment design
IF 6.8 | Zone 2, Physics and Astrophysics | Q1 Computer Science, Artificial Intelligence | Pub Date: 2024-06-12 | DOI: 10.1088/2632-2153/ad52e9
Yongtao Liu, Marti Checa and Rama K Vasudevan
With the advent of large language models (LLMs), in both the open source and proprietary domains, attention is turning to how to exploit such artificial intelligence (AI) systems in assisting complex scientific tasks, such as material synthesis, characterization, analysis and discovery. Here, we explore the utility of LLMs, particularly ChatGPT4, in combination with application program interfaces (APIs) in tasks of experimental design, programming workflows, and data analysis in scanning probe microscopy, using both in-house developed APIs and APIs given by a commercial vendor for instrument control. We find that the LLM can be especially useful in converting ideations of experimental workflows to executable code on microscope APIs. Beyond code generation, we find that the GPT4 is capable of analyzing microscopy images in a generic sense. At the same time, we find that GPT4 suffers from an inability to extend beyond basic analyses for more in-depth technical experimental design. We argue that an LLM specifically fine-tuned for individual scientific domains can potentially be a better language interface for converting scientific ideations from human experts to executable workflows. Such a synergy between human expertise and LLM efficiency in experimentation can open new doors for accelerating scientific research, enabling effective experimental protocols sharing in the scientific community.
Citations: 0
Investigating the ability of PINNs to solve Burgers’ PDE near finite-time blowup
IF 6.8 | Zone 2, Physics and Astrophysics | Q1 Computer Science, Artificial Intelligence | Pub Date: 2024-06-11 | DOI: 10.1088/2632-2153/ad51cd
Dibyakanti Kumar, Anirbit Mukherjee
Physics Informed Neural Networks (PINNs) have been achieving ever newer feats of solving complicated Partial Differential Equations (PDEs) numerically while offering an attractive trade-off between accuracy and speed of inference. A particularly challenging aspect of PDEs is that there exist simple PDEs which can evolve into singular solutions in finite time starting from smooth initial conditions. In recent times some striking experiments have suggested that PINNs might be good at even detecting such finite-time blow-ups. In this work, we embark on a program to investigate this stability of PINNs from a rigorous theoretical viewpoint. Firstly, we derive error bounds for PINNs for Burgers’ PDE, in arbitrary dimensions, under conditions that allow for a finite-time blow-up. Our bounds give a theoretical justification for the functional regularization terms that have been reported to be useful for training PINNs near finite-time blow-up. Then we demonstrate via experiments that our bounds are significantly correlated to the ℓ2-distance of the neurally found surrogate from the true blow-up solution, when computed on sequences of PDEs that are getting increasingly close to a blow-up.
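To make the notion of a finite-time blow-up concrete, the sketch below evaluates the kind of PDE residual that a PINN loss penalizes. It is only an illustration under stated assumptions: the abstract does not give the exact equation or training setup, so the one-dimensional inviscid form u_t + u·u_x = 0, the test solution u(x, t) = x/(t − 1) (smooth at t = 0, singular as t → 1), and all function names are hypothetical. Derivatives are taken by central finite differences rather than automatic differentiation, and no network is trained.

```python
# Sketch only: the 1D inviscid form u_t + u*u_x = 0 and the test solution
# u(x, t) = x / (t - 1) are illustrative assumptions, not the paper's setup.

def u_exact(x, t):
    """Smooth at t = 0, singular as t -> 1 (a finite-time blow-up)."""
    return x / (t - 1.0)


def burgers_residual(u, x, t, h=1e-5):
    """Central-difference estimate of u_t + u * u_x at the point (x, t)."""
    u_t = (u(x, t + h) - u(x, t - h)) / (2.0 * h)
    u_x = (u(x + h, t) - u(x - h, t)) / (2.0 * h)
    return u_t + u(x, t) * u_x


def mean_squared_residual(u, points):
    """Average squared residual over collocation points, as in a PINN loss term."""
    return sum(burgers_residual(u, x, t) ** 2 for x, t in points) / len(points)


# Collocation points with x in [-1, 1] at times approaching the blow-up at t = 1.
points = [(-1.0 + 0.5 * i, t) for i in range(5) for t in (0.0, 0.5, 0.9)]

print(mean_squared_residual(u_exact, points))            # close to zero
print(mean_squared_residual(lambda x, t: x, points))     # clearly non-zero
```

The first candidate satisfies the equation, so its residual vanishes up to discretization error even though its amplitude grows without bound as t approaches 1; the second candidate, u = x, does not satisfy it and is penalized.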
Citations: 0
Reinforcement learning pulses for transmon qubit entangling gates
IF 6.8 | Zone 2, Physics and Astrophysics | Q1 Computer Science, Artificial Intelligence | Pub Date: 2024-06-11 | DOI: 10.1088/2632-2153/ad4f4d
Ho Nam Nguyen, Felix Motzoi, Mekena Metcalf, K Birgitta Whaley, Marin Bukov and Markus Schmitt
The utility of a quantum computer is highly dependent on the ability to reliably perform accurate quantum logic operations. For finding optimal control solutions, it is of particular interest to explore model-free approaches, since their quality is not constrained by the limited accuracy of theoretical models for the quantum processor—in contrast to many established gate implementation strategies. In this work, we utilize a continuous control reinforcement learning algorithm to design entangling two-qubit gates for superconducting qubits; specifically, our agent constructs cross-resonance and CNOT gates without any prior information about the physical system. Using a simulated environment of fixed-frequency fixed-coupling transmon qubits, we demonstrate the capability to generate novel pulse sequences that outperform the standard cross-resonance gates in both fidelity and gate duration, while maintaining a comparable susceptibility to stochastic unitary noise. We further showcase an augmentation in training and input information that allows our agent to adapt its pulse design abilities to drifting hardware characteristics, importantly, with little to no additional optimization. Our results exhibit clearly the advantages of unbiased adaptive-feedback learning-based optimization methods for transmon gate design.
Citations: 0
A quantum inspired approach to learning dynamical laws from data—block-sparsity and gauge-mediated weight sharing
IF 6.8 | Zone 2, Physics and Astrophysics | Q1 Computer Science, Artificial Intelligence | Pub Date: 2024-06-11 | DOI: 10.1088/2632-2153/ad4f4e
J Fuksa, M Götte, I Roth, J Eisert
Recent years have witnessed an increased interest in recovering dynamical laws of complex systems in a largely data-driven fashion under meaningful hypotheses. In this work, we propose a scalable and numerically robust method for this task, utilizing efficient block-sparse tensor train representations of dynamical laws, inspired by similar approaches in quantum many-body systems. Low-rank tensor train representations have been previously derived for dynamical laws of one-dimensional systems. We extend this result to efficient representations of systems with K-mode interactions and controlled approximations of systems with decaying interactions. We further argue that natural structure assumptions on dynamical laws, such as bounded polynomial degrees, can be exploited in the form of block-sparse support patterns of tensor-train cores. Additional structural similarities between interactions of certain modes can be accounted for by weight sharing within the ansatz. To make use of these structure assumptions, we propose a novel optimization algorithm, block-sparsity restricted alternating least squares with gauge-mediated weight sharing. The algorithm is inspired by similar notions in machine learning and achieves a significant improvement in performance over previous approaches. We demonstrate the performance of the method numerically on three one-dimensional systems—the Fermi–Pasta–Ulam–Tsingou system, rotating magnetic dipoles and point particles interacting via modified Lennard–Jones potentials, observing a highly accurate and noise-robust recovery.
Citations: 0
Symbolic regression as a feature engineering method for machine and deep learning regression tasks
IF 6.8 | Zone 2, Physics and Astrophysics | Q1 Computer Science, Artificial Intelligence | Pub Date: 2024-06-11 | DOI: 10.1088/2632-2153/ad513a
Assaf Shmuel, Oren Glickman and Teddy Lazebnik
In the realm of machine and deep learning (DL) regression tasks, the role of effective feature engineering (FE) is pivotal in enhancing model performance. Traditional approaches of FE often rely on domain expertise to manually design features for machine learning (ML) models. In the context of DL models, the FE is embedded in the neural network’s architecture, making it hard for interpretation. In this study, we propose to integrate symbolic regression (SR) as an FE process before a ML model to improve its performance. We show, through extensive experimentation on synthetic and 21 real-world datasets, that the incorporation of SR-derived features significantly enhances the predictive capabilities of both machine and DL regression models with 34%–86% root mean square error (RMSE) improvement in synthetic datasets and 4%–11.5% improvement in real-world datasets. In an additional realistic use case, we show the proposed method improves the ML performance in predicting superconducting critical temperatures based on Eliashberg theory by more than 20% in terms of RMSE. These results outline the potential of SR as an FE component in data-driven models, improving them in terms of performance and interpretability.
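The idea of running a symbolic search first and handing its output to a downstream regressor as an extra feature can be sketched in a few lines. What follows is a toy stand-in, not the authors' pipeline: a brute-force search over a tiny hypothetical expression set (products and ratios of two inputs) replaces a genuine symbolic regression engine, the downstream model is plain least squares, the data are a synthetic grid with target y = x0·x1, and the RMSE is computed in-sample. All names are illustrative.

```python
import itertools
import math


def solve_least_squares(rows, y):
    """Ordinary least squares via the normal equations and Gauss-Jordan elimination."""
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]


def rmse(rows, y):
    """In-sample root mean square error of a least-squares fit."""
    w = solve_least_squares(rows, y)
    errors = [sum(wi * xi for wi, xi in zip(w, r)) - t for r, t in zip(rows, y)]
    return math.sqrt(sum(e * e for e in errors) / len(y))


def correlation(u, v):
    """Pearson correlation between two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)


# Synthetic data (assumption): a 5 x 5 grid with a purely multiplicative target.
X = [[float(a), float(b)] for a in range(1, 6) for b in range(1, 6)]
y = [a * b for a, b in X]

# Toy "symbolic search": score every candidate expression against the target.
OPS = {"*": lambda a, b: a * b, "/": lambda a, b: a / b}
candidates = {
    f"x{i}{op}x{j}": [fn(r[i], r[j]) for r in X]
    for op, fn in OPS.items()
    for i, j in itertools.permutations(range(2), 2)
}
best_name = max(candidates, key=lambda k: abs(correlation(candidates[k], y)))

# Fit the same linear model with and without the discovered feature.
base_rows = [[1.0] + r for r in X]
aug_rows = [row + [f] for row, f in zip(base_rows, candidates[best_name])]
rmse_base = rmse(base_rows, y)
rmse_aug = rmse(aug_rows, y)
print(best_name, rmse_base, rmse_aug)
```

On this grid the linear baseline cannot represent the interaction term, while the augmented model fits it almost exactly. The gap here is by construction and says nothing about the improvements reported in the abstract.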
Citations: 0
Deep artificial neural network-powered phase field model for predicting damage characteristic in brittle composite under varying configurations
IF 6.8 | Zone 2, Physics and Astrophysics | Q1 Computer Science, Artificial Intelligence | Pub Date: 2024-06-11 | DOI: 10.1088/2632-2153/ad52e8
Hoang-Quan Nguyen, Ba-Anh Le, Bao-Viet Tran, Thai-Son Vu, Thi-Loan Bui
This work introduces a novel artificial neural network (ANN)-powered phase field model, offering rapid and precise predictions of fracture propagation in brittle materials. To improve the capabilities of the ANN model, we incorporate a loop of conditions into its core to regulate the absolute percentage error for each observation point, that filters and consistently selects the most accurate outcome. This algorithm enables our model to better adapt to the highly sensitive validation data arising from varying configurations. The effectiveness of the approach is illustrated through three examples involving changes in the microgeometry and material properties of steel fiber-reinforced high-strength concrete structures. Indeed, the predicted outcomes from the improved ANN phase field model in terms of stress–strain relationship, and crack propagation path demonstrates an outperformance compared with that based on the extreme gradient boosting method, a leading regression machine learning technique for tabular data. Additionally, the introduced model exhibits a remarkable speed advantage, being 180 times faster than traditional phase field simulations, and provides results at nearly any fiber location, demonstrating superiority over the phase field model. This study marks a significant advancement in the application of artificial intelligence for accurately predicting crack propagation paths in composite materials, particularly in cases involving the relative positioning of the fiber and initial crack location.
Citations: 0
The twin peaks of learning neural networks
IF 6.8 | Zone 2, Physics and Astrophysics | Q1 Computer Science, Artificial Intelligence | Pub Date: 2024-06-10 | DOI: 10.1088/2632-2153/ad524d
Elizaveta Demyanenko, Christoph Feinauer, Enrico M Malatesta, Luca Saglietti
Recent works demonstrated the existence of a double-descent phenomenon for the generalization error of neural networks, where highly overparameterized models escape overfitting and achieve good test performance, at odds with the standard bias-variance trade-off described by statistical learning theory. In the present work, we explore a link between this phenomenon and the increase of complexity and sensitivity of the function represented by neural networks. In particular, we study the Boolean mean dimension (BMD), a metric developed in the context of Boolean function analysis. Focusing on a simple teacher-student setting for the random feature model, we derive a theoretical analysis based on the replica method that yields an interpretable expression for the BMD, in the high dimensional regime where the number of data points, the number of features, and the input size grow to infinity. We find that, as the degree of overparameterization of the network is increased, the BMD reaches an evident peak at the interpolation threshold, in correspondence with the generalization error peak, and then slowly approaches a low asymptotic value. The same phenomenology is then traced in numerical experiments with different model classes and training setups. Moreover, we find empirically that adversarially initialized models tend to show higher BMD values, and that models that are more robust to adversarial attacks exhibit a lower BMD.
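For readers unfamiliar with the mean dimension of a Boolean function, the sketch below computes it exactly for small inputs. It assumes the standard definition from Boolean function analysis, namely the squared Fourier weight averaged over interaction order, which equals the total influence divided by the variance; the abstract does not spell out the formula, so treating this as the paper's BMD is an assumption. The function name and the three example functions are illustrative, and exhaustive enumeration is only feasible for a handful of input bits.

```python
import itertools


def mean_dimension(f, n):
    """Exact mean dimension of f: {-1, +1}^n -> R by enumerating all 2^n inputs.

    Uses the identity: sum_S |S| * fhat(S)^2 equals the total influence,
    where the influence of bit i is E[((f(x) - f(x with bit i flipped)) / 2)^2].
    """
    inputs = list(itertools.product((-1, 1), repeat=n))
    values = [f(x) for x in inputs]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    total_influence = 0.0
    for i in range(n):
        for x, v in zip(inputs, values):
            flipped = x[:i] + (-x[i],) + x[i + 1:]
            total_influence += ((v - f(flipped)) / 2.0) ** 2 / len(inputs)
    return total_influence / variance


def dictator(x):
    """Depends on a single bit: the simplest non-constant function."""
    return x[0]


def majority(x):
    """Sign of the sum of an odd number of +/-1 bits."""
    return 1 if sum(x) > 0 else -1


def parity(x):
    """Product of all bits: every bit interacts with every other."""
    result = 1
    for bit in x:
        result *= bit
    return result


for name, fn in (("dictator", dictator), ("majority", majority), ("parity", parity)):
    print(name, mean_dimension(fn, 3))
```

On three bits the dictator has mean dimension 1, majority 1.5, and parity 3, which matches the intuition that a higher value signals stronger reliance on high-order interactions and greater sensitivity to input flips.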
Citations: 0
Decoding characteristics of key physical properties in silver nanoparticles by attaining centroids for cytotoxicity prediction through data cleansing
IF 6.8 | Zone 2, Physics and Astrophysics | Q1 Computer Science, Artificial Intelligence | Pub Date: 2024-06-07 | DOI: 10.1088/2632-2153/ad51cb
Anjana S Desai, Anindita Bandopadhyaya, Aparna Ashok, Maneesha, Neeru Bhagat
This research underscores the profound impact of data cleansing, ensuring dataset integrity and providing a structured foundation for unraveling convoluted connections between diverse physical properties and cytotoxicity. As the scientific community delves deeper into this interplay, it becomes clear that precise data purification is a fundamental aspect of investigating parameters within datasets. The study presents the need for data filtration in the background of machine learning (ML) that has widened its horizon into the field of biological application through the amalgamation of predictive systems and algorithms that delve into the intricate characteristics of cytotoxicity of nanoparticles. The reliability and accuracy of models in the ML landscape hinge on the quality of input data, making data cleansing a critical component of the pre-processing pipeline. The main encounter faced here is the lengthy, broad and complex datasets that have to be toned down for further studies. Through a thorough data cleansing process, this study addresses the complexities arising from diverse sources, resulting in a refined dataset. The filtration process employs K-means clustering to derive centroids, revealing the correlation between the physical properties of nanoparticles, viz, concentration, zeta potential, hydrodynamic diameter, morphology, and absorbance wavelength, and cytotoxicity outcomes measured in terms of cell viability. The cell lines considered for determining the centroid values that predicts the cytotoxicity of silver nanoparticles are human and animal cell lines which were categorized as normal and carcinoma type. The objective of the study is to simplify the high-dimensional data for accurate analysis of the parameters that affect the cytotoxicity of silver NPs through centroids.
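The centroid-derivation step can be illustrated with a minimal K-means (Lloyd's algorithm) sketch. Everything specific here is an assumption made for illustration: the six two-dimensional points are invented, not taken from the study's dataset, the two coordinates merely stand in for a pair of scaled physical properties, and the initial centroids are passed in explicitly so the run is deterministic. The abstract does not say which implementation, distance, or number of clusters the authors used.

```python
import math


def kmeans(points, init, max_iter=100):
    """Plain Lloyd's algorithm: alternate nearest-centroid assignment and mean update."""
    centroids = [tuple(c) for c in init]
    clusters = []
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = [
            tuple(sum(col) / len(members) for col in zip(*members)) if members else c
            for members, c in zip(clusters, centroids)
        ]
        if new_centroids == centroids:
            break  # assignments are stable, so the centroids have converged
        centroids = new_centroids
    return centroids, clusters


# Hypothetical, already-scaled measurements: (property A, property B).
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2),
          (5.0, 5.0), (5.2, 4.8), (4.8, 5.2)]

centroids, clusters = kmeans(points, init=[points[0], points[-1]])
print(centroids)
```

Each returned centroid summarizes one group of similar samples, which is the sense in which a large table of measurements is reduced to a few representative points. With real data the features would normally be standardized first, since Euclidean distance is otherwise dominated by whichever property has the largest numeric range.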
Citations: 0
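The K-means filtration step mentioned in the silver-nanoparticle abstract above can be illustrated with a minimal sketch. This is not the authors' pipeline: the `kmeans` helper, the choice of the first k rows as initial centroids, and the four made-up data rows are all assumptions chosen only to show how centroids summarize groups of measurements.

```python
import math

def kmeans(points, k, iters=50):
    """Plain Lloyd's algorithm; initial centroids are the first k points (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # Update step: each centroid becomes the mean of its assigned points.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(centroids[j])  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return centroids, labels

# Hypothetical, made-up rows (not data from the paper):
# (concentration, zeta potential, hydrodynamic diameter, cell viability %)
rows = [
    (10.0, -30.0, 20.0, 90.0),
    (100.0, -10.0, 80.0, 30.0),
    (12.0, -28.0, 22.0, 88.0),
    (110.0, -12.0, 85.0, 25.0),
]
centroids, labels = kmeans(rows, k=2)
```

In practice the features would likely need to be scaled to comparable ranges before clustering, since Euclidean distance is otherwise dominated by the column with the largest magnitude.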
Machine learning inspired models for Hall effects in non-collinear magnets
IF 6.8 Zone 2 (Physics and Astrophysics) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-06-07 DOI: 10.1088/2632-2153/ad51ca
Jonathan Kipp, Fabian R Lux, Thorben Pürling, Abigail Morrison, Stefan Blügel, Daniele Pinna, Yuriy Mokrousov
The anomalous Hall effect has been front and center in solid state research and materials science for over a century, and the complex transport phenomena in nontrivial magnetic textures have gained increasing attention in both theoretical and experimental studies. However, a clear path to capturing the influence of magnetization dynamics on the anomalous Hall effect, even in the smallest frustrated magnets or spatially extended magnetic textures, is still intensively sought. In this work, we present an expansion of the anomalous Hall tensor into symmetrically invariant objects, encoding the magnetic configuration up to arbitrary power of spin. We show that these symmetric invariants can be used in conjunction with advanced regularization techniques to build models for the electric transport in magnetic textures which are, on the one hand, complete with respect to the point group symmetry of the underlying lattice and, on the other hand, depend on only a minimal number of order parameters. Here, using a four-band tight-binding model on a honeycomb lattice, we demonstrate that the developed method can be used to address the importance and properties of higher-order contributions to transverse transport. The efficiency and breadth enabled by this method provide an ideal systematic approach to tackling the inherent complexity of the response properties of noncollinear magnets, paving the way to the exploration of electric transport in intrinsically frustrated magnets as well as large-scale magnetic textures.
Citations: 0
Reducing training data needs with minimal multilevel machine learning (M3L)
IF 6.8 Zone 2 (Physics and Astrophysics) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-06-06 DOI: 10.1088/2632-2153/ad4ae5
Stefan Heinen, Danish Khan, Guido Falk von Rudorff, Konstantin Karandashev, Daniel Jose Arismendi Arrieta, Alastair J A Price, Surajit Nandi, Arghya Bhowmik, Kersti Hermansson, O Anatole von Lilienfeld
For many machine learning applications in science, data acquisition, not training, is the bottleneck, even when avoiding experiments and relying on computation and simulation. Correspondingly, and in order to reduce cost and carbon footprint, training data efficiency is key. We introduce minimal multilevel machine learning (M3L), which optimizes training data set sizes using a loss function at multiple levels of reference data in order to minimize a combination of prediction error and overall training data acquisition costs (as measured by computational wall-times). Numerical evidence has been obtained for calculated atomization energies and electron affinities of thousands of organic molecules at various levels of theory, including HF, MP2, DLPNO-CCSD(T), DFHFCABS, PNOMP2F12, and PNOCCSD(T)F12, treated with the basis sets TZ, cc-pVTZ, and AVTZ-F12. Our M3L benchmarks for reaching chemical accuracy in distinct chemical compound sub-spaces indicate substantial computational cost reductions by factors of ∼1.01, 1.1, 3.8, 13.8, and 25.8 when compared to heuristic sub-optimal multilevel machine learning (M2L) for the data sets QM7b, QM9$^{\mathrm{LCCSD(T)}}$, Electrolyte Genome Project, QM9$^{\mathrm{CCSD(T)}}_{\mathrm{AE}}$, and QM9$^{\mathrm{CCSD(T)}}_{\mathrm{EA}}$, respectively. Furthermore, we use M2L to investigate the performance of 76 density functionals when used within multilevel learning, building on the following levels drawn from the hierarchy of Jacob's Ladder: LDA, GGA, mGGA, and hybrid functionals. Within M2L and the molecules considered, mGGAs do not provide any noticeable advantage over GGAs. Among the functionals considered and in combination with LDA, the three GGA and hybrid functionals that perform best on average for atomization energies on QM9 using M3L are PW91, KT2, and B97D, and τ-HTH, B3LYP∗(VWN5), and TPSSH, respectively.
Citations: 0