Classified dataset, regression and machine learning modeling for prediction of phase transformation temperatures in steels
Authors: Jinlei Lu, Guanglong Xu, Fuwen Chen, Yuwen Cui
Journal: Calphad-computer Coupling of Phase Diagrams and Thermochemistry, volume 87, Article 102748
Publication date: 2024-09-18
Publication type: Journal Article
DOI: 10.1016/j.calphad.2024.102748
URL: https://www.sciencedirect.com/science/article/pii/S0364591624000907
Journal impact factor: 1.9; JCR: Q4 (Chemistry, Physical)
Citations: 0
Abstract
The prediction of the characteristic Martensite Start (Ms) temperature and Austenitic Nose Tip Temperature (ANTT) in steels is of scientific and technological importance; however, it faces significant challenges due to multiphysical complexity.
In this study, we introduce a structured framework of data classification and hierarchical iteration for predicting Ms and ANTT. The framework was incorporated into two optimization models, improving their accuracy, extrapolation capability, and generalization performance. First, we classified the collected Ms datasets hierarchically according to the alloying elements present in the steels (carbon, austenite stabilizers, and non-austenitization elements) and to data credibility. Regression analyses of Ms against chemical composition were then carried out using phenomenological variables, proceeding from binary to multi-component systems in the spirit of CALPHAD modeling, which is renowned for its robust extrapolation. By iteratively fitting the hierarchically classified datasets, we developed the CALPHAD-guided phenomenological variable (CGPV) Ms regression model. This model achieved R² values of 0.9 for training and 0.87 for testing, surpassing most conventional regression models that do not account for compositional interactions. Furthermore, the CALPHAD-guided machine learning (CGML) model, built on the same classified datasets and hierarchical iterations but without phenomenological variables, achieved R² values of 0.98 and 0.86 for training and testing, respectively. The CGML model was shown not only to reliably filter out problematic data in a dataset but also to reveal a previously unnoticed coupling between carbon and other alloying elements in their effect on Ms. Finally, the CGML method was readily transferred to predict ANTT with high accuracy.
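The idea of fitting hierarchically, from binary to higher-order systems, can be illustrated with a minimal sketch. This is not the authors' CGPV model: the generating function `toy_ms`, its coefficients, the element "X", and the composition ranges are all hypothetical and chosen only for illustration, and the synthetic data are noise-free so that the fit should recover the generating coefficients. The sketch fits a carbon-only level first, then holds those coefficients fixed and fits only the additional terms, including a carbon-X interaction term, on the next level.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination R^2."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

def toy_ms(c, x):
    """Made-up generating function for synthetic data (not a physical model)."""
    return 500.0 - 300.0 * c - 40.0 * x + 20.0 * c * x

rng = np.random.default_rng(0)

# Level 1: "binary" subset (carbon only). Fit Ms = a0 + a1 * C.
c_bin = rng.uniform(0.1, 1.0, 40)
ms_bin = toy_ms(c_bin, 0.0)
A1 = np.column_stack([np.ones_like(c_bin), c_bin])
coef_bin, *_ = np.linalg.lstsq(A1, ms_bin, rcond=None)

# Level 2: "ternary" subset (carbon plus one hypothetical element X).
c_ter = rng.uniform(0.1, 1.0, 60)
x_ter = rng.uniform(0.5, 3.0, 60)
ms_ter = toy_ms(c_ter, x_ter)

# Keep the level-1 coefficients fixed and fit only what they leave unexplained,
# using a linear term in X and a C*X interaction term.
residual = ms_ter - (coef_bin[0] + coef_bin[1] * c_ter)
A2 = np.column_stack([x_ter, c_ter * x_ter])
coef_ter, *_ = np.linalg.lstsq(A2, residual, rcond=None)

def predict(c, x):
    """Combine the fixed level-1 terms with the level-2 terms."""
    return coef_bin[0] + coef_bin[1] * c + coef_ter[0] * x + coef_ter[1] * c * x

# Evaluate on held-out synthetic compositions.
c_test = rng.uniform(0.1, 1.0, 20)
x_test = rng.uniform(0.5, 3.0, 20)
r2_test = r2_score(toy_ms(c_test, x_test), predict(c_test, x_test))
```

Because lower-level coefficients are frozen before higher-level terms are added, predictions for the carbon-only subset are unchanged by later fitting stages, which is one plausible reading of how a hierarchical scheme can support extrapolation. Real data would include noise and many more elements, so R² would fall below the near-perfect value this toy case produces.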
About the journal:
The design of industrial processes requires reliable thermodynamic data. CALPHAD (Computer Coupling of Phase Diagrams and Thermochemistry) aims to promote computational thermodynamics through: the development of models that represent the thermodynamic properties of various phases and permit prediction of the properties of multicomponent systems from those of binary and ternary subsystems; the critical assessment of data and their incorporation into self-consistent databases; the development of software to optimize and derive thermodynamic parameters; and the development and use of databanks for calculations that improve understanding of various industrial and technological processes. This work is disseminated through the CALPHAD journal and its annual conference.