High-Throughput Extraction of Phase–Property Relationships from Literature Using Natural Language Processing and Large Language Models
Luca Montanelli, Vineeth Venugopal, Elsa A. Olivetti, Marat I. Latypov
Integrating Materials and Manufacturing Innovation, published 2024-03-19. DOI: 10.1007/s40192-024-00344-8 (https://doi.org/10.1007/s40192-024-00344-8)
Abstract
Consolidating published research on aluminum alloys into insights about microstructure–property relationships can simplify alloy design and reduce its cost. For many heat-treatable alloys that derive superior properties from precipitation, phases are a critical design consideration: as key microstructure constituents, they can have a decisive impact on the engineering properties of the alloy. Here, we present a computational framework for high-throughput extraction of phases and their impact on properties from scientific papers. Our framework uses transformer-based and large language models to identify sentences with phase–property information in papers, recognize phase and property entities, and extract phase–property relationships and their “sentiment.” We demonstrate the framework on aluminum alloys, for which we build a database of 7,675 phase–property relationships extracted from a corpus of almost 5,000 full-text papers. We comment on the extracted relationships based on common metallurgical knowledge.
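The three stages named in the abstract (sentence identification, entity recognition, relationship and sentiment extraction) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the paper uses transformer-based and large language models, whereas the sketch below substitutes simple keyword matching so that it runs with no dependencies. The vocabularies, cue words, class name, and function names are all hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical vocabularies for illustration only; in the paper these
# entities are recognized by trained language models, not fixed lists.
PHASES = ["Mg2Si", "Al2Cu", "MgZn2", "Al3Sc"]
PROPERTIES = ["hardness", "strength", "ductility", "corrosion resistance"]
POSITIVE_CUES = ["increase", "improve", "enhance"]
NEGATIVE_CUES = ["decrease", "reduce", "deteriorate", "degrade"]


@dataclass(frozen=True)
class PhasePropertyRelation:
    phase: str
    prop: str
    sentiment: str  # "positive", "negative", or "neutral"


def find_entities(sentence, vocabulary):
    """Stand-in for entity recognition: case-insensitive substring match."""
    lowered = sentence.lower()
    return [term for term in vocabulary if term.lower() in lowered]


def classify_sentiment(sentence):
    """Stand-in for sentiment extraction: look for simple cue words."""
    lowered = sentence.lower()
    if any(cue in lowered for cue in POSITIVE_CUES):
        return "positive"
    if any(cue in lowered for cue in NEGATIVE_CUES):
        return "negative"
    return "neutral"


def extract_relations(text):
    """Keep sentences that mention both a phase and a property,
    then pair every phase with every property in that sentence."""
    relations = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        phases = find_entities(sentence, PHASES)
        props = find_entities(sentence, PROPERTIES)
        if not phases or not props:
            continue  # sentence carries no phase-property information
        sentiment = classify_sentiment(sentence)
        for phase in phases:
            for prop in props:
                relations.append(PhasePropertyRelation(phase, prop, sentiment))
    return relations


if __name__ == "__main__":
    sample = (
        "Mg2Si precipitates increase the hardness of the alloy. "
        "The samples were polished. "
        "Coarse Al2Cu particles reduce ductility."
    )
    for relation in extract_relations(sample):
        print(relation)
```

Keyword matching of this kind would miss paraphrases, negation, and phases named in unfamiliar notation, which is presumably part of the motivation for the learned models the abstract describes.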
About the journal:
The journal will publish: Research that supports building a model-based definition of materials and processes that is compatible with model-based engineering design processes and multidisciplinary design optimization; Descriptions of novel experimental or computational tools or data analysis techniques, and their application, that are to be used for ICME; Best practices in verification and validation of computational tools, sensitivity analysis, uncertainty quantification, and data management, as well as standards and protocols for software integration and exchange of data; In-depth descriptions of data, databases, and database tools; Detailed case studies on efforts, and their impact, that integrate experiment and computation to solve an enduring engineering problem in materials and manufacturing.