Predicting possible recommendations related to causes and consequences in the HAZOP study worksheet using natural language processing and machine learning: BERT, clustering, and classification
Ali Ekramipooya , Mehrdad Boroushaki , Davood Rashtchian
Journal of Loss Prevention in the Process Industries, published 2024-04-01. DOI: 10.1016/j.jlp.2024.105310
https://www.sciencedirect.com/science/article/pii/S0950423024000688
Abstract
A set of recommendations is one of the most valuable outputs of a hazard and operability (HAZOP) study. The HAZOP study team issues recommendations when it detects deficiencies in a chemical process plant, since these deficiencies can cause process accidents and operability problems. This study takes a data-driven approach, using natural language processing (NLP) and machine learning (ML) to predict potential recommendations from causes and consequences. Because the dataset was unlabeled, clustering was used to label it. First, bidirectional encoder representations from transformers (BERT) converted the recommendation sentences into vectors. Second, uniform manifold approximation and projection (UMAP) and hierarchical density-based spatial clustering of applications with noise (HDBSCAN) were used to determine recommendation categories and label the dataset. BERT was then used to convert the causes and consequences into vectors. Finally, a multi-layer perceptron (MLP) classifier was used to predict possible recommendations from causes and from consequences. Class imbalance was handled by random over-sampling. The prediction accuracy was 93.7% when predicting from causes and 89.5% when predicting from consequences. Predicting potential recommendations from causes and consequences helps ensure that major recommendations are not overlooked during the HAZOP study, and it can further expand NLP and ML applications in HAZOP study automation.
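The abstract names random over-sampling as the remedy for class imbalance but gives no implementation details. The following is a minimal sketch of the general technique, not the authors' code: the function name `random_oversample`, the toy 2-D vectors standing in for BERT embeddings, and the category names are all hypothetical. Minority-class samples are duplicated at random until every class is as large as the largest one.

```python
import random
from collections import Counter, defaultdict


def random_oversample(features, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until every
    class has as many samples as the largest class."""
    rng = random.Random(seed)

    # Group the feature vectors by their class label.
    by_class = defaultdict(list)
    for x, y in zip(features, labels):
        by_class[y].append(x)

    # Every class is grown to the size of the largest class.
    target = max(len(items) for items in by_class.values())

    out_x, out_y = [], []
    for y, items in by_class.items():
        # Draw the missing samples with replacement from the same class.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for x in items + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


# Hypothetical toy data: 2-D vectors in place of real sentence embeddings,
# with made-up recommendation categories as cluster labels.
X = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.9, 0.8], [0.5, 0.5]]
y = ["inspect", "inspect", "inspect", "add_alarm", "revise_procedure"]

X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
```

In common practice, over-sampling is applied only to the training split, so that duplicated samples do not leak into the evaluation data. Whether the study followed this practice is not stated in the abstract.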
About the journal:
The broad scope of the journal is process safety. Process safety is defined as the prevention and mitigation of process-related injuries and damage arising from process incidents involving fire, explosion and toxic release. Such undesired events occur in the process industries during the use, storage, manufacture, handling, and transportation of highly hazardous chemicals.