Shaohua Sun, Zemei Dai, Xinkui Xi, Xin Shan, Bo Wang
2018 IEEE International Conference of Safety Produce Informatization (IICSPI), pp. 610-616. Published 2018-12-01. DOI: 10.1109/IICSPI.2018.8690443 (https://doi.org/10.1109/IICSPI.2018.8690443)
Ensemble Machine Learning Identification of Power Fault Countermeasure Text Considering Word String TF-IDF Feature
Power dispatching systems rely on a large body of fault countermeasure texts to guide fault handling operations. This paper recasts the identification of fault countermeasure disposal text as a text classification problem, with the aim of identifying the content of the text, understanding its meaning, and building an intelligent, automated power grid fault handling system. First, the characteristics of fault countermeasure text are analyzed, and the text is preprocessed according to those characteristics and the classification requirements. Term Frequency-Inverse Document Frequency (TF-IDF) is then used to extract and vectorize the features of words and word strings in the text, and the resulting feature vectors are concatenated to represent each text. Classification models are built with a variety of machine learning methods, and their classification performance is compared. Examples show that feature extraction that takes the TF-IDF of word strings into account outperforms word-level TF-IDF alone. The examples also show that the ensemble machine learning classification model classifies better than a single classifier and identifies fault countermeasure disposal text more accurately and efficiently.
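The feature construction described in the abstract (word TF-IDF concatenated with word-string TF-IDF, followed by an ensemble decision) can be illustrated with a minimal Python sketch. Everything specific here is an assumption rather than the authors' method: the abstract does not give the TF-IDF formula, the word-string length, the tokenizer, or the ensemble rule. The sketch treats a word string as a run of `n` consecutive tokens, uses one common smoothed IDF variant, and combines classifier outputs by simple majority vote. The toy token lists and labels are invented for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Word strings: every run of n consecutive tokens, joined with a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def fit_idf(docs_terms):
    """Smoothed IDF per term: ln((1 + N) / (1 + df)) + 1 (an assumed variant)."""
    n_docs = len(docs_terms)
    df = Counter(term for terms in docs_terms for term in set(terms))
    return {t: math.log((1 + n_docs) / (1 + d)) + 1.0 for t, d in df.items()}


def tfidf_vector(terms, idf, vocab):
    """TF-IDF values in a fixed vocabulary order; terms outside vocab are ignored."""
    counts = Counter(terms)
    total = len(terms) or 1  # avoid division by zero for empty documents
    return [counts[t] / total * idf[t] for t in vocab]


def build_features(token_docs, n=2):
    """Concatenate the word TF-IDF vector and the word-string TF-IDF vector."""
    string_docs = [ngrams(doc, n) for doc in token_docs]
    word_idf, string_idf = fit_idf(token_docs), fit_idf(string_docs)
    word_vocab, string_vocab = sorted(word_idf), sorted(string_idf)
    vectors = [
        tfidf_vector(words, word_idf, word_vocab)
        + tfidf_vector(strings, string_idf, string_vocab)
        for words, strings in zip(token_docs, string_docs)
    ]
    return vectors, word_vocab, string_vocab


def majority_vote(predictions):
    """Ensemble decision: the label predicted by the most base classifiers."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical, already-tokenized countermeasure texts (not from the paper).
docs = [
    ["open", "breaker", "A"],
    ["close", "breaker", "A"],
    ["reduce", "generator", "output"],
]
vectors, word_vocab, string_vocab = build_features(docs, n=2)
```

Each row of `vectors` has one entry per word followed by one entry per word string, so a classifier trained on it sees both kinds of feature at once. In practice each base classifier would be trained on these vectors and `majority_vote` would be applied to their per-text predictions, for example `majority_vote(["isolate", "restore", "isolate"])`.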