Ensemble Machine Learning Identification of Power Fault Countermeasure Text Considering Word String TF-IDF Feature

Shaohua Sun, Zemei Dai, Xinkui Xi, Xin Shan, Bo Wang
2018 IEEE International Conference of Safety Produce Informatization (IICSPI), pp. 610-616. Published 2018-12-01.
DOI: 10.1109/IICSPI.2018.8690443 (https://doi.org/10.1109/IICSPI.2018.8690443)
Citations: 0

Abstract

Ensemble Machine Learning Identification of Power Fault Countermeasure Text Considering Word String TF-IDF Feature
A large number of fault countermeasure texts are used to guide fault handling operations in power dispatching systems. This paper recasts the identification of fault countermeasure disposal text as a text classification problem, with the aim of identifying the content of the text, understanding its meaning, and building an intelligent, automated power grid fault handling system. First, the characteristics of fault countermeasure text are analyzed, and the text is preprocessed according to those characteristics and the classification requirements. Term Frequency-Inverse Document Frequency (TF-IDF) is then used to extract and vectorize the features of words and word strings in the text, and the resulting feature vectors are concatenated to vectorize the text. Classification models are built with a variety of machine learning methods, and their classification performance is compared. The examples show that feature extraction that takes word-string TF-IDF into account is superior to word TF-IDF alone, and that the ensemble machine learning classification model classifies better than a single classifier and identifies fault countermeasure disposal text more accurately and efficiently.
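The feature construction summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the text is already segmented into tokens, interprets "word strings" as contiguous word n-grams (bigrams here), uses one common smoothed IDF variant, and stands in a simple majority vote for the ensemble step. All function names and the sample tokens are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return contiguous n-token strings, joined with '_' (assumed 'word strings')."""
    return ["_".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf_matrix(docs_terms):
    """Compute TF-IDF rows for a list of term lists.

    tf  = term count / number of terms in the document
    idf = ln((1 + N) / (1 + df)) + 1   (a smoothed variant, chosen for this sketch)
    """
    n_docs = len(docs_terms)
    vocab = sorted({t for terms in docs_terms for t in terms})
    # Document frequency: number of documents containing each term.
    df = Counter(t for terms in docs_terms for t in set(terms))
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    rows = []
    for terms in docs_terms:
        counts = Counter(terms)
        total = len(terms) or 1  # avoid division by zero for empty documents
        rows.append([counts[t] / total * idf[t] for t in vocab])
    return vocab, rows


def text_features(token_docs, n=2):
    """Concatenate word TF-IDF and word-string (n-gram) TF-IDF vectors per document."""
    word_vocab, word_rows = tfidf_matrix(token_docs)
    str_vocab, str_rows = tfidf_matrix([ngrams(d, n) for d in token_docs])
    features = [w + s for w, s in zip(word_rows, str_rows)]
    return word_vocab + str_vocab, features


def majority_vote(predictions):
    """Combine the labels predicted by several classifiers for one sample."""
    return Counter(predictions).most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical, pre-tokenized countermeasure snippets.
    docs = [
        ["open", "breaker", "A"],
        ["close", "breaker", "A"],
        ["check", "line", "load"],
    ]
    vocab, X = text_features(docs)
    print(len(vocab), "features per document")
    # Labels below are placeholders for the outputs of three trained classifiers.
    print(majority_vote(["isolate", "isolate", "restore"]))
```

The concatenated vectors would then be fed to each base classifier, and the per-classifier labels combined by the vote. A library pipeline (for example an n-gram TF-IDF vectorizer plus a voting classifier) would normally replace these hand-written helpers.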