J. J. Mesa-Jiménez, L. Stokes, QingPing Yang, V. Livina
Journal of Civil Engineering and Management, published 2022-05-12. DOI: 10.3846/jcem.2022.16012 (https://doi.org/10.3846/jcem.2022.16012)
MACHINE LEARNING FOR TEXT CLASSIFICATION IN BUILDING MANAGEMENT SYSTEMS
In building management systems (BMS), a medium-sized building may have between 200 and 1000 sensor points. Their labels need to be translated into a naming standard so that the BMS platform can recognise them automatically. In current industrial practice, these points are often translated into labels manually (the tagging process), which takes around 8 hours per 100 points. We introduce an AI-based multi-stage text classification method that translates BMS points into formatted BMS labels. After comparing five text classification techniques (logistic regression, random forests, XGBoost, multinomial Naive Bayes and linear support vector classification), we demonstrate that XGBoost is the top performer with 90.29% true positives, and we use the prediction confidence to filter out false positives. This approach can be applied to sensor networks in various applications where manual pre-processing of free-text data remains cumbersome.
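The idea of classifying free-text point names and then using the prediction confidence to hold back doubtful labels can be illustrated with a minimal sketch. It is not the authors' pipeline: it uses a small hand-written multinomial Naive Bayes (one of the five techniques compared, not the top-performing XGBoost), character trigrams as features, a confidence threshold of 0.8, and invented point names and target labels. All of these are assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=3):
    """Lower-case a raw point name and split it into overlapping character n-grams."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class TinyMultinomialNB:
    """Multinomial Naive Bayes over character n-grams with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)           # documents per class
        self.feature_counts = defaultdict(Counter)    # n-gram counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            grams = char_ngrams(text)
            self.feature_counts[label].update(grams)
            self.vocab.update(grams)
        return self

    def predict_proba(self, text):
        """Return a dict mapping each class to its posterior probability."""
        grams = [g for g in char_ngrams(text) if g in self.vocab]  # skip unseen n-grams
        total_docs = sum(self.class_counts.values())
        log_scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)     # log prior
            for g in grams:
                score += math.log((counts[g] + self.alpha) / denom)
            log_scores[label] = score
        # Normalise the log scores into probabilities (numerically stable softmax).
        top = max(log_scores.values())
        exp_scores = {k: math.exp(v - top) for k, v in log_scores.items()}
        z = sum(exp_scores.values())
        return {k: v / z for k, v in exp_scores.items()}


def tag_point(model, raw_name, threshold=0.8):
    """Return (label, confidence); label is None when confidence is below threshold."""
    proba = model.predict_proba(raw_name)
    best = max(proba, key=proba.get)
    confidence = proba[best]
    if confidence >= threshold:
        return best, confidence
    return None, confidence  # leave this point for manual tagging


# Hypothetical training data: raw point names and standardised labels.
TRAINING = [
    ("AHU1 Supply Air Temp", "supply_air_temperature"),
    ("AHU2 SA Temp", "supply_air_temperature"),
    ("Rm 101 CO2 ppm", "zone_co2"),
    ("Room 204 CO2 Level", "zone_co2"),
    ("Chiller 1 kW", "electrical_power"),
    ("Main Incomer Power kW", "electrical_power"),
]

texts, labels = zip(*TRAINING)
model = TinyMultinomialNB().fit(texts, labels)

for name in ["AHU3 Supply Air Temp", "Spare Input 7"]:
    label, confidence = tag_point(model, name)
    print(f"{name!r}: label={label}, confidence={confidence:.2f}")
```

In this sketch a point name that closely resembles the training examples is tagged automatically, while a name with little overlap falls below the threshold and is returned with no label, so it can be passed to a person instead of being mislabelled. Raising the threshold trades fewer false positives for more manual work.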
About the journal:
The Journal of Civil Engineering and Management is a peer-reviewed journal that provides an international forum for the dissemination of the latest original research, achievements and developments. We publish for researchers, designers, users and manufacturers in the different fields of civil engineering and management.
The journal publishes original articles that present new information and reviews. Our objective is to provide essential information and new ideas to help improve civil engineering competency, efficiency and productivity in world markets.
The Journal of Civil Engineering and Management publishes articles in the following fields:
building materials and structures,
structural mechanics and physics,
geotechnical engineering,
road and bridge engineering,
urban engineering and economy,
construction technology, economy and management,
information technologies in construction,
fire protection, thermoinsulation and renovation of buildings,
labour safety in construction.