Xunsheng Du, Yuchen Jin, Xuqing Wu, Yu Liu, Xianping Wu, Omar Awan, Joey Roth, K. C. See, Nicolas Tognini, Jiefu Chen, Zhu Han
Title: The Embedded VGG-Net Video Stream Processing Framework for Real-Time Classification of Cutting Volume at Shale Shaker
DOI: 10.2523/IPTC-19312-MS (https://doi.org/10.2523/IPTC-19312-MS)
Published: 2019-03-22, in Day 3 Thu, March 28, 2019 (International Petroleum Technology Conference)
Citations: 0
Abstract
A deep learning framework is proposed and implemented for monitoring the cutting volume at the shale shaker in real time on a deep-water drilling rig. The framework performs classification and quantification on a live video stream. Compared to traditional video analytics methods, which are time-consuming, the proposed framework is more efficient and suitable for real-time video analysis. The real-time deep learning video analysis model consists of two parts. The first is a multi-threaded video processing engine. A modularized service named Rig-Site Virtual Presence (RSVP) provides real-time video streaming from the rig, and the engine performs real-time decoding, preprocessing, and encoding of the stream. The second is a customized deep classification model. Starting from a deep neural network (DNN), we make the following adaptations: 1) apply whitening and instance normalization to video frames; 2) optimize the number of convolutional layers and the number of nodes in the fully connected layers; 3) apply L2-norm regularization. The customized model is embedded in the multi-threaded video processing engine, which enables real-time inference. The deep learning model categorizes every video frame as "ExtraHeavy", "Heavy", "Light", or "None", and also outputs the corresponding numerical probability of each outcome. Training is performed on an NVIDIA GeForce 1070 GPU using a video stream with a 137 kbps bitrate, 5.84 frames/s, and a frame size of 720×486. With only a commodity CPU, inference with the pre-trained model can run in real time. Both labeled frames and numerical results are saved for later examination. Compared against manual labeling, the proposed deep learning framework achieves very promising results for analyzing video streams in real time.
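The abstract does not give the preprocessing code, so the exact recipe is unknown; a minimal NumPy sketch of the per-frame "whitening and instance normalization" step, under the assumption that each frame is normalized per channel to zero mean and unit variance independently of dataset statistics, might look like:

```python
import numpy as np

def instance_normalize(frame: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Normalize one decoded video frame of shape (H, W, C) so that each
    channel has zero mean and unit variance.

    This is an illustrative assumption about the paper's whitening /
    instance-normalization step, not the authors' actual implementation.
    """
    frame = frame.astype(np.float32)
    mean = frame.mean(axis=(0, 1), keepdims=True)  # per-channel mean
    std = frame.std(axis=(0, 1), keepdims=True)    # per-channel std
    return (frame - mean) / (std + eps)            # whitened frame
```

Per-frame normalization is attractive here because rig-site lighting and camera exposure drift over a long stream, so dataset-level statistics would go stale.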
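The multi-threaded video processing engine is described only at a high level. A toy sketch of the idea, using a bounded queue to decouple frame production (decoding) from per-frame inference, could be structured as follows; `frames`, `classify`, and `run_pipeline` are illustrative names, not from the paper:

```python
import queue
import threading

_SENTINEL = object()  # tells workers the stream has ended

def run_pipeline(frames, classify, num_workers=2):
    """Toy multi-threaded frame pipeline (illustrative, not the paper's code).

    The main thread plays the role of the decoder, pushing (index, frame)
    pairs into a bounded queue; worker threads run `classify` on each frame.
    Results are returned in frame order.
    """
    work = queue.Queue(maxsize=16)  # bounded: back-pressure if inference lags
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            item = work.get()
            if item is _SENTINEL:
                break
            idx, frame = item
            label = classify(frame)  # per-frame inference
            with lock:
                results[idx] = label

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for idx, frame in enumerate(frames):
        work.put((idx, frame))
    for _ in threads:
        work.put(_SENTINEL)  # one sentinel per worker
    for t in threads:
        t.join()
    return [results[i] for i in sorted(results)]
```

The bounded queue is the key design point: if inference falls behind the 5.84 frames/s stream, the decoder blocks instead of growing memory without limit.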
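The model reports a probability for each of the four classes. The paper does not state its output layer explicitly; assuming a standard softmax over four logits, the mapping from raw network outputs to the reported per-class probabilities can be sketched as:

```python
import numpy as np

# The four cutting-volume classes named in the paper.
CLASSES = ("ExtraHeavy", "Heavy", "Light", "None")

def class_probabilities(logits):
    """Map a frame's four raw outputs (logits) to per-class probabilities
    via softmax -- an assumed output layer, not confirmed by the paper."""
    z = np.asarray(logits, dtype=np.float64)
    z = z - z.max()                      # shift for numerical stability
    p = np.exp(z) / np.exp(z).sum()      # softmax: non-negative, sums to 1
    return dict(zip(CLASSES, p))
```

Saving these probabilities alongside the argmax label, as the framework does, lets an engineer later distinguish a confident "Heavy" call from a borderline one.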