The lack of reliable segmentation labels is the major obstacle to weakly supervised semantic segmentation. We present a pseudo-label generation approach based on a deep convolutional neural network that is supervised by image-level category labels only. Traditional methods, however, obtain incomplete pixel-level mask annotations because the regions of interest they recover cover only part of the targets. This paper studies the characteristics of class activation mapping in classification networks, focusing on methods to enhance its localization ability. We propose a Region-guided Pixel-label Generation (RPG) framework for semantic segmentation. The proposed region guidance mechanism reduces the influence of category supervision and exploits known high-level semantic information to guide the network, attaining more complete pixel-level annotations by expanding the regions of interest. Experimental results on the PASCAL VOC 2012 dataset show better pixel labeling and segmentation accuracy compared with state-of-the-art methods.
"Region-Guided Pixel-Level Label Generation for Weakly Supervised Semantic Segmentation." Xinyu Fu, Xiao Yao. Proceedings of the 4th International Conference on Control and Computer Vision, 2021. doi:10.1145/3484274.3484275
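The class activation mapping (CAM) technique the paper builds on can be sketched in a few lines of NumPy. This is the standard CAM computation (a weighted sum of the last conv layer's feature maps by the classifier weights), not the paper's RPG region-guidance mechanism; all shapes and names here are illustrative.

```python
import numpy as np

def class_activation_map(features, weights, class_idx):
    """Compute a class activation map (CAM) for one class.

    features: (C, H, W) feature maps from the final conv layer.
    weights:  (num_classes, C) weights of the global-average-pooling
              classifier head.
    """
    # Weighted sum of feature maps along the channel axis -> (H, W).
    cam = np.tensordot(weights[class_idx], features, axes=([0], [0]))
    cam = np.maximum(cam, 0.0)        # keep positive evidence only
    if cam.max() > 0:
        cam = cam / cam.max()         # normalize to [0, 1]
    return cam

# Toy example: 4 channels, 8x8 maps, 3 classes.
rng = np.random.default_rng(0)
feats = rng.random((4, 8, 8))
w = rng.random((3, 4))
cam = class_activation_map(feats, w, class_idx=1)
```

Thresholding such a map is the usual starting point for pixel-level pseudo-labels; the RPG framework's contribution is in expanding the high-activation regions.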
Zhanti Liang, Yongkang Xing, Kexin Guan, Zheng Da, Jianwen Fan, Gan Wu
In 2020, the COVID-19 epidemic swept the world and continued to spread. The global epidemic exposed the severe inadequacy of public education on infectious disease prevention in many countries. It is therefore necessary to popularize knowledge of infectious disease prevention by designing a new epidemic education system. As a new interactive technology, Virtual Reality (VR) profoundly changes the human-computer interaction experience. Drawing on the characteristics of VR, this research designed a virtual simulation system based on epidemic education research. The system is developed with Unreal Engine 4 and simulates scenarios that require epidemic prevention education. The user interacts with the three-dimensional (3D) scene model through VR controllers and a visual user interface. The system allows users to acquire knowledge about the epidemic effectively and to experience how medical staff deal with emergencies. This article discusses methods of popularizing epidemic knowledge education through virtual simulation technology and illustrates the system and knowledge popularization of the new epidemic health education in virtual simulation. The research will assist the public in acquiring epidemic knowledge effectively.
"Design Virtual Reality Simulation System for Epidemic (Covid-19) Education to Public." Zhanti Liang, Yongkang Xing, Kexin Guan, Zheng Da, Jianwen Fan, Gan Wu. Proceedings of the 4th International Conference on Control and Computer Vision, 2021. doi:10.1145/3484274.3484297
Achieving unbiased control of the field outlet temperature of a parabolic trough solar field (PTSF) is challenging and imperative: the field is a critical part of solar plants but is subject to multiple disturbances. To tackle this problem, an active disturbance rejection control (ADRC) is designed to alleviate these disturbances. First, all the disturbances, including external disturbances, model mismatch, and parameter perturbation rather than just direct normal irradiation, are lumped into a single disturbance. Then, the lumped disturbance is estimated and rejected by an ADRC whose order and gain are determined at an operating point of the nonlinear PTSF. The ADRC controller is also compared with a 2-DOF PID controller; simulation results show the remarkable merits of the proposed controller in disturbance rejection and reference tracking.
"Active Disturbance Rejection Control for a Parabolic Trough Solar Field." Xian-hua Gao, Zhigang Su. Proceedings of the 4th International Conference on Control and Computer Vision, 2021. doi:10.1145/3484274.3484299
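A minimal sketch of the ADRC idea described above, assuming a toy first-order plant with an unknown constant disturbance and the common bandwidth-based tuning; the paper's PTSF model, observer order, and gains are not reproduced here.

```python
import numpy as np

def simulate_adrc(setpoint=1.0, b0=1.0, wo=20.0, wc=5.0,
                  dt=0.01, steps=1000, disturbance=0.5):
    """First-order linear ADRC on the toy plant
        dy/dt = d + b0 * u,
    where d is an unknown constant disturbance. An extended state
    observer (ESO) estimates y (z1) and the lumped disturbance d
    (z2); the control law cancels the estimate.
    """
    l1, l2 = 2 * wo, wo ** 2      # ESO gains from observer bandwidth wo
    kp = wc                       # proportional gain from controller bandwidth wc
    y, z1, z2 = 0.0, 0.0, 0.0     # plant state and ESO estimates
    for _ in range(steps):
        u = (kp * (setpoint - z1) - z2) / b0  # reject estimated disturbance
        e = y - z1                            # observer error
        z1 += dt * (z2 + b0 * u + l1 * e)     # ESO state update
        z2 += dt * (l2 * e)                   # lumped-disturbance estimate
        y += dt * (disturbance + b0 * u)      # plant update (true disturbance)
    return y, z2

y_final, d_hat = simulate_adrc()
```

After the observer converges, the output tracks the setpoint and `z2` approaches the true lumped disturbance, which is the mechanism the abstract describes for PTSF outlet temperature.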
To solve the problem of defect detection in straw pipeline production, this paper proposes an efficient and fast straw defect detection algorithm (IPOY) based on a pruned YOLOv3. The algorithm starts from the YOLOv3 model, trains it with channel sparsity regularization, prunes channels with small scaling factors after sparse training, and finally fine-tunes the pruned network. This process is iterated several times to compress the YOLOv3 model, reducing its volume and computational cost so that it suits industrial production and can be migrated to mobile devices. Experimental results show that the proposed algorithm substantially compresses the volume of the YOLOv3 model while maintaining high detection precision.
"Straw Defect Detection Algorithm Based on Pruned YOLOv3." Qi-chang Xu, Liang Zhou. Proceedings of the 4th International Conference on Control and Computer Vision, 2021. doi:10.1145/3484274.3484285
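The channel-selection step after sparse training can be illustrated with the usual network-slimming rule: gather all batch-norm scaling factors, derive a global threshold from the desired prune ratio, and keep only channels above it. The function and its toy inputs are illustrative, not the paper's implementation.

```python
import numpy as np

def select_pruned_channels(bn_scales, prune_ratio=0.5):
    """Network-slimming style channel selection: channels whose
    batch-norm scaling factor |gamma| falls below a global threshold
    are pruned. Returns a boolean keep-mask per layer.

    bn_scales: list of 1-D arrays of gamma values, one per conv layer.
    """
    all_scales = np.abs(np.concatenate(bn_scales))
    # Global threshold: the value below which prune_ratio of channels fall.
    threshold = np.sort(all_scales)[int(prune_ratio * len(all_scales))]
    return [np.abs(g) >= threshold for g in bn_scales]

masks = select_pruned_channels(
    [np.array([0.9, 0.01, 0.5]), np.array([0.02, 0.7, 0.03, 0.8])],
    prune_ratio=0.5,
)
```

Iterating sparse training, pruning with such masks, and fine-tuning is the compression loop the abstract describes.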
H. Qiu, P. Su, Shanshan Jiang, Xingyu Yue, Yitian Zhao, Jiang Liu
Modern deep neural networks can beat human annotators in several medical image processing tasks. In practical manual annotation for medical image segmentation, labels often show inter-observer variability (IOV), mainly caused by differences in the annotators' expertise and understanding. To build a trustworthy segmentation system, robust models should capture the uncertainty in both samples and labels. Departing from the conventional way of handling IOV with label fusion such as majority voting, a fuzzy-integral-based ensemble framework of multiple deep learning models is proposed for optic disc segmentation. Each component segmentation model is trained with respect to one annotator. Then a powerful nonlinear aggregation function, the Choquet integral, is employed in the form of a neural network to integrate the segmentation results of the multiple annotators. The proposed method is validated on the public RIM-ONE dataset, which consists of 169 fundus images, each annotated by 5 experts. Compared with conventional segmentation ensemble methods, the proposed method achieves a higher Dice score (98.69%).
"Learning from Human Uncertainty by Choquet Integral for Optic Disc Segmentation." H. Qiu, P. Su, Shanshan Jiang, Xingyu Yue, Yitian Zhao, Jiang Liu. Proceedings of the 4th International Conference on Control and Computer Vision, 2021. doi:10.1145/3484274.3484276
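The classical discrete Choquet integral that the framework builds on can be sketched directly. The paper learns the fuzzy measure inside a neural network, whereas this sketch takes the measure `mu` as given; the two-annotator example values are made up.

```python
import numpy as np

def choquet_integral(scores, mu):
    """Discrete Choquet integral of per-annotator scores with respect
    to a fuzzy measure mu, given as a dict mapping frozensets of
    annotator indices to measure values (mu of the full set = 1).
    """
    order = np.argsort(scores)          # ascending score order
    total, prev = 0.0, 0.0
    for k, i in enumerate(order):
        subset = frozenset(order[k:])   # annotators scoring >= scores[i]
        total += (scores[i] - prev) * mu[subset]
        prev = scores[i]
    return total

# Two annotators give a pixel the scores 0.8 and 0.4; the measure
# encodes that annotator 0 is slightly more trusted.
mu = {frozenset({0}): 0.6, frozenset({1}): 0.5, frozenset({0, 1}): 1.0}
result = choquet_integral(np.array([0.8, 0.4]), mu)
```

Unlike majority voting, non-additive measures like this can reward agreement between specific annotator subsets, which is the aggregation flexibility the abstract appeals to.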
Acute lymphoblastic leukemia (ALL) is a dangerous cancer in which infected blood cells disturb the blood and bone marrow. It attacks the body's immune system and diminishes the bone marrow's ability to produce white blood cells. This research aims to classify ALL images using whole-feature information. We propose a method that reduces the image's size by using the whole co-occurrence matrix to represent the object. The method achieves maximum accuracy, sensitivity, and specificity of 90.77%, 96.67%, and 95.38%, respectively. The research also compares against the separate red, green, and blue channels: the whole-feature information yields higher accuracy, sensitivity, and specificity than any individual channel. The novelty and contribution of this research is the demonstration that the whole-feature-information method is the better choice for an implementation system.
"Comparison between a Whole and Separated Feature Information for Acute Lymphoblastic Leukemia (ALL) Classification." A. Muntasa, Muhammad Yusuf. Proceedings of the 4th International Conference on Control and Computer Vision, 2021. doi:10.1145/3484274.3484289
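The co-occurrence representation underlying the method can be illustrated with a basic gray-level co-occurrence matrix (GLCM) computation; the number of levels and the pixel offset below are illustrative, not the paper's configuration.

```python
import numpy as np

def cooccurrence_matrix(image, levels=4, dx=1, dy=0):
    """Gray-level co-occurrence matrix for a single pixel offset.

    image: 2-D array of integer gray levels in [0, levels).
    Returns a (levels, levels) matrix of normalized co-occurrence
    frequencies: entry (a, b) is how often level a is followed by
    level b at offset (dy, dx).
    """
    h, w = image.shape
    glcm = np.zeros((levels, levels))
    for y in range(h - dy):
        for x in range(w - dx):
            glcm[image[y, x], image[y + dy, x + dx]] += 1
    return glcm / glcm.sum()

img = np.array([[0, 0, 1],
                [2, 3, 3],
                [0, 1, 1]])
g = cooccurrence_matrix(img)
```

The matrix is much smaller than the image for coarse quantization, which is how such a representation reduces input size while retaining texture statistics.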
In visual sentiment analysis, sentiment estimation from images is a challenging research problem. Previous studies focused on a few specific sentiments and their intensities and have not captured the abundance of psychological human feelings. In addition, multi-label sentiment estimation from images has not been sufficiently investigated. The purpose of this research is to build a visual sentiment dataset and to accurately estimate sentiments from images that simultaneously evoke multiple emotions, framed as a multi-label, multi-class problem. We built a visual sentiment dataset, 'Senti8PW', based on Plutchik's wheel of emotions. Using this dataset, we perform multi-label sentiment analysis with a combined deep neural network model that accepts both hand-crafted features and CNN features as input. We also introduce a threshold-based multi-label prediction algorithm in which each emotion is assumed to have a probability distribution: after training our deep neural network, an emotion is predicted for an image if its intensity exceeds the threshold of the corresponding emotion. Extensive experiments were conducted on our dataset. Our model achieves superior results compared with state-of-the-art algorithms in terms of subsets.
"Multi-label Prediction for Visual Sentiment Analysis using Eight Different Emotions based on Psychology." Tetsuya Asakawa, Masaki Aono. Proceedings of the 4th International Conference on Control and Computer Vision, 2021. doi:10.1145/3484274.3484296
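The threshold-based multi-label decision rule is simple to sketch. The emotion names follow Plutchik's eight basic emotions, which the dataset is built on; the intensities and thresholds below are made up for illustration.

```python
# Plutchik's eight basic emotions, as used by the Senti8PW dataset.
EMOTIONS = ["joy", "trust", "fear", "surprise",
            "sadness", "disgust", "anger", "anticipation"]

def predict_emotions(intensities, thresholds):
    """Threshold-based multi-label decision: an emotion is predicted
    for the image when its estimated intensity exceeds that
    emotion's own threshold.
    """
    return [e for e, p, t in zip(EMOTIONS, intensities, thresholds) if p > t]

labels = predict_emotions(
    intensities=[0.9, 0.2, 0.1, 0.7, 0.05, 0.3, 0.1, 0.6],
    thresholds=[0.5] * 8,
)
print(labels)  # ['joy', 'surprise', 'anticipation']
```

Per-emotion thresholds (rather than a single global cutoff) let each emotion's distinct intensity distribution be respected, which is the assumption the abstract states.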
Given the complex structure and long fault diagnosis times of the flight automation control system, which affect the aircraft's operational efficiency, a fault diagnosis scheme based on a one-class support vector machine (OCSVM) optimized by ant colony optimization (ACO) is proposed. First, this paper analyses the fault characteristics of flight automation systems and constructs a noise filter. Then, a residual decision algorithm based on the improved support vector machine is proposed to judge the residuals under the complex output coupling of the flight control system. Finally, experimental simulation results show that the decision algorithm takes about 0.5 s for fault detection at a sampling time of 0.1 s, significantly reducing fault detection time, with an effective fault detection rate greater than 90%.
"Fault Diagnosis of UAV System Base On One-Class Support Vector Machine." Zaifei Fu, Xin Chen, Yu-juan Guo, Jing Chen. Proceedings of the 4th International Conference on Control and Computer Vision, 2021. doi:10.1145/3484274.3484301
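The residual-decision step can be sketched with scikit-learn's `OneClassSVM` standing in for the paper's ACO-optimized machine; the residual data here are synthetic, and no ACO hyperparameter search is performed.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(42)
# Nominal residuals from fault-free operation (e.g. sampled every 0.1 s).
nominal = rng.normal(0.0, 0.05, size=(200, 2))
# Faulty residuals with a clear offset from the nominal cluster.
faulty = rng.normal(1.0, 0.05, size=(20, 2))

# The one-class SVM learns the boundary of nominal behaviour only;
# anything outside that boundary is flagged as a fault.
detector = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05).fit(nominal)
pred = detector.predict(faulty)        # +1 = nominal, -1 = fault
fault_rate = float(np.mean(pred == -1))
```

Training on nominal data only is what makes the one-class formulation attractive here: labeled fault examples, which are rare for aircraft systems, are not required.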
Doohee Lee, Gi Soon Cha, Ehtesham Iqbal, H. Song, Kwang-nam Choi
Artificial intelligence (AI) has developed through a variety of methods over the past decade. Most AI practitioners are concerned with building deep or wide networks, because the accuracy of AI models depends heavily on network depth. In general, deeper and wider networks learn better than shallower, narrower ones. On the other hand, deeper networks are more complex and carry disadvantages such as computational cost and dependence on system specifications. In this paper, we propose a novel method to improve the average recall rate for small objects in a deep convolutional network. Rather than stacking the network deeper or wider, the proposed method adds a pre-processing layer before the network. The pre-processing layer consists of two major parts, up-sampling and down-sampling of the data, whose overall objective is to enhance the resolution of small objects in the input image. The pre-processing network improves the average recall rate of the base network by 3.56%. These experimental results indicate that the proposed method improves small-object detection performance. CCS Concepts: • Computing methodologies • Object detection
"Improvement of Detection Rate for Small Objects Using Pre-processing Network." Doohee Lee, Gi Soon Cha, Ehtesham Iqbal, H. Song, Kwang-nam Choi. Proceedings of the 4th International Conference on Control and Computer Vision, 2021. doi:10.1145/3484274.3484283
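A toy version of the up-/down-sampling pre-processing stage, assuming nearest-neighbour up-sampling and average-pooling down-sampling; the paper's actual layer design (filters, learned parameters) is not specified here.

```python
import numpy as np

def preprocess(image, scale=2):
    """Toy pre-processing stage: nearest-neighbour up-sampling
    followed by average-pooling down-sampling back to the input
    size, a stand-in for the paper's resolution-enhancement layer.
    """
    # Up-sample: each pixel becomes a scale x scale block.
    up = image.repeat(scale, axis=0).repeat(scale, axis=1)
    # Down-sample: average each scale x scale block back to one pixel.
    h, w = image.shape
    down = up.reshape(h, scale, w, scale).mean(axis=(1, 3))
    return up, down

img = np.arange(16.0).reshape(4, 4)
up, down = preprocess(img)
```

With these particular operators the round trip is lossless (`down` equals `img`); a learned pre-processing layer would instead sharpen small-object detail during the up-sampled stage before feeding the detector.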
Air pollution is a growing worldwide problem, and accurate prediction of PM2.5 concentration plays a vital role in reducing its dramatic toll on health. Owing to the non-linearity and complexity of the air pollution process and the influence of multiple factors, such as meteorological conditions, human activities, and other chemical components, traditional pollution-related models struggle with PM2.5 modeling. Based on atmospheric domain knowledge, we propose a novel and interpretable deep learning model (iDeepAir) that predicts hourly PM2.5 concentration by incorporating traffic, meteorological, and air quality data. We design a feature interaction module and a temporal interaction module to simulate the pollution chemical reaction process and the temporal accumulation process, respectively, which makes the model easier to understand and improves its prediction accuracy for PM2.5 concentration. Compared with the best baseline model, mean absolute error (MAE) and root mean squared error (RMSE) improve by 20.1% and 14.4%, respectively, for 24-hour prediction. Furthermore, with the embedded Layer-wise Relevance Propagation (LRP) algorithm, iDeepAir lets us observe the spatial influence patterns of regional traffic emissions at high resolution and evaluate the impact of traffic emissions on PM2.5 formation. Taking Shanghai as an example, we find that although some areas of Shanghai have heavy traffic emissions, these do not always directly aggravate air pollution, which is also affected by local buildings, meteorological conditions, and other human activities.
These results show the spatial interpretability of our model and provide a quantitative decision-making basis for the government to control air pollution.
"Forecasting PM2.5 and Tracking Spatial Influence Patterns of Traffic Using Interpretable Deep Learning." Lianliang Chen, Z. Shan. Proceedings of the 4th International Conference on Control and Computer Vision, 2021. doi:10.1145/3484274.3484302
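The Layer-wise Relevance Propagation (LRP) step used for interpretation can be sketched for a single linear layer with the epsilon rule; iDeepAir's actual architecture is not reproduced, and the weights and inputs below are illustrative. The property checked is relevance conservation: the relevance distributed to the inputs sums to the relevance at the outputs.

```python
import numpy as np

def lrp_linear(x, w, relevance_out, eps=1e-9):
    """LRP epsilon-rule for one linear layer z = w @ x.

    Relevance flowing back to input i is
        R_i = x_i * sum_j (w_ji / z_j) * R_j,
    with z_j stabilized by a small eps to avoid division by zero.
    """
    z = w @ x                                    # layer outputs, shape (out,)
    s = relevance_out / (z + eps * np.sign(z))   # stabilized ratios
    return x * (w.T @ s)                         # per-input relevance, shape (in,)

x = np.array([1.0, 2.0])
w = np.array([[0.5, 0.5],
              [1.0, -0.25]])
R_out = w @ x                  # take the output activations as output relevance
R_in = lrp_linear(x, w, R_out)
```

Applying such a rule layer by layer from the PM2.5 prediction back to the traffic inputs is what produces the spatial influence maps the abstract describes.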