A DNN-based Image Retrieval Approach for Detection of Defective Area in Carbon Fiber Reinforced Polymers through LDV Data
Pub Date: 2020-02-01 | DOI: 10.1109/MVIP49855.2020.9116908
Erfan Basiri, Reza P. R. Hasanzadeh, Saman Hadi, M. Kersemans
Carbon fiber reinforced polymer (CFRP) materials, owing to their specific strength and high resistance to erosion and corrosion, are widely used in industrial applications and high-tech engineering structures. However, they also have disadvantages: they are prone to various kinds of internal defects that can jeopardize the structural integrity of the material, so early detection of such defects is an important task. Recently, local defect resonance (LDR), a subcategory of ultrasonic nondestructive testing, has been successfully used to address this issue. However, the drawback of this technique is that the frequency at which the LDR occurs must be known; further, LDR-based techniques have difficulty assessing deep defects. In this paper, a deep neural network (DNN) methodology is employed to remove this limitation, to improve the defect image retrieval process, and to obtain a model for approximate depth estimation of such defects. To this end, two types of defects, flat bottom holes (FBH) and barely visible impact damage (BVID), introduced in two CFRP coupons, are used to evaluate the proposed method. The coupons are excited with a piezoelectric patch, and their laser Doppler vibrometry (LDV) response is collected with a scanning laser Doppler vibrometer (SLDV). Finally, the DNN-based approach is evaluated against other well-known classification methodologies.
{"title":"A DNN-based Image Retrieval Approach for Detection of Defective Area in Carbon Fiber Reinforced Polymers through LDV Data","authors":"Erfan Basiri, Reza P. R. Hasanzadeh, Saman Hadi, M. Kersemans","doi":"10.1109/MVIP49855.2020.9116908","DOIUrl":"https://doi.org/10.1109/MVIP49855.2020.9116908","url":null,"abstract":"Carbon fiber reinforced polymer (CFRP) materials, due to their specific strength and high consistency against erosion and corrosion, are widely used in industrial applications and high-tech engineering structures. However, there are also disadvantages: e.g. they are prone to different kinds of internal defects which could jeopardize the structural integrity of the CFRP material and therefore early detection of such defects can be an important task. Recently, local defect resonance (LDR), which is a subcategory of ultrasonic nondestructive testing, has been successfully used to solve this issue. However, the drawback of utilizing this technique is that the frequency at which the LDR occurs must be known. Further, the LDR-based technique has difficulty in assessing deep defects. In this paper, deep neural network (DNN) methodology is employed to remove this limitation and to acquire a better defect image retrieval process and also to achieve a model for the approximate depth estimation of such defects. In these regards, two types of defects called flat bottom holes (FBH) and barely visible impact damage (BVID) which are made in two CFRP coupons are used to evaluate the ability of the proposed method. Then, these two CFRPs are excited with a piezoelectric patch, and their corresponding laser Doppler vibrometry (LDV) response is collected through a scanning laser Doppler vibrometer (SLDV). Eventually, the superiority of our DNN-based approach is evaluated in comparison with other well-known classification methodologies.","PeriodicalId":255375,"journal":{"name":"2020 International Conference on Machine Vision and Image Processing (MVIP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134514756","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Low-cost 3D scanning using ultrasonic and camera data fusion for CNC Engraving Laser-Based Machine
Pub Date: 2020-02-01 | DOI: 10.1109/MVIP49855.2020.9116903
M. J. Seikavandi
Sensor fusion has gained much popularity in 3D scanning in recent years. A variety of sensors, such as depth cameras, cameras, laser scanners, and ultrasonic sensors, are widely used in this area. The robotics research community has studied ultrasound sensors for decades. While they have lost attention with the advent of laser scanners and cameras, they remain successful for special applications due to their robustness and simplicity; additionally, ultrasound measurement is more robust than depth cameras in scenarios with varying illumination and works with glassy pieces. In this work, we choose a camera and ultrasonic sensor fusion method for a CNC engraving machine, owing to its lower cost and suitability for this application. We use a heuristic, hand-crafted fusion to produce 3D representations of different pieces. The camera output is down-sampled to the scale of the ultrasonic data; using an image processing method, the image and ultrasonic data are then combined into a basic scheme and a final 3D map.
{"title":"Low-cost 3D scanning using ultrasonic and camera data fusion for CNC Engraving Laser-Based Machine","authors":"M. J. Seikavandi","doi":"10.1109/MVIP49855.2020.9116903","DOIUrl":"https://doi.org/10.1109/MVIP49855.2020.9116903","url":null,"abstract":"Sensor-fusion has gained much popularity in 3Dscanning in recent years. There are a variety of sensors like depth-camera, camera, laser-scanner, ultrasonic sensor, which are widely used in the area. The robotic research community has studied ultrasound sensors for decades. While they have lost attention with the advent of laser scanners and cameras, they remain successful for special applications due to their robustness and simplicity; additionally, ultrasound measurement is more robust than depth-camera for illumination-varying scenarios and work with glassy pieces. In this work, we choose camera and ultrasonic sensor fusion method for a CNC Engraving Machine concerning their lower cost and a particular act of applying. We use a heuristic, hand-crafted fusion to prepare 3D presentation of different pieces. The output data of the camera were down-sampled to ultrasonic data scale. By using an image processing method, the image, and ultrasonic data will be used to prepare a principle scheme and final 3D map.","PeriodicalId":255375,"journal":{"name":"2020 International Conference on Machine Vision and Image Processing (MVIP)","volume":"201 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116160440","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multiple-Vehicle Tracking in the Highway Using Appearance Model and Visual Object Tracking
Pub Date: 2020-02-01 | DOI: 10.1109/MVIP49855.2020.9116905
Fateme Bafghi, B. Shoushtarian
In recent decades, owing to groundbreaking improvements in machine vision, many daily tasks are performed by computers. One such task is multiple-vehicle tracking, which is widely used in areas such as video surveillance and traffic monitoring. This paper introduces an efficient novel approach with acceptable accuracy, achieved through an appearance and motion model based on the features extracted from each object. Two different approaches are used to extract these features: features extracted from a deep neural network, and traditional features. The results from the two approaches are then compared with state-of-the-art trackers. The results are obtained by executing the methods on the UA-DETRAC benchmark. The first method achieved 58.9% accuracy, while the second achieved only 15.9%. The proposed methods could still be improved by extracting more distinguishable features.
{"title":"Multiple-Vehicle Tracking in the Highway Using Appearance Model and Visual Object Tracking","authors":"Fateme Bafghi, B. Shoushtarian","doi":"10.1109/MVIP49855.2020.9116905","DOIUrl":"https://doi.org/10.1109/MVIP49855.2020.9116905","url":null,"abstract":"In recent decades, due to the groundbreaking improvements in machine vision, many daily tasks are performed by computers. One of these tasks is multiple-vehicle tracking, which is widely used in different areas such as video surveillance and traffic monitoring. This paper focuses on introducing an efficient novel approach with acceptable accuracy. This is achieved through an efficient appearance and motion model based on the features extracted from each object. For this purpose, two different approaches have been used to extract features, i.e. features extracted from a deep neural network, and traditional features. Then the results from these two approaches are compared with state-of-the-art trackers. The results are obtained by executing the methods on the UA-DETRACK benchmark. The first method led to 58.9% accuracy while the second method caused up to 15.9%. The proposed methods can still be improved by extracting more distinguishable features.","PeriodicalId":255375,"journal":{"name":"2020 International Conference on Machine Vision and Image Processing (MVIP)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127394040","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Exploring the Gradient for Video Quality Assessment
Pub Date: 2020-02-01 | DOI: 10.1109/MVIP49855.2020.9116869
Hossein Motamednia, Pooryaa Cheraaqee, Azadeh Mansouri
This paper presents an algorithm that incorporates spatial and temporal gradients for full-reference video quality assessment. In the proposed method, the frame-based gradient magnitude similarity deviation is calculated to form the spatial quality vector; to capture temporal distortion, the similarity of frame differences is measured. The worst scores in both the spatial and temporal vectors are extracted by introducing a variable-length temporal window for the max-pooling operation, and the resulting vectors are combined to form the final score. The performance of the proposed method is evaluated on the LIVE SD and EPFL-PoliMI datasets. The results clearly illustrate that, despite its computational efficiency, the predictions are highly correlated with the human visual system.
{"title":"Exploring the Gradient for Video Quality Assessment","authors":"Hossein Motamednia, Pooryaa Cheraaqee, Azadeh Mansouri","doi":"10.1109/MVIP49855.2020.9116869","DOIUrl":"https://doi.org/10.1109/MVIP49855.2020.9116869","url":null,"abstract":"This paper presents an algorithm which incorporates spatial and temporal gradients for full reference video quality assessment. In the proposed method the frame-based gradient magnitude similarity deviation is calculated to form the spatial quality vector. To capture the temporal distortion, the similarity of frame difference is measured. In the proposed method, we extract the worst scores in both the spatial and temporal vectors by introducing the variable-length temporal window for max-pooling operation. The resultant vectors are then combined to form the final score. The performance of the proposed method is evaluated on LIVE SD and EPFL- PoliMI datasets. The results clearly illustrate that, despite the computational efficiency, the predictions are highly correlated with human visual system.","PeriodicalId":255375,"journal":{"name":"2020 International Conference on Machine Vision and Image Processing (MVIP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130128600","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Quality Assessment for Retargeted Images: A Review
Pub Date: 2020-02-01 | DOI: 10.1109/MVIP49855.2020.9116899
Maryam Karimi, Erfan Entezami
Transmission, storage, and many processing methods introduce various kinds of damage into images. Image Quality Assessment (IQA) is necessary to benchmark processing algorithms, to optimize them, and to monitor image quality in quality control systems. Traditional quality metrics correlate poorly with subjective perception; the key problem is to evaluate distorted images as humans do. Subjective quality assessment is more reliable but cumbersome and time-consuming, so it cannot be embedded in online applications. Therefore, many objective perceptual IQA models have been developed. Content-aware retargeting methods aim to adapt source images to target display devices with different sizes and aspect ratios so that salient areas are less distorted. The size mismatch and the completely different distortions caused by retargeting have made common IQA methods useless in this area; retargeted Image Quality Assessment (RIQA) methods are therefore designed for this purpose. The quality of retargeted images differs depending on the image content and the retargeting algorithm. This paper provides a literature review and a new categorization of current subjective and objective retargeted image quality measures, and compares and analyzes their performance. It is demonstrated that the performance of RIQA methods can be further improved by using high-level descriptors in addition to low-level ones.
{"title":"Quality Assessment for Retargeted Images: A Review","authors":"Maryam Karimi, Erfan Entezami","doi":"10.1109/MVIP49855.2020.9116899","DOIUrl":"https://doi.org/10.1109/MVIP49855.2020.9116899","url":null,"abstract":"Transmission, saving and many processing methods cause different damage in images. Image Quality Assessment (IQA) is necessary to benchmark processing algorithms, to optimize them, and to monitor the quality of images in quality control systems. Traditional quality metrics have low correlations with subjective perception. The key problem is to evaluate the distorted images as human do. Subjective quality assessment is more reliable but is cumbersome and time-consuming, so it is impossible to embed it in online applications. Therefore, many objective perceptual IQA models have been developed until now. Content-aware retargeting methods aim to adapt source images to target display devices with different sizes and aspect ratios so that salient areas will be less distorted. The size mismatch and the completely different distortions caused by retargeting have made common IQA methods useless in this area. Therefore, retargeted Image Quality Assessment (RIQA) methods are designed for this purpose. The quality of retargeted images is different depending to image content and retargeting algorithm. This paper provides a literature review and a new categorization of the current subjective and objective retargeted image quality measures. Also, we intend to compare and analyze the performance of these measures. It is demonstrated that the performance of RIQA methods can be further improved by using high-level descriptors in addition to low-level ones.","PeriodicalId":255375,"journal":{"name":"2020 International Conference on Machine Vision and Image Processing (MVIP)","volume":"138 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122624111","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MVIP 2020 Table of Contents
Pub Date: 2020-02-01 | DOI: 10.1109/mvip49855.2020.9116904
{"title":"MVIP 2020 Table of Contents","authors":"","doi":"10.1109/mvip49855.2020.9116904","DOIUrl":"https://doi.org/10.1109/mvip49855.2020.9116904","url":null,"abstract":"","PeriodicalId":255375,"journal":{"name":"2020 International Conference on Machine Vision and Image Processing (MVIP)","volume":"2008 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123916642","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Attention-Based Face AntiSpoofing of RGB Camera using a Minimal End-2-End Neural Network
Pub Date: 2020-02-01 | DOI: 10.1109/MVIP49855.2020.9116872
A. Ghofrani, Rahil Mahdian Toroghi, Seyed Mojtaba Tabatabaie
Face anti-spoofing aims at distinguishing a real face from a fake one, and has gained much attention in security-sensitive applications, liveness detection, fingerprinting, and so on. In this paper, we address the anti-spoofing problem by proposing two end-to-end convolutional neural network systems. One model is based on the EfficientNet-B0 network, modified in its final dense layers. The second is a very light MobileNet-V2 model that has been contracted, modified, and efficiently retrained on data created from the Rose-Youtu dataset. The experiments show that both proposed architectures achieve remarkable results in detecting real and fake face images. They also clearly show that the heavyweight model can be efficiently employed in server-side implementations, whereas the lightweight model can easily be deployed on hand-held devices, and both perform well using merely RGB input images.
{"title":"Attention-Based Face AntiSpoofing of RGB Camera using a Minimal End-2-End Neural Network","authors":"A. Ghofrani, Rahil Mahdian Toroghi, Seyed Mojtaba Tabatabaie","doi":"10.1109/MVIP49855.2020.9116872","DOIUrl":"https://doi.org/10.1109/MVIP49855.2020.9116872","url":null,"abstract":"Face anti-spoofing aims at identifying the real face, as well as the fake one, and gains a high attention in security sensitive applications, liveness detection, fingerprinting, and so on. In this paper, we address the anti-spoofing problem by proposing two end-to-end systems of convolutional neural networks. One model is developed based on the EfficientNet B0 network which has been modified in the final dense layers. The second one, is a very light model of the MobileNet V2, which has been contracted, modified and retrained efficiently on the data being created based on the Rose-Youtu dataset, for this purpose. The experiments show that, both of the proposed architectures achieve remarkable results on detecting the real and fake images of the face input data. The experiments clearly show that the heavy-weight model could be efficiently employed in server side implementations, whereas the low-weight model could be easily implemented on the hand-held devices and both perform perfectly well using merely RGB input images.","PeriodicalId":255375,"journal":{"name":"2020 International Conference on Machine Vision and Image Processing (MVIP)","volume":"138 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123727142","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A weighted, statistical based, No-Reference metric for holography Image Quality Assessment
Pub Date: 2020-02-01 | DOI: 10.1109/MVIP49855.2020.9116927
Vahid Hajihashemi, Mohammad Mehdi Arab Ameri, A. Alavi Gharahbagh, Hassan Pahlouvary
Digital holography is a 3D imaging technique that suffers from speckle noise. Given the importance of quality in 3D images, we develop an efficient, general-purpose, blind/no-reference image quality assessment metric for digital holography images. The main novelty of our approach to blind image quality assessment is the hypothesis that each digital hologram has statistical properties that change in the presence of speckle noise. This change can be measured by full-reference metrics applied to the input image and a new image made by adding a known level of speckle noise to the input. These full-reference measurements can identify the distortion afflicting the input image and thus perform a no-reference quality assessment. In effect, adding noise to the input image causes a quality loss, and the magnitude of this loss carries information about the input image's quality. Finally, the proposed method's quality estimates for digital holography images are compared with several well-known full-reference methods to demonstrate its ability.
{"title":"A weighted, statistical based, No-Reference metric for holography Image Quality Assessment","authors":"Vahid Hajihashemi, Mohammad Mehdi Arab Ameri, A. Alavi Gharahbagh, Hassan Pahlouvary","doi":"10.1109/MVIP49855.2020.9116927","DOIUrl":"https://doi.org/10.1109/MVIP49855.2020.9116927","url":null,"abstract":"Digital holography is one of the 3D imaging systems that suffer Speckle noise. With respect to the importance of quality in 3D images, we develop an efficient general-purpose blind/no-reference holography image quality assessment metric for evaluating the quality of digital holography images. The main novelty of our approach to blind image quality assessment is based on the hypothesis that each digital holography has statistical properties that are changing in the presence of speckle noise. This change can be measured by some full reference metrics that are applied to input image and a new image, which were made by adding a known level of speckle noise to input image. These full reference measurements have the ability of identifying the distortion afflicting the input image and perform a no-reference quality assessment. In fact, adding noise to input image leads to quality loss, and the value of this loss give information about the input image quality. Finally, the result of the proposed method in estimating the quality of digital holography images were compared with some well-known full reference methods in order to demonstrate its ability.","PeriodicalId":255375,"journal":{"name":"2020 International Conference on Machine Vision and Image Processing (MVIP)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114666306","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automatic Skin Cancer (Melanoma) Detection by Processing Dermatoscopic images
Pub Date: 2020-02-01 | DOI: 10.1109/MVIP49855.2020.9116918
Hadi Moazen, M. Jamzad
Melanoma is the deadliest form of skin cancer if not treated early; the best way to cure it is to treat it in its earliest stage of development. Since melanoma resembles benign moles in shape and appearance, it is often mistaken for a mole and left untreated. Automatic melanoma detection is an essential way to increase patients' survival rate by catching the disease in its early stages. In this paper, a new method for automatic diagnosis of melanoma from segmented dermatoscopic images is presented. Almost all related methods follow similar approaches but use different features; we introduce several new features that improve the accuracy of diagnosing melanoma. For evaluation, we implemented and tested all methods on the ISIC archive, the largest openly available dataset of dermatoscopic melanoma images. Our method outperforms the accuracy of most recent previous works on the ISIC dataset by 1.5 percentage points. It also achieves a 2.32-point higher F1 score while obtaining comparable sensitivity.
{"title":"Automatic Skin Cancer (Melanoma) Detection by Processing Dermatoscopic images","authors":"Hadi Moazen, M. Jamzad","doi":"10.1109/MVIP49855.2020.9116918","DOIUrl":"https://doi.org/10.1109/MVIP49855.2020.9116918","url":null,"abstract":"Melanoma is the deadliest form of skin cancer if not treated early. The best way to cure melanoma is to treat it in its earliest stage of development. Since melanoma is similar to benign moles in its shape and appearance, it is often mistaken for moles and left untreated. Automatic melanoma detection is an essential way to increase the survival rate of patients by detecting melanoma in its early stages. In this paper, a new method for automatic diagnosis of melanoma using segmented dermatoscopic images is provided. Almost all related methods follow similar approaches but using different features. We have introduced several new features which could improve the accuracy of diagnosing melanoma. For evaluation we have implemented and tested all methods on the ISIC archive, which is the largest openly available dataset of dermatoscopic melanoma images. Our method outperforms most recent previous works’ accuracy on the ISIC dataset by 1.5 percent. It also achieves a 2.32-point higher F1 score while obtaining a comparable sensitivity.","PeriodicalId":255375,"journal":{"name":"2020 International Conference on Machine Vision and Image Processing (MVIP)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121435495","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Class Attention Map Distillation for Efficient Semantic Segmentation
Pub Date: 2020-02-01 | DOI: 10.1109/MVIP49855.2020.9116875
Nader Karimi Bavandpour, S. Kasaei
In this paper, a novel method is proposed for capturing the information of a powerful, trained deep convolutional neural network and distilling it into a smaller network during training. This is the first time a saliency map method is employed to extract useful knowledge from a convolutional neural network for distillation. Unlike many other methods, which work on the final layers, this method can successfully extract information suitable for distillation from intermediate layers of a network by constructing class-specific attention maps and then forcing the student network to mimic them. This knowledge distillation training is implemented using the state-of-the-art DeepLab and PSPNet segmentation networks, and its effectiveness is shown by experiments on the standard Pascal VOC 2012 dataset.
{"title":"Class Attention Map Distillation for Efficient Semantic Segmentation","authors":"Nader Karimi Bavandpour, S. Kasaei","doi":"10.1109/MVIP49855.2020.9116875","DOIUrl":"https://doi.org/10.1109/MVIP49855.2020.9116875","url":null,"abstract":"In this paper, a novel method for capturing the information of a powerful and trained deep convolutional neural network and distilling it into a training smaller network is proposed. This is the first time that a saliency map method is employed to extract useful knowledge from a convolutional neural network for distillation. This method, despite of many others which work on final layers, can successfully extract suitable information for distillation from intermediate layers of a network by making class specific attention maps and then forcing the student network to mimic producing those attentions. This novel knowledge distillation training is implemented using state-of-the-art DeepLab and PSPNet segmentation networks and its effectiveness is shown by experiments on the standard Pascal Voc 2012 dataset.","PeriodicalId":255375,"journal":{"name":"2020 International Conference on Machine Vision and Image Processing (MVIP)","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124145571","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}