Multi-focus image fusion is a popular research direction in image fusion. However, because of the complexity of natural images, accurately identifying the in-focus region has long been difficult, especially along the boundary between sharp and blurred areas in complex scenes. To better determine the focused region of the source images and obtain a clear result, an improved U2-Net model is used to analyze the focused region, and a multi-scale feature extraction scheme is used to generate the decision map. The algorithm in this paper uses the NYU-D2 depth images as the training dataset. To achieve a better training effect, the Graph Cut image segmentation method is combined with manual adjustment to construct the training dataset. Experimental results show that, compared with several recent algorithms, this fusion method obtains accurate decision maps and performs better in both visual perception and objective evaluation.
{"title":"A multi-focus image fusion method based on nested U-Net","authors":"Wangping Zhou, Yuanqing Wu, Hao Wu","doi":"10.1145/3511176.3511188","DOIUrl":"https://doi.org/10.1145/3511176.3511188","url":null,"abstract":"Multi-focus image fusion is a popular research direction of image fusion, however, because of the complexity of the image, it has always been difficult in scientific research to accurately judge the clear area, especially in the clear and fuzzy edge of the complex environment. To better determine the focus area of the source image and obtain a clear image, the improved U2-Net model is used to analyze the focus area, and the multi-scale feature extraction scheme is used to generate the decision map. At the same time, the algorithm uses the NYU-D2 depth image as the training dataset in this paper. To achieve a better training effect, the method of image segmentation, Graph Cut, is combined with manual adjustment to make the training dataset. The experimental results show that comparedwith several existing latest algorithms, this fusionmethod can obtain accurate decision diagrams and has better performance in visual perception and objective evaluation.","PeriodicalId":120826,"journal":{"name":"International Conference on Video and Image Processing","volume":"222 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116174524","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sub-pixel convolutional neural networks are efficient for image super-resolution; however, the images they generate are relatively smooth. Improving the learning of high-frequency features is therefore of great significance for sub-pixel convolutional networks to achieve better performance. In this paper, we propose an improved sub-pixel convolutional neural network based on high-frequency feature learning for image super-resolution, which optimizes the traditional sub-pixel convolutional structure. First, we introduce a residual convolutional layer into the generation network; it assigns a residual factor to each sub-pixel feature map and forces each feature map to adaptively use the input information. Furthermore, a method for high-frequency feature mapping is proposed. During training, a multi-task objective combining a pixel-level loss with a high-frequency contrast loss drives the generated images closer to the target super-resolution images in the high-frequency domain. Experiments on the CelebA dataset show that the proposed method effectively improves the quality of super-resolution images compared with the traditional sub-pixel convolutional neural network.
{"title":"High-Frequency Feature Learning in Image Super-Resolution with Sub-Pixel Convolutional Neural Network","authors":"Xiao-Yuan Jiang, Xi-Hai Chen","doi":"10.1145/3376067.3376099","DOIUrl":"https://doi.org/10.1145/3376067.3376099","url":null,"abstract":"Sub-pixel convolutional neural network is efficient for image super-resolution. However, the images generated are relatively smooth. Improving the learning ability of high-frequency features is of great significance for sub-pixel convolutional neural network to get better performance. In the paper, we propose an improved algorithm of sub-pixel convolutional neural network based on high-frequency feature learning for image super-resolution, it optimizes the traditional sub-pixel convolutional structure. Firstly we introduce a residual convolutional layer in the generation net. it assigns the residual factor to each sub-pixel feature map and forces each pixel feature map to adaptively use the input information. Furthermore, a method for high frequency feature mapping is proposed. During image super-resolution training stage, the multi-task learning function, combining the pixel-level loss function with high-frequency contrast loss function, make the generation images getting closer to the target super-resolution images in high-frequency domain. The experiments on CelebA dataset show that our proposed method can effectively improve the quality of super-resolution images by contrast to the traditional sub-pixel convolutional neural network.","PeriodicalId":120826,"journal":{"name":"International Conference on Video and Image Processing","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116358734","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
For the problem of low-rank tensor completion, rank estimation plays an extremely important role. In much of the existing work, the nuclear norm is used as a surrogate for rank in the optimization because of its convexity. However, recent advances show that certain non-convex functions approximate the rank better, which can significantly improve the precision of the algorithm. At the same time, the complexity of non-convex functions leads to a much higher computational cost, especially when handling the large matrices produced by the mode-n unfolding of a tensor. This paper proposes a mixture model for tensor completion that combines the logDet function with Tucker decomposition to achieve both better precision and lower computational cost. In the implementation, the alternating direction method of multipliers (ADMM) is employed to obtain the optimal completion. Experiments on image restoration validate the effectiveness and efficiency of the method.
{"title":"An Efficient Non-convex Mixture Method for Low-rank Tensor Completion","authors":"Chengfei Shi, Li Wan, Zhengdong Huang, Tifan Xiong","doi":"10.1145/3301506.3301516","DOIUrl":"https://doi.org/10.1145/3301506.3301516","url":null,"abstract":"For the problem of low-rank tensor completion, rank estimation plays an extremely important role. And among some outstanding researches, nuclear norm is often used as a substitute of rank in the optimization due to its convex property. However, recent advances show that some non-convex functions could approximate the rank better, which can significantly improve the precision of the algorithm. While, the complexity of non-convex functions also lead to much higher computation cost, especially in handling large scale matrices from the mode-n unfolding of a tensor. This paper proposes a mixture model for tensor completion by combining logDet function with Tucker decomposition to achieve a better performance in precision and a lower cost in computation as well. In the implementation of the method, alternating direction method of multipliers (ADMM) is employed to obtain the optimal tensor completion. Experiments on image restoration are carried out to validate the effective and efficiency of the method.","PeriodicalId":120826,"journal":{"name":"International Conference on Video and Image Processing","volume":"73 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124635426","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper presents a vision-based approach to lane identification and to the estimation of service rate, arrival rate, and queue saturation. The method is based on analyzing the object trajectories produced by tracking vehicles in the video. Experiments apply the proposed method to light, moderate, and heavy traffic scenarios. Accuracy is examined by comparing the queue analysis results against the ground truth. The results show that the approach yields satisfactory results when vehicle movement stays within a lane; however, the error increases when vehicle trajectories overlap or switch lanes. In conclusion, the algorithm identifies the lane membership of trajectories under different conditions, and the proposed method could also be used to automate the estimation of traffic congestion levels at road sections covered by surveillance cameras.
{"title":"Vision-Based Analysis for Queue Characteristics and Lane Identification","authors":"C. G. V. Ya-On, Jonathan Paul C. Cempron, J. Ilao","doi":"10.1145/3447450.3447474","DOIUrl":"https://doi.org/10.1145/3447450.3447474","url":null,"abstract":"This paper presents a vision-based approach to lane identification and estimation of service rate, arrival rate, and queue saturation. The method is based on analyzing object trajectories produced. Experiments are demonstrated by applying the proposed method to different traffic scenarios: light, moderate, and heavy. The accuracy of the test is examined by comparing the queue analysis results against the ground truth. Results show that the approach is able to yield satisfactory results when the vehicle movement stays within the lane. However, the error increases when vehicle movement overlaps or switches lanes. In conclusion, the algorithm works to identify the lane membership of trajectories under different conditions. The proposed method could also be used to automate the estimation of traffic congestion levels at sections covered by surveillance cameras.","PeriodicalId":120826,"journal":{"name":"International Conference on Video and Image Processing","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125590478","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In light of the rapid progress in building smart cities and smart traffic systems, accurate, real-time vehicle counting has become an urgent need. Building a robust and accurate counting system is a challenge, because the system must detect, classify, and track multiple vehicles across complex and dynamic scenes, different vehicle models and classes, and various traffic densities. Several hardware and software systems have emerged for this purpose, with varying results. In recent years, owing to the great growth in computational capacity and deep learning techniques, deep-learning-based vehicle counting systems have delivered impressive performance at low cost. In this study, several state-of-the-art detection and tracking algorithms are studied and combined to form different models. These models are applied in automatic vehicle counting frameworks on traffic videos to assess the accuracy of their results against the ground truth. Experiments on these models expose the challenges that hinder their ability to extract distinctive object features and thus undermine their efficiency, such as occlusion, large-scale object detection, illumination, and various weather conditions. The study reveals that detectors coupled with the Deep SORT tracker, such as YOLOv4, Detectron2, and CenterNet, achieve the best results among the evaluated models.
{"title":"Vehicle Counting Using Detecting-Tracking Combinations: A Comparative Analysis","authors":"Ala Alsanabani, Mohammed A. Ahmed, Ahmad Al Smadi","doi":"10.1145/3447450.3447458","DOIUrl":"https://doi.org/10.1145/3447450.3447458","url":null,"abstract":"In light of the rapid progress in building smart cities and smart traffic systems, the need for an accurate and real-time counting vehicles system has become a very urgent need. Finding a robust and accurate counting system is a challenge, as the system must detect, classify and track multi vehicles in complex and dynamic scene situations, different models and classes, and various traffic densities. Several hardware and software systems have emerged for this purpose and their results have varied. In recent years, and due to the great growth in computational capacities and deep learning techniques, deep learning based vehicle counting systems have delivered an impressive performance at low costs. In this study, several state-of-the-art detection and tracking algorithms are studied and combined with each other to render different models. These models are applied in automatic vehicle counting frameworks in traffic videos to assess how accurate are their results against the ground truth. Experiments on these models present the existing challenges that hinder their ability to extract the distinctive object features and thus undermine their efficiency such as problems of occlusion, large scale objects detection, illumination, and various weather conditions. The study revealed that the detectors coupled with the Deep Sort tracker, such as YOLOv4, Detectron2 and CenterNet, achieved the best results compared to the rest of the models.","PeriodicalId":120826,"journal":{"name":"International Conference on Video and Image Processing","volume":"93 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127665499","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}