S. Miyata, H. Saito, Kosuke Takahashi, Dan Mikami, Mariko Isogawa, H. Kimata
This paper proposes a method for reconstructing 3D ball trajectories using multiple temporally and geometrically uncalibrated cameras. To measure the trajectory of a fast-moving object with cameras, such as a ball thrown by a pitcher, the cameras must be temporally synchronized and their positions and orientations must be calibrated. In some cases these conditions cannot be met; for example, one cannot geometrically calibrate cameras when one cannot step into a baseball stadium. The basic idea of the proposed method is to use the ball captured by multiple cameras as a corresponding point. The method first detects the ball, then estimates the temporal difference between the cameras, and finally uses the ball positions as corresponding points for geometrically calibrating the cameras. Experiments using actual pitching videos verify the effectiveness of our method.
{"title":"Ball 3D Trajectory Reconstruction without Preliminary Temporal and Geometrical Camera Calibration","authors":"S. Miyata, H. Saito, Kosuke Takahashi, Dan Mikami, Mariko Isogawa, H. Kimata","doi":"10.1109/CVPRW.2017.26","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.26","url":null,"abstract":"This paper proposes a method for reconstructing 3D ball trajectories by using multiple temporally and geometrically uncalibrated cameras. To use cameras to measure the trajectory of a fast-moving object, such as a ball thrown by a pitcher, the cameras must be temporally synchronized and their position and orientation should be calibrated. In some cases, these conditions cannot be met, e.g., one cannot geometrically calibrate cameras when one cannot step into a baseball stadium. The basic idea of the proposed method is to use a ball captured by multiple cameras as a corresponding point. The method first detects a ball. Then, it estimates temporal difference between cameras. After that, the ball positions are used as corresponding points for geometrically calibrating the cameras. Experiments using actual pitching videos verify the effectiveness of our method.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"1 1","pages":"164-169"},"PeriodicalIF":0.0,"publicationDate":"2017-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81705468","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
S. Zafeiriou, D. Kollias, M. Nicolaou, A. Papaioannou, Guoying Zhao, I. Kotsia
The Affect-in-the-Wild (Aff-Wild) Challenge proposes a new comprehensive benchmark for assessing the performance of facial affect/behaviour analysis and understanding 'in-the-wild'. The Aff-Wild benchmark contains about 300 videos (over 2,000 minutes of data) annotated with regard to valence and arousal, all captured 'in-the-wild' (the main source being YouTube videos). The paper presents the database description, the experimental setup, the baseline method used for the Challenge, and finally a summary of the performance of the different methods submitted to the Affect-in-the-Wild Challenge for valence and arousal estimation. The challenge demonstrates that meticulously designed deep neural networks can achieve very good performance when trained with in-the-wild data.
{"title":"Aff-Wild: Valence and Arousal ‘In-the-Wild’ Challenge","authors":"S. Zafeiriou, D. Kollias, M. Nicolaou, A. Papaioannou, Guoying Zhao, I. Kotsia","doi":"10.1109/CVPRW.2017.248","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.248","url":null,"abstract":"The Affect-in-the-Wild (Aff-Wild) Challenge proposes a new comprehensive benchmark for assessing the performance of facial affect/behaviour analysis/understanding 'in-the-wild'. The Aff-wild benchmark contains about 300 videos (over 2,000 minutes of data) annotated with regards to valence and arousal, all captured 'in-the-wild' (the main source being Youtube videos). The paper presents the database description, the experimental set up, the baseline method used for the Challenge and finally the summary of the performance of the different methods submitted to the Affect-in-the-Wild Challenge for Valence and Arousal estimation. The challenge demonstrates that meticulously designed deep neural networks can achieve very good performance when trained with in-the-wild data.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"120 1","pages":"1980-1987"},"PeriodicalIF":0.0,"publicationDate":"2017-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85705189","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
O. Johannsen, Katrin Honauer, Bastian Goldlücke, A. Alperovich, F. Battisti, Yunsu Bok, Michele Brizzi, M. Carli, Gyeongmin Choe, M. Diebold, M. Gutsche, Hae-Gon Jeon, In-So Kweon, Jaesik Park, Jinsun Park, H. Schilling, Hao Sheng, Lipeng Si, Michael Strecke, Antonin Sulc, Yu-Wing Tai, Qing Wang, Tingxian Wang, S. Wanner, Z. Xiong, Jingyi Yu, Shuo Zhang, Hao Zhu
This paper presents the results of the depth estimation challenge for dense light fields, which took place at the second workshop on Light Fields for Computer Vision (LF4CV) in conjunction with CVPR 2017. The challenge consisted of submissions to a recent benchmark [7], which allows a thorough performance analysis. While individual results are readily available on the benchmark web page http://www.lightfield-analysis.net, we take this opportunity to give a detailed overview of the current participants. Based on the algorithms submitted to our challenge, we develop a taxonomy of light field disparity estimation algorithms and report on the current state of the art. In addition, we include more comparative metrics and discuss the relative strengths and weaknesses of the algorithms. Thus, we obtain a snapshot of where light field algorithm development stands at the moment and identify aspects with potential for further improvement.
{"title":"A Taxonomy and Evaluation of Dense Light Field Depth Estimation Algorithms","authors":"O. Johannsen, Katrin Honauer, Bastian Goldlücke, A. Alperovich, F. Battisti, Yunsu Bok, Michele Brizzi, M. Carli, Gyeongmin Choe, M. Diebold, M. Gutsche, Hae-Gon Jeon, In-So Kweon, Jaesik Park, Jinsun Park, H. Schilling, Hao Sheng, Lipeng Si, Michael Strecke, Antonin Sulc, Yu-Wing Tai, Qing Wang, Tingxian Wang, S. Wanner, Z. Xiong, Jingyi Yu, Shuo Zhang, Hao Zhu","doi":"10.1109/CVPRW.2017.226","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.226","url":null,"abstract":"This paper presents the results of the depth estimation challenge for dense light fields, which took place at the second workshop on Light Fields for Computer Vision (LF4CV) in conjunction with CVPR 2017. The challenge consisted of submission to a recent benchmark [7], which allows a thorough performance analysis. While individual results are readily available on the benchmark web page http://www.lightfield-analysis.net, we take this opportunity to give a detailed overview of the current participants. Based on the algorithms submitted to our challenge, we develop a taxonomy of light field disparity estimation algorithms and give a report on the current state-ofthe- art. In addition, we include more comparative metrics, and discuss the relative strengths and weaknesses of the algorithms. Thus, we obtain a snapshot of where light field algorithm development stands at the moment and identify aspects with potential for further improvement.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"70 1","pages":"1795-1812"},"PeriodicalIF":0.0,"publicationDate":"2017-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73890287","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Haiying Guan, Paul Lee, A. Dienstfrey, M. Theofanos, C. Lamp, Brian C. Stanton, Matthew T. Schwarz
Latent fingerprints obtained from crime scenes are rarely immediately suitable for identification purposes. Instead, most latent fingerprint images must be preprocessed to enhance the fingerprint information held within the digital image while suppressing interference arising from noise and otherwise unwanted image features. In the following, we present results of our ongoing research to assess this critical step in the forensic workflow. Previously, we discussed the creation of a new database of latent fingerprint images to support such research. The new contributions of this paper are twofold. First, we implement a study in which a group of trained Latent Print Examiners provide Extended Feature Set markups of all images; we discuss the experimental design of this study and its execution. Next, we propose metrics for measuring the increase in fingerprint information provided by latent fingerprint image preprocessing, and we present a preliminary analysis of these metrics when applied to the images in our database. We consider formally defined quality scales (Good, Bad, Ugly) and minutiae identifications of latent fingerprint images before and after preprocessing. All analyses show that latent fingerprint image preprocessing results in a statistically significant increase in fingerprint information and quality.
{"title":"Analysis, Comparison, and Assessment of Latent Fingerprint Image Preprocessing","authors":"Haiying Guan, Paul Lee, A. Dienstfrey, M. Theofanos, C. Lamp, Brian C. Stanton, Matthew T. Schwarz","doi":"10.1109/CVPRW.2017.91","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.91","url":null,"abstract":"Latent fingerprints obtained from crime scenes are rarely immediately suitable for identification purposes. Instead, most latent fingerprint images must be preprocessed to enhance the fingerprint information held within the digital image, while suppressing interference arising from noise and otherwise unwanted image features. In the following we present results of our ongoing research to assess this critical step in the forensic workflow. Previously we discussed the creation of a new database of latent fingerprint images to support such research. The new contributions of this paper are twofold. First, we implement a study in which a group of trained Latent Print Examiners provide Extended Feature Set markups of all images. We discuss the experimental design of this study, and its execution. Next, we propose metrics for measuring the increase of fingerprint information provided by latent fingerprint image preprocessing, and we present preliminary analysis of these metrics when applied to the images in our database. We consider formally defined quality scales (Good, Bad, Ugly), and minutiae identifications of latent fingerprint images before and after preprocessing. All analyses show that latent fingerprint image preprocessing results in a statistically significant increase in fingerprint information and quality.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"196 1","pages":"628-635"},"PeriodicalIF":0.0,"publicationDate":"2017-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77436937","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Rectified linear units (ReLU) are known to be effective in many deep learning methods. Inspired by the linear-mapping technique used in other super-resolution (SR) methods, we reinterpret ReLU as the point-wise multiplication of an identity mapping and a switch, and present a novel nonlinear unit called a selection unit (SU). While conventional ReLU offers no direct control over which data is passed, the proposed SU optimizes this on-off switching control and is therefore capable of handling nonlinearity more flexibly than ReLU. Our proposed deep network with SUs, called SelNet, ranked fifth in the NTIRE 2017 Challenge while having much lower computational complexity than the top four entries. Further experimental results show that the proposed SelNet outperforms our ReLU-only baseline (without SUs) as well as other state-of-the-art deep-learning-based SR methods.
{"title":"A Deep Convolutional Neural Network with Selection Units for Super-Resolution","authors":"Jae-Seok Choi, Munchurl Kim","doi":"10.1109/CVPRW.2017.153","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.153","url":null,"abstract":"Rectified linear units (ReLU) are known to be effective in many deep learning methods. Inspired by linear-mapping technique used in other super-resolution (SR) methods, we reinterpret ReLU into point-wise multiplication of an identity mapping and a switch, and finally present a novel nonlinear unit, called a selection unit (SU). While conventional ReLU has no direct control through which data is passed, the proposed SU optimizes this on-off switching control, and is therefore capable of better handling nonlinearity functionality than ReLU in a more flexible way. Our proposed deep network with SUs, called SelNet, was top-5th ranked in NTIRE2017 Challenge, which has a much lower computation complexity compared to the top-4 entries. Further experiment results show that our proposed SelNet outperforms our baseline only with ReLU (without SUs), and other state-of-the-art deep-learning-based SR methods.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"37 4 1","pages":"1150-1156"},"PeriodicalIF":0.0,"publicationDate":"2017-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80293065","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
E. Perot, M. Jaritz, Marin Toromanoff, Raoul de Charette
We address the problem of autonomous race car driving. Using a recent rally game (WRC6) with realistic physics and graphics, we train an Asynchronous Advantage Actor-Critic (A3C) agent in an end-to-end fashion and propose an improved reward function to learn faster. The network is trained simultaneously on three very different tracks (snow, mountain, and coast) with varied road structures, graphics, and physics. Despite the more complex environments, the trained agent learns significant features and exhibits good performance while driving more stably than existing end-to-end approaches.
{"title":"End-to-End Driving in a Realistic Racing Game with Deep Reinforcement Learning","authors":"E. Perot, M. Jaritz, Marin Toromanoff, Raoul de Charette","doi":"10.1109/CVPRW.2017.64","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.64","url":null,"abstract":"We address the problem of autonomous race car driving. Using a recent rally game (WRC6) with realistic physics and graphics we train an Asynchronous Actor Critic (A3C) in an end-to-end fashion and propose an improved reward function to learn faster. The network is trained simultaneously on three very different tracks (snow, mountain, and coast) with various road structures, graphics and physics. Despite the more complex environments the trained agent learns significant features and exhibits good performance while driving in a more stable way than existing end-to-end approaches.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"103 1","pages":"474-475"},"PeriodicalIF":0.0,"publicationDate":"2017-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89931752","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automatic infrared target recognition (ATR) is a traditionally unsolved problem in military applications because of the wide range of infrared (IR) image variations and the limited number of training images, which are caused by various 3D target poses, non-cooperative weather conditions, and difficult target acquisition environments. Recently, deep convolutional neural network-based approaches for RGB images (RGB-CNNs) have shown breakthrough performance in computer vision problems such as object detection and classification. Directly applying an RGB-CNN to the IR ATR problem fails because of these IR database problems. This paper presents a novel infrared variation-optimized deep convolutional neural network (IVO-CNN) that addresses database management, namely enlarging the database with a thermal simulator, controlling image contrast automatically, and suppressing thermal noise, in order to reduce the effects of infrared image variations in CNN-based automatic ground target recognition. Experimental results on synthesized infrared images generated by the thermal simulator (OKTAL-SE) validate the feasibility of IVO-CNN for military ATR applications.
{"title":"Infrared Variation Optimized Deep Convolutional Neural Network for Robust Automatic Ground Target Recognition","authors":"Sungho Kim, Woo‐Jin Song, Sohyeon Kim","doi":"10.1109/CVPRW.2017.30","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.30","url":null,"abstract":"Automatic infrared target recognition (ATR) is a traditionally unsolved problem in military applications because of the wide range of infrared (IR) image variations and limited number of training images, which is caused by various 3D target poses, non-cooperative weather conditions, and difficult target acquisition environments. Recently, deep convolutional neural network-based approaches in RGB images (RGB-CNN) showed breakthrough performance in computer vision problems, such as object detection and classification. The direct use of the RGB-CNN to IR ATR problem fails to work because of the IR database problems. This paper presents a novel infrared variation-optimized deep convolutional neural network (IVO-CNN) by considering database management, such as increasing the database by a thermal simulator, controlling the image contrast automatically and suppressing the thermal noise to reduce the effects of infrared image variations in deep convolutional neural network-based automatic ground target recognition. The experimental results on the synthesized infrared images generated by the thermal simulator (OKTAL-SE) validated the feasibility of IVO-CNN for military ATR applications.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"15 1","pages":"195-202"},"PeriodicalIF":0.0,"publicationDate":"2017-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87390410","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hrushikesh Garud, S. Karri, D. Sheet, J. Chatterjee, M. Mahadevappa, A. Ray, Arindam Ghosh, A. Maity
Fine needle aspiration cytology is commonly used for the diagnosis of breast cancer, with traditional practice based on subjective visual assessment of breast cytopathology cell samples under a microscope to evaluate the state of various cytological features. There are therefore many challenges in maintaining consistency and reproducibility of findings. Digital imaging and computational aids to diagnosis can improve diagnostic accuracy and reduce the effective workload of pathologists. This paper presents a deep convolutional neural network (CNN) based classification approach for the diagnosis of cell samples using their microscopic high-magnification multi-views. The proposed approach has been tested using the GoogLeNet CNN architecture on an image dataset of 37 breast cytopathology samples (24 benign and 13 malignant), where the network was trained on images from ~54% of the cell samples and tested on the rest, achieving 89.7% mean accuracy in 8-fold validation.
{"title":"High-Magnification Multi-views Based Classification of Breast Fine Needle Aspiration Cytology Cell Samples Using Fusion of Decisions from Deep Convolutional Networks","authors":"Hrushikesh Garud, S. Karri, D. Sheet, J. Chatterjee, M. Mahadevappa, A. Ray, Arindam Ghosh, A. Maity","doi":"10.1109/CVPRW.2017.115","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.115","url":null,"abstract":"Fine needle aspiration cytology is commonly used for diagnosis of breast cancer, with traditional practice being based on the subjective visual assessment of the breast cytopathology cell samples under a microscope to evaluate the state of various cytological features. Therefore, there are many challenges in maintaining consistency and reproducibility of findings. However, digital imaging and computational aid in diagnosis can improve the diagnostic accuracy and reduce the effective workload of pathologists. This paper presents a deep convolutional neural network (CNN) based classification approach for the diagnosis of the cell samples using their microscopic high-magnification multi-views. The proposed approach has been tested using GoogLeNet architecture of CNN on an image dataset of 37 breast cytopathology samples (24 benign and 13 malignant), where the network was trained using images of ~54% cell samples and tested on the rest, achieving 89.7% mean accuracy in 8 fold validation.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"7 1","pages":"828-833"},"PeriodicalIF":0.0,"publicationDate":"2017-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88358026","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We present an anomaly detection system based on an autonomous robot performing a patrol task. Using a generative adversarial network (GAN), we compare the robot's current view with a learned model of normality. Our preliminary experimental results show that the approach is well suited for anomaly detection, providing efficient results with a low false positive rate.
{"title":"Finding Anomalies with Generative Adversarial Networks for a Patrolbot","authors":"W. Lawson, Esube Bekele, Keith Sullivan","doi":"10.1109/CVPRW.2017.68","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.68","url":null,"abstract":"We present an anomaly detection system based on an autonomous robot performing a patrol task. Using a generative adversarial network (GAN), we compare the robot's current view with a learned model of normality. Our preliminary experimental results show that the approach is well suited for anomaly detection, providing efficient results with a low false positive rate.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"1 1","pages":"484-485"},"PeriodicalIF":0.0,"publicationDate":"2017-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90439113","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Takahiro Itazuri, Tsukasa Fukusato, Shugo Yamaguchi, S. Morishima
In this paper, we propose a video summarization system for volleyball videos. Our system automatically detects rally scenes as self-contained video segments and evaluates a rally-rank for each rally scene to decide its priority. Features representing the contents of the game are necessary for this priority decision; however, such features have not been considered in most previous methods. Although visual features such as the positions of the ball and players would ideally be used, extracting such features remains non-robust and unreliable in low-resolution or low-frame-rate volleyball videos. Instead, we utilize the court transition information caused by camera operation. Experimental results demonstrate the robustness of our rally scene detection and the effectiveness of our rally-rank in reflecting viewers' preferences better than previous methods.
{"title":"Court-Based Volleyball Video Summarization Focusing on Rally Scene","authors":"Takahiro Itazuri, Tsukasa Fukusato, Shugo Yamaguchi, S. Morishima","doi":"10.1109/CVPRW.2017.28","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.28","url":null,"abstract":"In this paper, we propose a video summarization system for volleyball videos. Our system automatically detects rally scenes as self-consumable video segments and evaluates rally-rank for each rally scene to decide priority. In the priority decision, features representing the contents of the game are necessary; however such features have not been considered in most previous methods. Although several visual features such as the position of a ball and players should be used, acquisition of such features is still non-robust and unreliable in low resolution or low frame rate volleyball videos. Instead, we utilize the court transition information caused by camera operation. Experimental results demonstrate the robustness of our rally scene detection and the effectiveness of our rally-rank to reflect viewers' preferences over previous methods.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"70 1","pages":"179-186"},"PeriodicalIF":0.0,"publicationDate":"2017-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72979838","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}