Oscar Alejandro Mendez Maldonado, Simon Hadfield, N. Pugeault, R. Bowden
Most 3D reconstruction approaches passively optimise over all data, exhaustively matching pairs, rather than actively selecting data to process. This is costly in terms of both time and computational resources, and quickly becomes intractable for large datasets. This work proposes an approach to intelligently filter large amounts of data for 3D reconstruction of unknown scenes using monocular cameras. Our contributions are twofold: First, we present a novel approach to efficiently optimise the Next-Best View (NBV) in terms of accuracy and coverage using partial scene geometry. Second, we extend this to intelligently selecting stereo pairs by jointly optimising the baseline and vergence to find the NBV's best stereo pair for reconstruction. Both contributions are extremely efficient, taking 0.8 ms and 0.3 ms per pose, respectively. Experimental evaluation shows that the proposed method allows efficient selection of stereo pairs for reconstruction, such that a dense model can be obtained with only a small number of images. Once a complete model has been obtained, the remaining computational budget is used to intelligently refine areas of uncertainty, achieving results comparable to state-of-the-art batch approaches on the Middlebury dataset while using as little as 3.8% of the views.
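To make the pair-selection idea concrete, the sketch below scores a candidate stereo pair against a partial point cloud by rewarding large triangulation angles (a proxy for depth accuracy) over the points both cameras can plausibly see (a proxy for coverage), and brute-forces the best pair from a small candidate set. This is a minimal illustration assuming a simple visibility test and a centroid-vergence heuristic; the function and parameter names are ours, and the paper's actual NBV objective and optimisation differ.

```python
import numpy as np

def pair_score(c1, c2, points, max_view_angle_deg=60.0):
    """Toy score for a candidate stereo pair (camera centres c1, c2) against a
    partial point cloud: rewards large triangulation angles (accuracy) over the
    points both cameras can plausibly see (coverage). Illustrative stand-in,
    not the paper's actual objective."""
    centroid = points.mean(axis=0)
    look1 = (centroid - c1) / np.linalg.norm(centroid - c1)  # assume both cameras verge on the centroid
    look2 = (centroid - c2) / np.linalg.norm(centroid - c2)
    cos_max = np.cos(np.radians(max_view_angle_deg))
    score = 0.0
    for p in points:
        r1 = (p - c1) / np.linalg.norm(p - c1)
        r2 = (p - c2) / np.linalg.norm(p - c2)
        if (r1 @ look1 > cos_max) and (r2 @ look2 > cos_max):   # crude visibility test
            parallax = np.arccos(np.clip(r1 @ r2, -1.0, 1.0))   # triangulation angle at p
            score += np.sin(parallax)                           # larger angle -> better depth accuracy
    return score

# pick the best pair from a set of candidate camera centres (brute force for illustration)
rng = np.random.default_rng(0)
points = rng.normal(size=(200, 3))
cands = [np.array([np.cos(a), np.sin(a), 0.0]) * 5.0 for a in np.linspace(0.0, np.pi, 8)]
best = max(((i, j) for i in range(len(cands)) for j in range(i + 1, len(cands))),
           key=lambda ij: pair_score(cands[ij[0]], cands[ij[1]], points))
print("best candidate pair:", best)
```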
{"title":"Next-Best Stereo: Extending Next-Best View Optimisation For Collaborative Sensors","authors":"Oscar Alejandro Mendez Maldonado, Simon Hadfield, N. Pugeault, R. Bowden","doi":"10.5244/C.30.65","DOIUrl":"https://doi.org/10.5244/C.30.65","url":null,"abstract":"Most 3D reconstruction approaches passively optimise over all data, exhaustively matching pairs, rather than actively selecting data to process. This is costly both in terms of time and computer resources, and quickly becomes intractable for large datasets. This work proposes an approach to intelligently filter large amounts of data for 3D reconstructions of unknown scenes using monocular cameras. Our contributions are twofold: First, we present a novel approach to efficiently optimise the Next-Best View ( NBV ) in terms of accuracy and coverage using partial scene geometry. Second, we extend this to intelligently selecting stereo pairs by jointly optimising the baseline and vergence to find the NBV ’s best stereo pair to perform reconstruction. Both contributions are extremely efficient, taking 0.8ms and 0.3ms per pose, respectively. Experimental evaluation shows that the proposed method allows efficient selection of stereo pairs for reconstruction, such that a dense model can be obtained with only a small number of images. Once a complete model has been obtained, the remaining computational budget is used to intelligently refine areas of uncertainty, achieving results comparable to state-of-the-art batch approaches on the Middlebury dataset, using as little as 3.8% of the views.","PeriodicalId":125761,"journal":{"name":"Procedings of the British Machine Vision Conference 2016","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130817690","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deep neural networks with millions of parameters are at the heart of many state-of-the-art machine learning models today. However, recent works have shown that models with far fewer parameters can perform just as well. In this work, we introduce the problem of architecture learning, i.e., learning the architecture of a neural network along with its weights. We introduce a new trainable parameter called the tri-state ReLU, which helps eliminate unnecessary neurons. We also propose a smooth regularizer which encourages the total number of neurons remaining after elimination to be small. The resulting objective is differentiable and simple to optimize. We experimentally validate our method on both small and large networks, and show that it can learn models with a considerably smaller number of parameters without affecting prediction accuracy.
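The gist of the approach, pruning units through an extra trainable parameter plus a regularizer on the number of surviving neurons, can be sketched as below. This uses a generic per-neuron gate rather than the paper's exact tri-state ReLU parameterisation; the class, layer sizes, and regularizer weights are illustrative assumptions.

```python
import torch
import torch.nn as nn

class GatedReLU(nn.Module):
    """Loose sketch of architecture learning via per-neuron gates.
    Not the paper's exact tri-state ReLU: each unit gets a trainable gate
    g in [0, 1]; g -> 0 means the unit can be pruned after training."""
    def __init__(self, num_units):
        super().__init__()
        self.gate = nn.Parameter(torch.full((num_units,), 0.9))  # start mostly "on"

    def forward(self, x):
        g = torch.clamp(self.gate, 0.0, 1.0)
        return g * torch.relu(x)

    def regulariser(self, lam_binary=1e-3, lam_count=1e-3):
        g = torch.clamp(self.gate, 0.0, 1.0)
        # push gates towards {0, 1} and keep the number of "on" units small
        return lam_binary * (g * (1.0 - g)).sum() + lam_count * g.sum()

# usage: add the regulariser of every gated layer to the task loss
layer = nn.Linear(784, 256)
act = GatedReLU(256)
x = torch.randn(32, 784)
out = act(layer(x))
loss = out.pow(2).mean() + act.regulariser()   # stand-in task loss
loss.backward()
print("active units:", int((act.gate.detach() > 0.5).sum()))
```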
{"title":"Learning Neural Network Architectures using Backpropagation","authors":"Suraj Srinivas, R. Venkatesh Babu","doi":"10.5244/C.30.104","DOIUrl":"https://doi.org/10.5244/C.30.104","url":null,"abstract":"Deep neural networks with millions of parameters are at the heart of many state of the art machine learning models today. However, recent works have shown that models with much smaller number of parameters can also perform just as well. In this work, we introduce the problem of architecture-learning, i.e; learning the architecture of a neural network along with weights. We introduce a new trainable parameter called tri-state ReLU, which helps in eliminating unnecessary neurons. We also propose a smooth regularizer which encourages the total number of neurons after elimination to be small. The resulting objective is differentiable and simple to optimize. We experimentally validate our method on both small and large networks, and show that it can learn models with a considerably small number of parameters without affecting prediction accuracy.","PeriodicalId":125761,"journal":{"name":"Procedings of the British Machine Vision Conference 2016","volume":"449 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131799223","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We propose Multi-H, an efficient method for recovering the tangent planes of a set of point correspondences satisfying the epipolar constraint. The problem is formulated as the search for a labeling that minimizes an energy comprising a data term and a spatial regularization term. The number of planes is controlled by a combination of MeanShift [6] and α-expansion [3]. Experiments on the fountain-P11 3D dataset show that Multi-H provides highly accurate tangent plane estimates. It also outperforms all state-of-the-art techniques for multi-homography estimation on the publicly available AdelaideRMF dataset. Since Multi-H achieves nearly error-free performance on it, we introduce and make public a more challenging dataset for multi-plane fitting evaluation.
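As a rough illustration of the kind of energy being minimised, the sketch below evaluates a labeling by a per-correspondence data term (transfer error under the homography of the assigned plane) plus a Potts smoothness term over a neighbourhood graph. The exact Multi-H energy, the MeanShift-based control of the number of planes, and the α-expansion optimisation are not reproduced here; the function names and the weight lam are illustrative.

```python
import numpy as np

def transfer_error(H, x1, x2):
    """One-way transfer error ||x2 - H x1|| in pixels
    (2D points; homogeneous coordinates added here)."""
    p = H @ np.append(x1, 1.0)
    return np.linalg.norm(p[:2] / p[2] - x2)

def labeling_energy(labels, homographies, pts1, pts2, neighbours, lam=1.0):
    """Energy of one labeling: per-point data term (fit to the assigned plane's
    homography) plus a Potts smoothness term over a neighbourhood graph.
    Illustrative only; Multi-H minimises a related energy with alpha-expansion
    and controls the number of planes with MeanShift."""
    data = sum(transfer_error(homographies[labels[i]], pts1[i], pts2[i])
               for i in range(len(labels)))
    smooth = sum(lam for (i, j) in neighbours if labels[i] != labels[j])
    return data + smooth
```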
{"title":"Multi-H: Efficient recovery of tangent planes in stereo images","authors":"D. Baráth, Jiri Matas, Levente Hajder","doi":"10.5244/C.30.13","DOIUrl":"https://doi.org/10.5244/C.30.13","url":null,"abstract":"Multi-H – an efficient method for the recovery of the tangent planes of a set of point correspondences satisfying the epipolar constraint is proposed. The problem is formulated as a search for a labeling minimizing an energy that includes a data and spatial regularization terms. The number of planes is controlled by a combination of MeanShift [6] and α-expansion [3]. Experiments on the fountain-P11 3D dataset show that Multi-H provides highly accurate tangent plane estimates. It also outperforms all state-of-the-art techniques for multihomography estimation on the publicly available AdelaideRMF dataset. Since Multi-H achieves nearly error-free performance, we introduce and make public a more challenging dataset for multi-plane fitting evaluation.","PeriodicalId":125761,"journal":{"name":"Procedings of the British Machine Vision Conference 2016","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134261174","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Giorgos Karvounas, I. Oikonomidis, Antonis A. Argyros
Periodicity detection is a problem that has received a lot of attention, and several important tools exist to analyse purely periodic signals. However, in many real-world scenarios (time series, videos of human activities, etc.), periodic signals appear in the context of non-periodic ones. In this work we propose a method that, given a time series representing a periodic signal with a non-periodic prefix and tail, estimates the start, the end and the period of the periodic part of the signal. We formulate this as an optimization problem that is solved using evolutionary optimization techniques. Quantitative experiments on synthetic data demonstrate that the proposed method successfully localizes the periodic part of a signal and is robust to noisy measurements, even when the periodic part is short compared to its non-periodic prefix and tail. We also provide quantitative and qualitative results obtained by applying the proposed method to the unsupervised localization and segmentation of periodic activities in real-world videos.
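One way to make this concrete is to score a hypothesised triple (start, end, period) by how well the segment matches itself shifted by one period, and to minimise that cost with an evolutionary optimiser. The sketch below does this with SciPy's differential evolution; the cost function and its penalties are our illustrative assumptions rather than the paper's exact objective.

```python
import numpy as np
from scipy.optimize import differential_evolution

def periodicity_cost(params, signal):
    """Cost of a hypothesised periodic segment [s, e) with period p: mean
    absolute difference between the segment and itself shifted by one period,
    plus mild penalties preferring long segments and short periods.
    One plausible objective; the paper's formulation may differ."""
    s, e, p = (int(round(v)) for v in params)
    if p < 2 or e - s < 2 * p or s < 0 or e > len(signal):
        return 1e9                                   # infeasible hypothesis
    seg = signal[s:e]
    mismatch = np.mean(np.abs(seg[p:] - seg[:-p]))
    return mismatch + 1.0 / (e - s) + 1e-3 * p       # tie-break against period multiples

# synthetic example: noise, then a sine with period ~25 samples, then noise
rng = np.random.default_rng(1)
signal = np.concatenate([rng.normal(0.0, 1.0, 150),
                         np.sin(2 * np.pi * np.arange(500) / 25.0),
                         rng.normal(0.0, 1.0, 100)])
bounds = [(0, len(signal)), (0, len(signal)), (2, 200)]   # start, end, period
res = differential_evolution(periodicity_cost, bounds, args=(signal,), seed=0)
print("estimated (start, end, period):", [int(round(v)) for v in res.x])
```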
{"title":"Localizing Periodicity in Time Series and Videos","authors":"Giorgos Karvounas, I. Oikonomidis, Antonis A. Argyros","doi":"10.5244/C.30.47","DOIUrl":"https://doi.org/10.5244/C.30.47","url":null,"abstract":"Periodicity detection is a problem that has received a lot of attention, thus several important tools exist to analyse purely periodic signals. However, in many real world scenarios (time series, videos of human activities, etc) periodic signals appear in the context of non-periodic ones. In this work we propose a method that, given a time series representing a periodic signal that has a non-periodic prefix and tail, estimates the start, the end and the period of the periodic part of the signal. We formulate this as an optimization problem that is solved based on evolutionary optimization techniques. Quantitative experiments on synthetic data demonstrate that the proposed method is successful in localizing the periodic part of a signal and exhibits robustness in the presence of noisy measurements. Also, it does so even when the periodic part of the signal is too short compared to its non-periodic prefix and tail. We also provide quantitative and qualitative results obtained from the application of the proposed method to the problem of unsupervised localization and segmentation of periodic activities in real world videos.","PeriodicalId":125761,"journal":{"name":"Procedings of the British Machine Vision Conference 2016","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115066343","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M. Kandemir, Manuel Haussmann, Ferran Diego, K. Rajamani, J. Laak, F. Hamprecht
We introduce the first model to perform weakly supervised learning with Gaussian processes on up to millions of instances. The key ingredient to achieving this scalability is to replace the standard multiple-instance learning (MIL) assumption that the bag-level prediction is the maximum of the instance-level estimates with the accumulated evidence of the instances within a bag. This enables us to devise a novel variational inference scheme that operates solely by closed-form updates. Keeping all its parameters but one fixed, our model updates the remaining parameter to its global optimum, which leads to remarkably fast convergence and makes the model well suited to large-scale learning setups. Our model performs significantly better in two medical applications than an adaptation of GPMIL to scalable inference and than various scalable MIL algorithms. It also proves highly competitive in object classification against state-of-the-art adaptations of deep learning to weakly supervised learning.
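The change of bag-level assumption can be illustrated with plain instance scores, leaving the Gaussian process and the variational updates aside: under the standard MIL assumption the bag score is the maximum instance score, whereas the accumulated-evidence assumption sums instance evidence before squashing it. The sigmoid link used below is an illustrative choice, not necessarily the paper's.

```python
import numpy as np

def bag_score_max(instance_scores):
    """Standard MIL assumption: a bag is positive if its most positive
    instance is positive (bag score = max of instance scores)."""
    return np.max(instance_scores)

def bag_score_accumulated(instance_scores):
    """Accumulated-evidence assumption (sketch): bag-level evidence is the sum
    of instance-level evidence, squashed to a probability. The exact link
    function in the paper may differ."""
    return 1.0 / (1.0 + np.exp(-np.sum(instance_scores)))

# a bag of many mildly positive instances: modest max score, strong accumulated evidence
weak_positives = np.full(50, 0.2)
print(bag_score_max(weak_positives), bag_score_accumulated(weak_positives))
```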
{"title":"Variational Weakly Supervised Gaussian Processes","authors":"M. Kandemir, Manuel Haussmann, Ferran Diego, K. Rajamani, J. Laak, F. Hamprecht","doi":"10.5244/C.30.71","DOIUrl":"https://doi.org/10.5244/C.30.71","url":null,"abstract":"We introduce the first model to perform weakly supervised learning with Gaussian processes on up to millions of instances. The key ingredient to achieve this scalability is to replace the standard assumption of MIL that the bag-level prediction is the maximum of instance-level estimates with the accumulated evidence of instances within a bag. This enables us to devise a novel variational inference scheme that operates solely by closedform updates. Keeping all its parameters but one fixed, our model updates the remaining parameter to the global optimum. This virtue leads to charmingly fast convergence, fitting perfectly to large-scale learning setups. Our model performs significantly better in two medical applications than adaptation of GPMIL to scalable inference and various scalable MIL algorithms. It also proves to be very competitive in object classification against state-of-the-art adaptations of deep learning to weakly supervised learning.","PeriodicalId":125761,"journal":{"name":"Procedings of the British Machine Vision Conference 2016","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133244300","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We consider the problem of weakly supervised object localization. Given a collection of images with image-level annotations indicating the presence or absence of an object, our goal is to localize the object in each image. We propose a neural network architecture, the attention network, for this problem. Given a set of candidate regions in an image, the attention network first computes an attention score for each candidate region. These candidate regions are then combined, weighted by their attention scores, to form a whole-image feature vector that is used to classify the image. Object localization is achieved implicitly via the attention scores on the candidate regions. We demonstrate that our approach achieves superior performance on several benchmark datasets.
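A minimal sketch of this architecture is given below: a scoring head produces one attention score per candidate region, a softmax turns the scores into weights, the region features are pooled with those weights, and a linear classifier consumes the pooled vector. The feature dimension, number of classes, and single-layer scoring head are illustrative assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RegionAttentionClassifier(nn.Module):
    """Sketch of the attention-over-regions idea: score each candidate region,
    softmax the scores into attention weights, pool region features with those
    weights, and classify the pooled vector. Layer sizes and the scoring head
    are illustrative, not the paper's."""
    def __init__(self, feat_dim=512, num_classes=20):
        super().__init__()
        self.score = nn.Linear(feat_dim, 1)           # attention score per region
        self.classify = nn.Linear(feat_dim, num_classes)

    def forward(self, region_feats):                  # (num_regions, feat_dim)
        attn = F.softmax(self.score(region_feats).squeeze(-1), dim=0)
        image_feat = attn @ region_feats              # attention-weighted pooling
        return self.classify(image_feat), attn        # attn implicitly localises the object

model = RegionAttentionClassifier()
regions = torch.randn(30, 512)                        # e.g. proposal features from a CNN
logits, attention = model(regions)
print("most attended region:", int(attention.argmax()))
```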
{"title":"Attention Networks for Weakly Supervised Object Localization","authors":"Eu Wern Teh, Mrigank Rochan, Yang Wang","doi":"10.5244/C.30.52","DOIUrl":"https://doi.org/10.5244/C.30.52","url":null,"abstract":"We consider the problem of weakly supervised learning for object localization. Given a collection of images with image-level annotations indicating the presence/absence of an object, our goal is to localize the object in each image. We propose a neural network architecture called the attention network for this problem. Given a set of candidate regions in an image, the attention network first computes an attention score on each candidate region in the image. Then these candidate regions are combined together with their attention scores to form a whole-image feature vector. This feature vector is used for classifying the image. The object localization is implicitly achieved via the attention scores on candidate regions. We demonstrate that our approach achieves superior performance on several benchmark datasets.","PeriodicalId":125761,"journal":{"name":"Procedings of the British Machine Vision Conference 2016","volume":"109 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121291423","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}