Foundations and Trends in Computer Graphics and Vision: Latest Articles

Crowdsourcing in Computer Vision
IF 36.5 Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date : 2016-11-07 DOI: 10.1561/0600000073
Adriana Kovashka, Olga Russakovsky, Li Fei-Fei, K. Grauman
Computer vision systems require large amounts of manually annotated data to properly learn challenging visual concepts. Crowdsourcing platforms offer an inexpensive method to capture human knowledge and understanding, for a vast number of visual perception tasks. Crowdsourcing in Computer Vision describes the types of annotations computer vision researchers have collected using crowdsourcing, and how they have ensured that this data is of high quality while annotation effort is minimized. It begins by discussing data collection on both classic vision tasks, such as object recognition, and recent vision tasks, such as visual story-telling. It then summarizes key design decisions for creating effective data collection interfaces and workflows, and presents strategies for intelligently selecting the most important data instances to annotate. It concludes with some thoughts on the future of crowdsourcing in computer vision. Crowdsourcing in Computer Vision provides an overview of how crowdsourcing has been used in computer vision, enabling a computer vision researcher who has previously not collected non-expert data to devise a data collection strategy. It will also be of help to researchers who focus broadly on crowdsourcing to examine how the latter has been applied in computer vision, and to improve the methods that can be employed to ensure the quality and expedience of data collection.
Citations: 122
Multi-View Stereo: A Tutorial
IF 36.5 Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date : 2015-05-30 DOI: 10.1561/0600000052
Yasutaka Furukawa, Carlos Hernández
This tutorial presents a hands-on view of the field of multi-view stereo with a focus on practical algorithms. Multi-view stereo algorithms are able to construct highly detailed 3D models from images alone. They take a possibly very large set of images and construct a plausible 3D geometry that explains the images under some reasonable assumptions, the most important being scene rigidity. The tutorial frames the multi-view stereo problem as an image/geometry consistency optimization problem. It describes in detail its two main ingredients: robust implementations of photometric consistency measures, and efficient optimization algorithms. It then presents how these main ingredients are used by some of the most successful algorithms, applied in real applications, and deployed as products in industry. Finally, it describes more advanced approaches exploiting domain-specific knowledge such as structural priors, and gives an overview of the remaining challenges and future research directions.
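As a rough illustration of what a photometric consistency measure computes, the sketch below scores two image patches with normalized cross-correlation (NCC). NCC is one commonly used measure; the abstract does not say which measures the tutorial adopts, so treating NCC as representative is an assumption, as are the function name `ncc` and the flat-list patch representation.

```python
import math


def ncc(patch_a, patch_b):
    """Normalized cross-correlation of two equally sized grayscale patches.

    Patches are flat lists of intensities (a hypothetical representation
    chosen for this sketch). The result lies in [-1, 1].
    """
    if len(patch_a) != len(patch_b) or not patch_a:
        raise ValueError("patches must be non-empty and the same size")

    n = len(patch_a)
    mean_a = sum(patch_a) / n
    mean_b = sum(patch_b) / n

    # Subtract the mean so a constant brightness offset does not matter.
    dev_a = [a - mean_a for a in patch_a]
    dev_b = [b - mean_b for b in patch_b]

    numerator = sum(x * y for x, y in zip(dev_a, dev_b))
    denominator = math.sqrt(sum(x * x for x in dev_a) * sum(y * y for y in dev_b))

    if denominator == 0.0:
        # A textureless patch has no variation to correlate; this sketch
        # treats that case as "no evidence" and returns 0.
        return 0.0
    return numerator / denominator


reference_patch = [1, 2, 3, 4]
brighter_patch = [2 * v + 10 for v in reference_patch]  # gain and offset change
print(ncc(reference_patch, brighter_patch))
```

A score near 1 suggests the two patches may show the same surface point. Because the means and norms are factored out, the score does not change under a gain or offset in brightness, which is one reason measures of this kind are considered robust.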
Citations: 492
Domain Adaptation for Visual Recognition
IF 36.5 Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date : 2015-03-26 DOI: 10.1561/0600000057
Raghuraman Gopalan, Ruonan Li, Vishal M. Patel, R. Chellappa
Domain adaptation is an active, emerging research area that attempts to address the changes in data distribution across training and testing datasets. With the availability of a multitude of image acquisition sensors, variations due to illumination, and viewpoint among others, computer vision applications present a very natural test bed for evaluating domain adaptation methods. In this monograph, we provide a comprehensive overview of domain adaptation solutions for visual recognition problems. By starting with the problem description and illustrations, we discuss three adaptation scenarios, namely, (i) unsupervised adaptation where the "source domain" training data is partially labeled and the "target domain" test data is unlabeled, (ii) semi-supervised adaptation where the target domain also has partial labels, and (iii) multi-domain heterogeneous adaptation which studies the previous two settings with the source and/or target having more than one domain, and accounts for cases where the features used to represent the data in each domain are different. For all these topics we discuss existing adaptation techniques in the literature, which are motivated by the principles of max-margin discriminative learning, manifold learning, sparse coding, as well as low-rank representations. These techniques have shown improved performance on a variety of applications such as object recognition, face recognition, activity analysis, concept classification, and person detection. We then conclude by analyzing the challenges posed by the realm of "big visual data", in terms of the generalization ability of adaptation algorithms to unconstrained data acquisition as well as issues related to their computational tractability, and draw parallels with the efforts from vision community on image transformation models, and invariant descriptors so as to facilitate improved understanding of vision problems under uncertainty.
Citations: 35
A Fresh Look at Generalized Sampling
IF 36.5 Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date : 2014-02-14 DOI: 10.1561/0600000053
Diego F. Nehab, Hugues Hoppe
Discretization and reconstruction are fundamental operations in computer graphics, enabling the conversion between sampled and continuous representations. Major advances in signal-processing research have shown that these operations can often be performed more efficiently by decomposing a filter into two parts: a compactly supported continuous-domain function and a digital filter. This strategy of "generalized sampling" has appeared in a few graphics papers, but is largely unexplored in our community. This paper broadly summarizes the key aspects of the framework, and delves into specific applications in graphics. Using new notation, we concisely present and extend several key techniques. In addition, we demonstrate benefits for prefiltering in image downscaling and supersample-based rendering, and present an analysis of the associated variance reduction. We conclude with a qualitative and quantitative comparison of traditional and generalized filters.
Citations: 19
Modeling and Simulation of Skeletal Muscle for Computer Graphics: A Survey
IF 36.5 Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date : 2012-04-14 DOI: 10.1561/0600000036
Dongwoon Lee, Michael Glueck, Azam Khan, E. Fiume, Kenneth R. Jackson
Muscles provide physiological functions to drive body movement and anatomically characterize body shape, making them a crucial component of modeling animated human figures. Substantial effort has been devoted to developing computational models of muscles for the purpose of increasing realism and accuracy in computer graphics and biomechanics. We survey various approaches to model and simulate muscles both morphologically and functionally. Modeling the realistic morphology of muscle requires that muscle deformation be accurately depicted. To this end, several methodologies are presented, including geometrically-based, physically-based, and data-driven approaches. On the other hand, the simulation of physiological muscle functions aims to identify the biomechanical controls responsible for realistic human motion. Estimating these muscle controls has been pursued through static and dynamic simulations. We review and discuss all these approaches, and conclude with suggestions for future research.
Citations: 60
Full-Reference Image Quality Metrics: Classification and Evaluation
IF 36.5 Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date : 2012-04-01 DOI: 10.1561/0600000037
Marius Pedersen, J. Hardeberg
The wide variety of distortions that images are subject to during acquisition, processing, storage, and reproduction can degrade their perceived quality. Since subjective evaluation is time-consuming, expensive, and resource-intensive, objective methods of evaluation have been proposed. One type of these methods, image quality (IQ) metrics, has become very popular, and new metrics are proposed continuously. This paper aims to give a survey of one class of metrics, full-reference IQ metrics. First, these IQ metrics were classified into different groups. Second, further IQ metrics from each group were selected and evaluated against six state-of-the-art IQ databases.
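To make the term "full-reference" concrete, the sketch below computes peak signal-to-noise ratio (PSNR), a simple and widely known metric that compares a distorted image directly against its undistorted reference. The abstract does not list the metrics the survey evaluates, so PSNR is used here only as an assumed, familiar example; the function names and the flat-list image representation are likewise hypothetical.

```python
import math


def mse(reference, distorted):
    """Mean squared error between two equally sized images (flat pixel lists)."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("images must be non-empty and the same size")
    return sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)


def psnr(reference, distorted, max_value=255.0):
    """Peak signal-to-noise ratio in decibels; higher means closer to the reference.

    max_value is the largest possible pixel value (255 for 8-bit images).
    """
    error = mse(reference, distorted)
    if error == 0.0:
        # Identical images: the ratio is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / error)


reference_image = [52, 55, 61, 59]
distorted_image = [54, 55, 60, 57]
print(psnr(reference_image, distorted_image))
```

A full-reference metric needs both inputs, which is what separates this class from metrics that score a distorted image on its own. PSNR only measures pixel-wise error, so it can disagree with perceived quality.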
Citations: 108
Decision Forests: A Unified Framework for Classification, Regression, Density Estimation, Manifold Learning and Semi-Supervised Learning
IF 36.5 Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date : 2012-03-14 DOI: 10.1561/0600000035
A. Criminisi, J. Shotton, E. Konukoglu
This review presents a unified, efficient model of random decision forests which can be applied to a number of machine learning, computer vision, and medical image analysis tasks. Our model extends existing forest-based techniques as it unifies classification, regression, density estimation, manifold learning, semi-supervised learning, and active learning under the same decision forest framework. This gives us the opportunity to write and optimize the core implementation only once, with application to many diverse tasks. The proposed model may be used both in a discriminative or generative way and may be applied to discrete or continuous, labeled or unlabeled data. The main contributions of this review are: (1) Proposing a unified, probabilistic and efficient model for a variety of learning tasks; (2) Demonstrating margin-maximizing properties of classification forests; (3) Discussing probabilistic regression forests in comparison with other nonlinear regression algorithms; (4) Introducing density forests for estimating probability density functions; (5) Proposing an efficient algorithm for sampling from a density forest; (6) Introducing manifold forests for nonlinear dimensionality reduction; (7) Proposing new algorithms for transductive learning and active learning. Finally, we discuss how alternatives such as random ferns and extremely randomized trees stem from our more general forest model. This document is directed at both students who wish to learn the basics of decision forests, as well as researchers interested in the new contributions. It presents both fundamental and novel concepts in a structured way, with many illustrative examples and real-world applications. Thorough comparisons with state-of-the-art algorithms such as support vector machines, boosting and Gaussian processes are presented and relative advantages and disadvantages discussed. 
The many synthetic examples and existing commercial applications demonstrate the validity of the proposed model and its flexibility.
Citations: 891
Structured Learning and Prediction in Computer Vision
IF 36.5 Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date : 2011-05-10 DOI: 10.1561/0600000033
Sebastian Nowozin, Christoph H. Lampert
Powerful statistical models that can be learned efficiently from large amounts of data are currently revolutionizing computer vision. These models possess a rich internal structure reflecting task-specific relations and constraints. This monograph introduces the reader to the most popular classes of structured models in computer vision. Our focus is discrete undirected graphical models which we cover in detail together with a description of algorithms for both probabilistic inference and maximum a posteriori inference. We discuss separately recently successful techniques for prediction in general structured models. In the second part of this monograph we describe methods for parameter learning where we distinguish the classic maximum likelihood based methods from the more recent prediction-based parameter learning methods. We highlight developments to enhance current models and discuss kernelized models and latent variable models. To make the monograph more practical and to provide links to further study we provide examples of successful application of many methods in the computer vision literature.
Citations: 358
Camera Models and Fundamental Concepts Used in Geometric Computer Vision
IF 36.5 Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date : 2011-01-05 DOI: 10.1561/0600000023
P. Sturm, S. Ramalingam, J. Tardif, Simone Gasparini, J. Barreto
This survey is mainly motivated by the increased availability and use of panoramic image acquisition devices in computer vision and many of its applications. Different technologies and different computational models thereof exist, and algorithms and theoretical studies for geometric computer vision ("structure-from-motion") are often re-developed without highlighting common underlying principles. One of the goals of this survey is to give an overview of image acquisition methods used in computer vision and, especially, of the vast number of camera models that have been proposed and investigated over the years, where we try to point out similarities between different models. Results on epipolar and multi-view geometry for different camera models are reviewed, as well as various calibration and self-calibration approaches, with an emphasis on non-perspective cameras. We finally describe what we consider to be fundamental building blocks for geometric computer vision or structure-from-motion: epipolar geometry, pose and motion estimation, 3D scene modeling, and bundle adjustment. The main goal here is to highlight the main principles of these, which are independent of specific camera models.
Citations: 231
Geodesic Methods in Computer Vision and Graphics
IF 36.5 Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date : 2010-03-01 DOI: 10.1561/0600000029
G. Peyré, M. Pechaud, R. Keriven, L. Cohen
This monograph reviews both the theory and practice of the numerical computation of geodesic distances on Riemannian manifolds. The notion of Riemannian manifold allows one to define a local metric (a symmetric positive tensor field) that encodes the information about the problem one wishes to solve. This takes into account a local isotropic cost (whether some point should be avoided or not) and a local anisotropy (which direction should be preferred). Using this local tensor field, the geodesic distance is used to solve many problems of practical interest such as segmentation using geodesic balls and Voronoi regions, sampling points at regular geodesic distance or meshing a domain with geodesic Delaunay triangles. The shortest paths for this Riemannian distance, the so-called geodesics, are also important because they follow salient curvilinear structures in the domain. We show several applications of the numerical computation of geodesic distances and shortest paths to problems in surface and shape processing, in particular segmentation, sampling, meshing and comparison of shapes. All the figures from this review paper can be reproduced by following the Numerical Tours of Signal Processing (http://www.ceremade.dauphine.fr/~peyre/numerical-tour/). Several textbooks exist that include descriptions of several manifold methods for image processing, shape and surface representation and computer graphics. In particular, the reader should refer to [42, 147, 208, 209, 213, 255] for fascinating applications of these methods to many important problems in vision and graphics. This review paper is intended to give an updated tour of both foundations and trends in the area of geodesic methods in vision and graphics.
Citations: 146