
Latest publications in IPSJ Transactions on Computer Vision and Applications

Upper Body Pose Estimation for Team Sports Videos Using a Poselet-Regressor of Spine Pose and Body Orientation Classifiers Conditioned by the Spine Angle Prior
Q1 Computer Science Pub Date: 2015-10-20 DOI: 10.2197/ipsjtcva.7.121
Masaki Hayashi, Kyoko Oshima, Masamoto Tanabiki, Y. Aoki
We propose a per-frame upper body pose estimation method for sports players captured in low-resolution team sports videos. Using the head-center-aligned upper body region appearance obtained in each frame from the head tracker, our framework estimates (1) the 2D spine pose, composed of the head center and pelvis center locations, and (2) the orientation of the upper body in each frame. Our framework is composed of three steps. In the first step, the head region of the subject player is tracked with a standard tracking-by-detection technique for upper body appearance alignment. In the second step, the pelvis center location relative to the head center is estimated in each frame by our newly proposed poselet-regressor to obtain spine angle priors. In the last step, the body orientation is estimated by the upper body orientation classifier selected according to the spine angle range. Owing to the alignment of the body appearance and the use of multiple body orientation classifiers conditioned by the spine angle prior, our method can robustly estimate the body orientation of a player whose visual appearance varies widely during a game, even during side-facing or self-occluded poses. We tested the performance of our method on both American football and soccer videos.
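The conditioning step, selecting a body-orientation classifier according to the estimated spine angle, can be illustrated with a minimal sketch. The angle bins, feature dimensionality, and SVM classifiers below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np
from sklearn.svm import SVC

def spine_angle(head_xy, pelvis_xy):
    """Angle of the head-to-pelvis vector w.r.t. the image vertical, in degrees."""
    dx, dy = pelvis_xy[0] - head_xy[0], pelvis_xy[1] - head_xy[1]
    return np.degrees(np.arctan2(dx, dy))  # 0 deg = upright

# One orientation classifier per (hypothetical) spine-angle range.
ANGLE_BINS = [(-90, -30), (-30, 30), (30, 90)]
classifiers = {b: SVC() for b in ANGLE_BINS}

def train(classifiers, feats, orientations, angles):
    for (lo, hi), clf in classifiers.items():
        mask = (angles >= lo) & (angles < hi)
        clf.fit(feats[mask], orientations[mask])

def predict_orientation(classifiers, feat, head_xy, pelvis_xy):
    a = spine_angle(head_xy, pelvis_xy)
    for (lo, hi), clf in classifiers.items():
        if lo <= a < hi:
            return clf.predict(feat[None, :])[0]
    return None  # spine angle outside all bins

rng = np.random.default_rng(0)
feats = rng.normal(size=(300, 64))       # stand-in for aligned upper-body appearance features
angles = rng.uniform(-90, 90, size=300)  # spine angles from the regressor stage
orients = rng.integers(0, 8, size=300)   # 8 discrete body orientations
train(classifiers, feats, orients, angles)
print(predict_orientation(classifiers, feats[0], head_xy=(10, 5), pelvis_xy=(12, 40)))
```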
Citations: 7
Human Behavior Recognition in Shopping Settings
Q1 Computer Science Pub Date: 2015-01-01 DOI: 10.2197/ipsjtcva.7.151
R. Sicre, H. Nicolas
This paper presents a new application that improves communication between digital media and customers at a point of sale. The system uses several methods from various areas of computer vision, such as motion detection, object tracking, behavior analysis and recognition, semantic description of behavior, and scenario recognition. Specifically, the system is divided into three parts: low-level, mid-level, and high-level analysis. Low-level analysis detects and tracks moving objects in the scene. Mid-level analysis then describes and recognizes the behavior of the tracked objects. Finally, high-level analysis produces a semantic interpretation of the detected behavior and recognizes predefined scenarios. Our research aims to build a real-time application that recognizes human behaviors while shopping. Specifically, the system detects customer interest in, and interactions with, various products at a point of sale.
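As a rough illustration of the low/mid/high-level split described above, here is a minimal sketch; the function bodies and the stillness rule are placeholders rather than the paper's actual modules:

```python
def low_level(frames):
    """Detect and track moving objects; returns per-object trajectories."""
    # Placeholder: e.g., background subtraction plus frame-to-frame association.
    return {"customer_1": [(10, 20), (12, 21), (12, 21), (12, 21)]}

def mid_level(trajectories):
    """Describe and recognize the behavior of each tracked object."""
    behaviors = {}
    for obj, traj in trajectories.items():
        stopped = traj[-1] == traj[-2]   # crude stillness test, for illustration only
        behaviors[obj] = "stopped" if stopped else "walking"
    return behaviors

def high_level(behaviors):
    """Produce a semantic interpretation and match predefined scenarios."""
    return [f"{obj}: possible product interest" for obj, b in behaviors.items() if b == "stopped"]

print(high_level(mid_level(low_level(frames=None))))
```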
Citations: 1
A General Inlier Estimation for Moving Camera Motion Segmentation
Q1 Computer Science Pub Date: 2015-01-01 DOI: 10.2197/ipsjtcva.7.163
Xuefeng Liang, Cuicui Zhang, T. Matsuyama
In moving camera videos, motion segmentation is often achieved by determining the motion coherence of each moving object. However, this is a nontrivial task on optical flow due to two problems: 1) The optical flow of camera motion in the 3D world consists of three primary 2D motion flows: translation, rotation, and radial flow. Their coherence analysis is handled by a variety of models and further requires plenty of priors in existing frameworks; 2) A moving camera introduces 3D motion, and depth discontinuities cause motion discontinuities that severely break down the coherence. Meanwhile, the mixture of the camera motion and the moving objects' motions makes it difficult to clearly identify foreground and background. In this work, our solution is to transform the optical flow into a potential space where the coherence of the background flow field is easily modeled by a low-order polynomial. To this end, we first amend the Helmholtz-Hodge decomposition by adding coherence constraints, which can transform translation, rotation, and radial flow fields into two potential surfaces under a unified framework. Secondly, we introduce an Incoherence Map and a progressive Quad-Tree partition to reject moving objects and motion discontinuities. Finally, the low-order polynomial is fitted to the remaining flow samples on the two potentials. We present results on more than twenty videos from four benchmarks. Extensive experiments demonstrate better performance in dealing with challenging scenes with complex backgrounds. Our method improves segmentation accuracy over the state of the art by 10%–30%.
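The core idea of modeling the background flow potential with a low-order polynomial and flagging what does not fit can be sketched as follows. The sketch works directly on a synthetic scalar field, skips the Helmholtz-Hodge decomposition and quad-tree stages, and uses an arbitrary 3-sigma threshold:

```python
import numpy as np

def fit_low_order_poly(xs, ys, values, order=2):
    """Least-squares fit of a low-order 2D polynomial surface to values(x, y)."""
    cols = [xs**i * ys**j for i in range(order + 1) for j in range(order + 1 - i)]
    A = np.stack(cols, axis=1)
    coeffs, *_ = np.linalg.lstsq(A, values, rcond=None)
    return coeffs, A

# Synthetic "potential" field: a smooth background plus a small incoherent region.
h, w = 60, 80
ys, xs = np.mgrid[0:h, 0:w].astype(float)
field = 0.01 * xs**2 - 0.02 * xs * ys + 0.5 * ys
field[20:30, 30:45] += 15.0      # a moving object breaks the smooth background model

coeffs, A = fit_low_order_poly(xs.ravel(), ys.ravel(), field.ravel(), order=2)
residual = np.abs(field.ravel() - A @ coeffs).reshape(h, w)
incoherence_map = residual > 3 * residual.std()   # high residual = does not fit the background
print("flagged pixels:", int(incoherence_map.sum()))
```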
Citations: 3
Spatial Visual Attention for Novelty Detection: A Space-based Saliency Model in 3D Using Spatial Memory
Q1 Computer Science Pub Date: 2015-01-01 DOI: 10.2197/ipsjtcva.7.35
Nevrez Imamoglu, E. Dorronzoro, M. Sekine, K. Kita, Wenwei Yu
Saliency maps, as visual attention computational models, can reveal novel regions within a scene (as in the human visual system), which can decrease the amount of data to be processed in task-specific computer vision applications. Most saliency computation models do not take advantage of prior spatial memory; instead, they give priority to spatial or object-based features to obtain bottom-up or top-down saliency maps. In our previous experiments, we demonstrated that spatial memory, regardless of object features, can aid detection and tracking tasks with a mobile robot by using the robot's 2D global environment memory and local 2D Kinect data to compute a space-based saliency map. However, in complex scenes where 2D space-based saliency is not enough (e.g., a subject lying on a bed), 3D scene analysis is necessary to extract novelty within the scene using spatial memory. Therefore, in this work, to improve the detection of novelty in a known environment, we propose a space-based spatial saliency model with 3D local information, obtained by augmenting the 2D space-based saliency with height as prior information about specific locations. Moreover, the algorithm can also be integrated with other bottom-up or top-down saliency computational models to improve the detection results. Experimental results demonstrate that high accuracy for novelty detection can be obtained and that the proposed algorithm reduces computational time for existing state-of-the-art detection and tracking models.
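A minimal sketch of fusing a bottom-up saliency map with a height-based spatial prior is given below; the Gaussian deviation model, the convex-combination fusion, and all constants are assumptions for illustration, not the authors' model:

```python
import numpy as np

def height_prior(observed_height, remembered_height, sigma=0.3):
    """Novelty from deviation between observed and remembered height (in metres):
    a larger deviation, e.g. a person lying on a bed, gives a higher prior."""
    d2 = (observed_height - remembered_height) ** 2
    return 1.0 - np.exp(-d2 / (2 * sigma**2))

def fuse(bottom_up, spatial_prior, alpha=0.5):
    """Simple convex combination of the two cues (the original model may fuse differently)."""
    return alpha * bottom_up + (1 - alpha) * spatial_prior

h, w = 48, 64
rng = np.random.default_rng(1)
bottom_up = rng.random((h, w))             # stand-in for any bottom-up saliency map
remembered = np.full((h, w), 0.5)          # spatial memory of surface heights
observed = remembered.copy()
observed[20:30, 25:40] = 1.1               # unexpected object above the remembered surface

novelty = fuse(bottom_up, height_prior(observed, remembered))
print(novelty[20:30, 25:40].mean(), novelty[:10, :10].mean())   # novel region scores higher on average
```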
Citations: 3
Audio-Visual Speech Recognition Using Convolutive Bottleneck Networks for a Person with Severe Hearing Loss
Q1 Computer Science Pub Date: 2015-01-01 DOI: 10.2197/ipsjtcva.7.64
Yuki Takashima, Yasuhiro Kakihara, Ryo Aihara, T. Takiguchi, Y. Ariki, Nobuyuki Mitani, K. Omori, Kaoru Nakazono
In this paper, we propose an audio-visual speech recognition system for a person with an articulation disorder resulting from severe hearing loss. For a person with this type of articulation disorder, the speech style is quite different from that of people without hearing loss, with the result that a speaker-independent model for unimpaired persons is hardly useful for recognizing it. We investigate in this paper an audio-visual speech recognition system for a person with severe hearing loss in noisy environments, where a robust feature extraction method using a convolutive bottleneck network (CBN) is applied to audio-visual data. We confirmed the effectiveness of this approach through word-recognition experiments in noisy environments, where the CBN-based feature extraction method outperformed conventional methods.
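A convolutive bottleneck network is, in essence, a small CNN whose narrow hidden layer provides the features. The sketch below (in PyTorch) shows the idea; the layer sizes, input shape, and number of classes are illustrative assumptions, not the architecture used in the paper:

```python
import torch
import torch.nn as nn

class ConvBottleneckNet(nn.Module):
    """CNN with a narrow fully-connected 'bottleneck' layer whose activations
    serve as robust features for a downstream recognizer."""
    def __init__(self, n_classes=40, bottleneck_dim=30):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.bottleneck = nn.Linear(32 * 8 * 8, bottleneck_dim)
        self.classifier = nn.Linear(bottleneck_dim, n_classes)

    def forward(self, x, return_features=False):
        z = self.conv(x).flatten(1)
        feat = torch.relu(self.bottleneck(z))
        return feat if return_features else self.classifier(feat)

# A 32x32 patch standing in for a lip-region image or a spectrogram excerpt.
net = ConvBottleneckNet()
batch = torch.randn(4, 1, 32, 32)
features = net(batch, return_features=True)   # 4 x 30 bottleneck features
print(features.shape)
```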
Citations: 5
Rail Sensor: A Mobile Lidar System for 3D Archiving the Bas-reliefs in Angkor Wat
Q1 Computer Science Pub Date: 2015-01-01 DOI: 10.2197/ipsjtcva.7.59
Bo Zheng, Takeshi Oishi, K. Ikeuchi
This paper presents a mobile Lidar system for efficiently and accurately capturing the 3D shape of the Bas-reliefs in Angkor Wat. The sensor system consists of two main components: 1) a panoramic camera and 2) a 2D 360-degree laser line scanner, which moves slowly on rails parallel to the reliefs. In this paper, we first propose a new but simple method to accurately calibrate the panoramic camera to the 2D laser scan lines. The sensor motion can then be estimated from the sensor-fused system using a 2D/3D feature tracking method. Furthermore, to reduce the drift error of the sensor motion, we adopt bundle adjustment to globally optimize and smooth the moving trajectories. In experiments, we demonstrate that our moving Lidar system achieves substantially better accuracy and efficiency than traditional stop-and-go methods.
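The drift-reduction step can be pictured as a global least-squares refinement that balances fidelity to the per-frame motion estimates against trajectory smoothness. The sketch below is a heavily simplified stand-in for full bundle adjustment (positions only, no camera or structure parameters), with an arbitrary smoothness weight:

```python
import numpy as np
from scipy.optimize import least_squares

def smooth_trajectory(noisy_xyz, smooth_weight=5.0):
    """Refine sensor positions: stay close to per-frame estimates while
    penalising frame-to-frame acceleration along the trajectory."""
    n = len(noisy_xyz)

    def residuals(flat):
        p = flat.reshape(n, 3)
        data = (p - noisy_xyz).ravel()
        accel = (p[2:] - 2 * p[1:-1] + p[:-2]).ravel()
        return np.concatenate([data, smooth_weight * accel])

    sol = least_squares(residuals, noisy_xyz.ravel())
    return sol.x.reshape(n, 3)

# Synthetic rail motion: a constant-velocity path corrupted by drift-like noise.
t = np.linspace(0, 1, 50)
truth = np.stack([10.0 * t, np.zeros_like(t), np.full_like(t, 1.5)], axis=1)
drift = np.cumsum(np.random.default_rng(0).normal(0, 0.02, truth.shape), axis=0)
refined = smooth_trajectory(truth + drift)
print("mean error refined:", np.abs(refined - truth).mean(), " raw:", np.abs(drift).mean())
```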
Citations: 11
Image Denoising with Sparsity Distillation
Q1 Computer Science Pub Date: 2015-01-01 DOI: 10.2197/ipsjtcva.7.50
S. Kawata, Nao Mishima
We propose a new image denoising method based on shrinkage. In the proposed method, small blocks of an input image are projected into a space that makes the projection coefficients sparse, and an explicitly evaluated sparsity degree is used to control the shrinkage threshold. On average, the proposed method obtained higher quantitative evaluation values (PSNR and SSIM) than one of the state-of-the-art methods in the field of image denoising. The proposed method removes random noise effectively from natural images while preserving intricate textures.
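The shrinkage idea can be sketched with block DCT coefficients and soft-thresholding, where a crudely measured sparsity degree modulates the threshold. The transform, the sparsity measure, and the threshold rule below are assumptions for illustration, not the paper's actual projection or control rule:

```python
import numpy as np
from scipy.fftpack import dct, idct

def dct2(b):  return dct(dct(b, axis=0, norm='ortho'), axis=1, norm='ortho')
def idct2(b): return idct(idct(b, axis=0, norm='ortho'), axis=1, norm='ortho')

def denoise_block(block, base_threshold=20.0):
    """Soft-threshold the block's DCT coefficients; sparser (flatter) blocks get
    a stronger threshold, textured blocks a weaker one."""
    c = dct2(block)
    sparsity = np.mean(np.abs(c) < base_threshold)   # crude sparsity degree in [0, 1]
    t = base_threshold * sparsity                    # illustrative shrinkage-control rule
    c_shrunk = np.sign(c) * np.maximum(np.abs(c) - t, 0.0)
    c_shrunk[0, 0] = c[0, 0]                         # keep the DC term
    return idct2(c_shrunk)

rng = np.random.default_rng(0)
clean = np.tile(np.linspace(0.0, 255.0, 8), (8, 1))   # smooth 8x8 test block
noisy = clean + rng.normal(0, 15, clean.shape)
denoised = denoise_block(noisy)
print("MAE noisy:", np.abs(noisy - clean).mean(), " MAE denoised:", np.abs(denoised - clean).mean())
```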
Citations: 1
Auxiliary Training Information Assisted Visual Recognition
Q1 Computer Science Pub Date: 2015-01-01 DOI: 10.2197/ipsjtcva.7.138
Qilin Zhang, G. Hua, W. Liu, Zicheng Liu, Zhengyou Zhang
In the realm of multi-modal visual recognition, the reliability of the data acquisition system is often a concern due to the increased complexity of the sensors. One of the major issues is the accidental loss of one or more sensing channels, which poses a major challenge to current learning systems. In this paper, we examine one such missing-data problem, in which a main modality/view and an auxiliary modality/view are present in the training data, but only the main modality/view is available in the test data. To effectively leverage the auxiliary information to train a stronger classifier, we propose a collaborative auxiliary learning framework based on a new discriminative canonical correlation analysis. This framework reveals a common semantic space shared across both modalities/views by enforcing a series of nonlinear projections. Such projections automatically embed the discriminative cues hidden in both modalities/views into the common space, and better visual recognition is thus achieved on the test data. The efficacy of our proposed auxiliary learning approach is demonstrated through four challenging visual recognition tasks with different kinds of auxiliary information.
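The train-with-both-views, test-with-one-view setup can be sketched with standard CCA; sklearn's CCA here stands in for the paper's discriminative variant, and all data and dimensions are synthetic and arbitrary:

```python
import numpy as np
from sklearn.cross_decomposition import CCA
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, n_test, d_main, d_aux = 200, 100, 40, 20
W_main, W_aux = rng.normal(size=(5, d_main)), rng.normal(size=(5, d_aux))

def make_split(m):
    y = rng.integers(0, 2, m)
    latent = y[:, None] + 0.5 * rng.normal(size=(m, 5))
    return y, latent @ W_main + 0.5 * rng.normal(size=(m, d_main)), latent @ W_aux

y_tr, X_main_tr, X_aux_tr = make_split(n)
y_te, X_main_te, _ = make_split(n_test)        # the auxiliary view is discarded at test time

# Training: both views are available, so learn a shared space with (standard) CCA.
cca = CCA(n_components=5).fit(X_main_tr, X_aux_tr)
clf = LogisticRegression(max_iter=1000).fit(cca.transform(X_main_tr), y_tr)

# Test: only the main view is projected; the auxiliary view only shaped the projection.
print("test accuracy:", clf.score(cca.transform(X_main_te), y_te))
```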
Citations: 19
Depth-based Gait Authentication for Practical Sensor Settings
Q1 Computer Science Pub Date: 2015-01-01 DOI: 10.2197/ipsjtcva.7.94
Taro Ikeda, Ikuhisa Mitsugami, Y. Yagi
This paper investigates the performance of silhouette-based and depth-based gait authentication under practical sensor settings, where sensors are installed in an existing environment and usually have to be placed quite close to people. To enable a fair comparison between different sensors and methods, we construct the full-body volumes of walking people using a multi-camera environment so as to reconstruct virtual silhouette and depth images at arbitrary sensor positions. In addition, we investigate performance when we have to authenticate between frontal and rear views. Experimental results confirm that the depth-based methods outperform the silhouette-based ones in realistic situations. We also confirm that, by introducing the Depth-based Gait Feature, we can authenticate between the frontal and rear views.
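One way to picture a depth-based gait representation is a depth analogue of the gait energy image: average size-normalised depth crops over a gait cycle and compare the resulting templates. This is only a sketch under that assumption; the paper's actual Depth-based Gait Feature may be computed differently:

```python
import numpy as np

def depth_gait_template(depth_crops, out_size=(64, 44)):
    """Average size-normalised depth crops of a walking person over one gait cycle."""
    templates = []
    for d in depth_crops:
        ys = np.linspace(0, d.shape[0] - 1, out_size[0]).astype(int)
        xs = np.linspace(0, d.shape[1] - 1, out_size[1]).astype(int)
        templates.append(d[np.ix_(ys, xs)])   # crude nearest-neighbour resize
    return np.mean(templates, axis=0)

rng = np.random.default_rng(0)
gallery = depth_gait_template([rng.random((120, 80)) for _ in range(30)])
probe   = depth_gait_template([rng.random((120, 80)) for _ in range(30)])
# Authentication then reduces to comparing templates, e.g. by Euclidean distance.
print("dissimilarity:", np.linalg.norm(gallery - probe))
```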
Citations: 3
Individuality-preserving Silhouette Extraction for Gait Recognition
Q1 Computer Science Pub Date: 2015-01-01 DOI: 10.2197/ipsjtcva.7.74
Yasushi Makihara, Takuya Tanoue, D. Muramatsu, Y. Yagi, Syunsuke Mori, Yuzuko Utsumi, M. Iwamura, K. Kise
Most gait recognition approaches rely on silhouette-based representations due to their high recognition accuracy and computational efficiency, and a key problem for those approaches is how to accurately extract individuality-preserved silhouettes from real scenes, where foreground colors may be similar to background colors and the background is cluttered. We therefore propose a method of individuality-preserving silhouette extraction for gait recognition using standard gait models (SGMs), composed of clean silhouette sequences of a variety of training subjects, as a shape prior. We first match the multiple SGMs to a background subtraction sequence of a test subject by dynamic programming and select the training subject whose SGM fits the test sequence best. We then formulate our silhouette extraction problem in a well-established graph-cut segmentation framework while considering a balance between the observed test sequence and the matched SGM. More specifically, we define an energy function to be minimized with the following three terms: (1) a data term derived from the observed test sequence, (2) a smoothness term derived from spatio-temporally adjacent edges, and (3) a shape-prior term derived from the matched SGM. We demonstrate that the proposed method successfully extracts individuality-preserved silhouettes and improves gait recognition accuracy through experiments using 56 subjects.
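The three-term energy can be written down directly; the sketch below only evaluates it for a candidate binary labelling (a real system would minimise it with graph cuts), and the weights and the synthetic inputs are arbitrary:

```python
import numpy as np

def silhouette_energy(labels, fg_prob, sgm_prior, lam_smooth=1.0, lam_shape=1.0):
    """E(labels) = data term + smoothness term + shape-prior term, labels in {0, 1}."""
    eps = 1e-6
    # (1) data term: negative log-likelihood under the observed foreground probability
    data = -np.where(labels == 1, np.log(fg_prob + eps), np.log(1 - fg_prob + eps)).sum()
    # (2) smoothness term: penalise label changes across spatially adjacent pixels
    smooth = np.abs(np.diff(labels, axis=0)).sum() + np.abs(np.diff(labels, axis=1)).sum()
    # (3) shape-prior term: disagreement with the matched standard gait model (SGM)
    shape = np.abs(labels - sgm_prior).sum()
    return data + lam_smooth * smooth + lam_shape * shape

rng = np.random.default_rng(0)
fg_prob = rng.random((40, 30))                    # stand-in for background-subtraction output
sgm = np.zeros((40, 30)); sgm[5:35, 10:20] = 1    # stand-in for the matched SGM silhouette
candidate = (fg_prob > 0.5).astype(int)
print("energy of candidate labelling:", silhouette_energy(candidate, fg_prob, sgm))
```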
Citations: 16