
Latest Publications: Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing

Extended multi-spectral face recognition across two different age groups: an empirical study
Pub Date : 2016-12-18 DOI: 10.1145/3009977.3010026
N. Vetrekar, Ramachandra Raghavendra, A. Gaonkar, G. Naik, R. Gad
Face recognition has attained greater importance in biometric authentication due to its non-intrusive property of identifying individuals at varying stand-off distances. Face recognition based on multi-spectral imaging has recently gained prime importance due to its ability to capture spatial and spectral information across the spectrum. Our first contribution in this paper is to use extended multi-spectral face recognition in two different age groups. The second contribution is to show empirically the performance of face recognition for the two age groups. Thus, in this paper, we developed a multi-spectral imaging sensor to capture a facial database for two different age groups (≤ 15 years and ≥ 20 years) at nine different spectral bands covering the 530 nm to 1000 nm range. We then collected a new facial image database for the two age groups comprising 168 individuals. Extensive experimental evaluation is performed independently on the two age-group databases using four different state-of-the-art face recognition algorithms. We evaluate the verification and identification rates across individual spectral bands and the fused spectral band for the two age groups. The obtained results show a higher recognition rate for the ≥ 20 years age group than for the ≤ 15 years group, indicating variation in face recognition performance across age groups.
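As an illustration of the band-level evaluation described above, the sketch below fuses per-band match scores by simple averaging. The feature vectors, the cosine similarity, and the averaging rule are all hypothetical stand-ins; the abstract does not specify the paper's actual fusion scheme.

```python
import numpy as np

def band_scores(probe_bands, gallery_bands):
    """Cosine similarity between probe and gallery features, per spectral band.

    probe_bands, gallery_bands: (n_bands, d) arrays of per-band feature
    vectors. Returns an (n_bands,) array of per-band match scores.
    """
    p = probe_bands / np.linalg.norm(probe_bands, axis=1, keepdims=True)
    g = gallery_bands / np.linalg.norm(gallery_bands, axis=1, keepdims=True)
    return np.sum(p * g, axis=1)

def fused_score(probe_bands, gallery_bands):
    """Score-level fusion sketch: average the per-band similarities."""
    return float(np.mean(band_scores(probe_bands, gallery_bands)))

# Toy example: 9 bands with 4-D features; an identical probe and gallery
# entry should score 1.0 under cosine similarity.
rng = np.random.default_rng(0)
feats = rng.normal(size=(9, 4))
assert abs(fused_score(feats, feats) - 1.0) < 1e-9
```

A verification decision would then threshold `fused_score`, and identification would rank gallery entries by it.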
Citations: 8
Uncorrelated multiview discriminant locality preserving projection analysis for multiview facial expression recognition
Pub Date : 2016-12-18 DOI: 10.1145/3009977.3010056
Sunil Kumar, M. Bhuyan, B. Chakraborty
Recently, several multi-view learning-based methods have been proposed and found to be effective in many real-world applications. However, existing multi-view learning-based methods are not suitable for finding discriminative directions if the data is multi-modal. In such cases, Locality Preserving Projection (LPP) and/or Local Fisher Discriminant Analysis (LFDA) are found to be more appropriate for capturing discriminative directions. Furthermore, existing methods show that imposing an uncorrelated constraint on the common space improves the classification accuracy of the system. Hence, inspired by the above findings, we propose an Uncorrelated Multi-view Discriminant Locality Preserving Projection (UMvDLPP)-based approach. The proposed method searches for a common uncorrelated discriminative space across multiple observable spaces. Moreover, the proposed method can also handle the multimodal characteristic that is inherently embedded in multi-view facial expression recognition (FER) data. Hence, the proposed method is more effective for the multi-view FER problem. Experimental results show that the proposed method outperforms state-of-the-art multi-view learning-based methods.
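The LPP building block mentioned above can be sketched as a generalized eigenproblem on a neighborhood graph. This is plain single-view LPP only, not the proposed UMvDLPP; the kNN adjacency, the ridge term, and the toy data are assumptions.

```python
import numpy as np

def lpp_directions(X, n_neighbors=2, n_dims=1):
    """Minimal Locality Preserving Projection (LPP) sketch.

    X: (n_samples, n_features). Builds a symmetric kNN adjacency W, the
    graph Laplacian L = D - W, and solves the generalized eigenproblem
    X^T L X a = lam X^T D X a for the smallest eigenvalues, whose
    eigenvectors preserve local neighborhood structure.
    """
    n = X.shape[0]
    # Pairwise squared distances, then a symmetric 0/1 kNN adjacency.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.zeros((n, n))
    for i in range(n):
        idx = np.argsort(d2[i])[1:n_neighbors + 1]  # skip self at index 0
        W[i, idx] = 1.0
    W = np.maximum(W, W.T)
    D = np.diag(W.sum(1))
    L = D - W
    A = X.T @ L @ X
    B = X.T @ D @ X + 1e-8 * np.eye(X.shape[1])  # small ridge for stability
    vals, vecs = np.linalg.eig(np.linalg.solve(B, A))
    order = np.argsort(vals.real)
    return vecs[:, order[:n_dims]].real

rng = np.random.default_rng(1)
X = rng.normal(size=(10, 3))
P = lpp_directions(X, n_dims=2)   # projection matrix: 3-D data -> 2-D
assert P.shape == (3, 2)
```

UMvDLPP additionally couples several such per-view problems into one common space with an uncorrelatedness constraint, which this sketch does not attempt.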
Citations: 5
Enhancement of high dynamic range images using variational calculus regularizer with stochastic resonance
Pub Date : 2016-12-18 DOI: 10.1145/3009977.3010039
Sumit Kumar, R. K. Jha
When capturing pictures with a simple camera in a scene with harsh or strong lighting, such as a fully sunny day, we often find loss of highlight detail (overexposure) in bright regions and loss of shadow detail (underexposure) in dark regions. In this manuscript, a classical method for retrieving minute information from high dynamic range images is proposed. Our technique is based on variational calculus and dynamic stochastic resonance (DSR). We add a regularizer function to optimise the estimation of the details lost in the overexposed or underexposed regions of the image. We suppress the dynamic range of the luminance image by attenuating large-magnitude gradients strongly and small-magnitude gradients weakly. At the same time, dynamic stochastic resonance is used to improve the underexposed regions of the image. Experimental results show that the proposed technique enhances image quality in both overexposed and underexposed regions. The proposed technique is compared with most of the state-of-the-art techniques and is observed to be better than, or at least comparable to, the existing techniques.
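The gradient-attenuation step described above resembles classic gradient-domain dynamic range compression. A minimal sketch, assuming a Fattal-style power-law attenuation factor (`alpha` and `beta` are hypothetical parameters) and simple row-wise reintegration rather than the paper's variational solver:

```python
import numpy as np

def attenuate_gradients(img, alpha=0.1, beta=0.8):
    """Gradient-domain dynamic range compression sketch.

    Scales each horizontal gradient g by (|g| / alpha) ** (beta - 1), so
    large-magnitude gradients are attenuated more than small ones, then
    reintegrates along rows. (The paper's regularizer and the DSR step for
    underexposed regions are not reproduced here.)
    """
    g = np.diff(img, axis=1)
    mag = np.abs(g) + 1e-8                 # avoid division by zero
    g_att = g * (mag / alpha) ** (beta - 1.0)
    # Reintegrate along rows from the first column.
    return np.concatenate(
        [img[:, :1], img[:, :1] + np.cumsum(g_att, axis=1)], axis=1)

row = np.array([[0.0, 0.0, 1.0, 1.0]])     # one large edge of height 1.0
out = attenuate_gradients(row)
edge = out[0, 2] - out[0, 1]
assert 0 < edge < 1.0                      # the large gradient was compressed
```

With `beta < 1` the exponent is negative, so the attenuation factor shrinks as gradient magnitude grows, which is what compresses the dynamic range.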
Citations: 10
Fusion-based skin detection using image distribution model
Pub Date : 2016-12-18 DOI: 10.1145/3009977.3010002
B. Chakraborty, M. Bhuyan, Sunil Kumar
Skin colour detection under poor or varying illumination conditions is a big challenge for various image processing and human-computer interaction applications. In this paper, a novel skin detection method utilizing the image pixel distribution in a given colour space is proposed. The pixel distribution of an image can provide a better localization of the image's actual skin colour distribution. Hence, a local skin distribution model (LSDM) is derived using the image pixel distribution model and its similarity with the global skin distribution model (GSDM). Finally, a fusion-based skin model is obtained using both the GSDM and the LSDM. Subsequently, a dynamic region growing method is employed to improve the overall detection rate. Experimental results show that the proposed skin detection method significantly improves detection accuracy in the presence of varying illumination conditions.
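A minimal sketch of fusing a global (GSDM) and a local (LSDM) skin probability map. The convex combination and the threshold are assumptions, since the abstract does not give the exact fusion rule; the region-growing refinement is omitted.

```python
import numpy as np

def fused_skin_probability(p_global, p_local, w=0.5):
    """Fuse global (GSDM) and local (LSDM) per-pixel skin probabilities
    with a convex combination (a hypothetical stand-in for the paper's
    fusion rule)."""
    return w * p_global + (1.0 - w) * p_local

def skin_mask(p_global, p_local, w=0.5, thresh=0.5):
    """Binary skin mask from the fused probability map."""
    return fused_skin_probability(p_global, p_local, w) >= thresh

# Toy 2x2 probability maps: the local model rescues the pixel at (1, 0)
# where the global model is uncertain under the scene's illumination.
pg = np.array([[0.9, 0.2], [0.4, 0.8]])
pl = np.array([[0.7, 0.1], [0.8, 0.9]])
mask = skin_mask(pg, pl)
assert mask.tolist() == [[True, False], [True, True]]
```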
Citations: 5
MPMF: multi-part multi-feature based object tracking
Pub Date : 2016-12-18 DOI: 10.1145/3009977.3010057
Neha Bhargava, S. Chaudhuri
The objective of tracking is to determine the states of an object in video frames while maintaining appearance and motion consistency. In this paper, we propose a novel multi-part multi-feature (MPMF) based object tracker, which falls into the category of part-based trackers. We represent a target by a set of fixed parts (not semantic parts such as limbs or the face), and each part is represented by a set of features. The multi-part representation of the object aids in partial occlusion handling, and the multi-feature object description increases the robustness of the target representation. Instead of considering all the features of the parts, we measure the tracker's confidence for a candidate by utilizing only the candidate's strong features. This ensures that weak features do not interfere in the decision making. We also present an automatic method for selecting this subset of appropriate features for each part. To increase the tracker's speed and reduce the number of erroneous candidates, we do not search the whole frame; we keep the size of the search area adaptive, depending on the tracker's confidence in the predicted location of the object. Additionally, it is easy to integrate more parts and features into the proposed tracker. The results on various challenging videos from the VOT dataset are encouraging. MPMF outperforms state-of-the-art trackers on some of the standard challenging videos.
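The strong-feature confidence idea can be sketched as keeping only the top-scoring fraction of feature scores per part. The selection rule (a fixed fraction) and the scores below are illustrative, not the paper's exact method.

```python
import numpy as np

def candidate_confidence(part_scores, strong_frac=0.5):
    """MPMF-style confidence sketch: average only the strong feature scores.

    part_scores: (n_parts, n_features) similarity scores of a candidate
    against the target model. For each part, only the top strong_frac
    fraction of feature scores contribute, so weak features cannot pull the
    decision down. (Hypothetical selection rule for illustration.)
    """
    n_keep = max(1, int(part_scores.shape[1] * strong_frac))
    top = np.sort(part_scores, axis=1)[:, -n_keep:]   # strongest per part
    return float(top.mean())

scores = np.array([[0.9, 0.8, 0.1],    # part 1: two strong features, one weak
                   [0.7, 0.2, 0.6]])   # part 2: weak middle feature ignored
conf = candidate_confidence(scores, strong_frac=2 / 3)
assert abs(conf - (0.9 + 0.8 + 0.7 + 0.6) / 4) < 1e-9
```

The weak scores (0.1 and 0.2) are excluded, so a candidate is not penalized for features occluded or corrupted in the current frame.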
Citations: 1
Complementary tracker's fusion for robust visual tracking
Pub Date : 2016-12-18 DOI: 10.1145/3009977.3010006
S. Kakanuru, Madan Kumar Rapuru, Deepak Mishra, R. S. S. Gorthi
Though visual object tracking algorithms are capable of handling various challenging scenarios individually, none of them is robust enough to handle all the challenges simultaneously. This paper proposes a novel robust tracking algorithm by elegantly fusing the frame-level detection strategy of Tracking-Learning-Detection (TLD) with the systematic model update strategy of the Kernelized Correlation Filter tracker (KCF). The motivation behind the selection of these trackers is their complementary nature in handling tracking challenges. The proposed algorithm efficiently combines the two tracking algorithms based on a conservative correspondence measure with strategic model updates, taking advantage of both and compensating for each tracker's weaknesses with the other's strengths. The proposed fusion approach is quite general, and any complementary tracker (not just KCF) can be fused with TLD to leverage the best performance. Extensive evaluation of the proposed method based on different metrics is carried out on the ALOV300++, Online Tracking Benchmark (OTB) and Visual Object Tracking (VOT2015) datasets, demonstrating its superiority in terms of robustness and success rate compared with state-of-the-art trackers.
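A toy sketch of fusing two trackers via an overlap-based agreement check, loosely standing in for the conservative correspondence measure. The IoU threshold, the confidence rule, and the fallback logic are all assumptions, not the paper's algorithm.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x, y, w, h)."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    iw = max(0.0, min(ax2, bx2) - max(a[0], b[0]))
    ih = max(0.0, min(ay2, by2) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def fuse(box_kcf, box_tld, conf_tld, agree_thresh=0.5):
    """Toy fusion rule: trust KCF's smooth estimate while the two trackers
    agree; fall back to TLD's frame-level detection when they diverge and
    TLD is confident (e.g. after occlusion or drift)."""
    if box_tld is None or iou(box_kcf, box_tld) >= agree_thresh:
        return box_kcf
    return box_tld if conf_tld > 0.5 else box_kcf

# Agreement: overlapping boxes -> keep KCF's estimate.
assert fuse((10, 10, 20, 20), (12, 12, 20, 20), conf_tld=0.9) == (10, 10, 20, 20)
# Divergence with a confident TLD re-detection -> re-initialise from TLD.
assert fuse((10, 10, 20, 20), (100, 100, 20, 20), conf_tld=0.9) == (100, 100, 20, 20)
```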
Citations: 2
Qualitative spatial and temporal reasoning over diagrams for activity recognition
Pub Date : 2016-12-18 DOI: 10.1145/3009977.3010015
Chayanika Deka Nath, S. Hazarika
In the quest for an efficient representation schema for activity recognition in video, we employ techniques combining diagrammatic reasoning (DR) with qualitative spatial and temporal reasoning (QSTR). QSTR allows qualitative abstraction of spatio-temporal relations among objects of interest, and is often thwarted by ambiguous conclusions. 'Diagrams' influence cognitive reasoning by externalizing mental context; hence, QSTR over diagrams holds promise. We define 'diagrams' as explicit representations of objects of interest and their spatial information on a 2D grid. A sequence of 'key diagrams' is extracted, and inter-diagrammatic reasoning operators combine these 'key diagrams' to obtain spatio-temporal information. The qualitative spatial and temporal information thus obtained defines a short-term activity (STA). Several STAs combine to form long-term activities (LTAs). The sequence of STAs, as a feature vector, is used for LTA recognition. We evaluate our approach on six LTAs from the CAVIAR dataset.
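One qualitative spatial abstraction over such 2D-grid 'diagrams' can be sketched as a coarse relation between grid-aligned boxes. The four-relation vocabulary below is an assumption for illustration, not the paper's relation set.

```python
def qualitative_relation(a, b):
    """Coarse qualitative spatial relation between two grid-aligned boxes
    (x1, y1, x2, y2), in the spirit of QSTR over 'diagrams'.

    Returns one of: 'disjoint', 'overlap', 'inside', 'contains'.
    A per-frame sequence of such relations could describe a short-term
    activity (STA). (Hypothetical vocabulary for illustration.)
    """
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    if ax2 <= bx1 or bx2 <= ax1 or ay2 <= by1 or by2 <= ay1:
        return 'disjoint'
    if ax1 >= bx1 and ay1 >= by1 and ax2 <= bx2 and ay2 <= by2:
        return 'inside'
    if bx1 >= ax1 and by1 >= ay1 and bx2 <= ax2 and by2 <= ay2:
        return 'contains'
    return 'overlap'

# Toy CAVIAR-like scenario: a tracked person's box relative to a zone.
person = (2, 2, 4, 6)
shop_zone = (0, 0, 10, 10)
assert qualitative_relation(person, shop_zone) == 'inside'
assert qualitative_relation((0, 0, 2, 2), (5, 5, 7, 7)) == 'disjoint'
```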
Citations: 3
Event recognition in broadcast soccer videos
Pub Date : 2016-12-18 DOI: 10.1145/3009977.3010074
Himangi Saraogi, R. Sharma, Vijay Kumar
Automatic recognition of important events in soccer broadcast videos plays a vital role in many applications, including video summarization, indexing, content-based search, and performance analysis of players and teams. This paper proposes an approach for soccer event recognition using deep convolutional features combined with domain-specific cues. For the deep representation, we use the recently proposed trajectory-pooled deep convolutional descriptor (TDD) [1], which samples and pools discriminatively trained convolutional features around improved trajectories. We further improve performance by incorporating domain-specific knowledge based on the camera view type and its position. The camera position and view type capture the statistics of event occurrence in different play-field regions and at different zoom levels, respectively. We conduct extensive experiments on 6 hours of soccer matches and show the effectiveness of the deep video representation for soccer and the improvements obtained using domain-specific cues.
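One simple way to inject a camera-view cue into a pooled clip descriptor is to concatenate a one-hot view label. The descriptor values, the number of view classes, and the concatenation scheme are hypothetical; the paper's exact combination of TDD features and domain cues may differ.

```python
import numpy as np

def event_feature(tdd_descriptor, view_type, n_views=3):
    """Append a one-hot camera-view cue to a pooled TDD clip descriptor.

    tdd_descriptor: 1-D pooled deep feature for a clip (toy values here).
    view_type: integer camera view/zoom class in [0, n_views).
    The combined vector would then feed an event classifier.
    """
    cue = np.zeros(n_views)
    cue[view_type] = 1.0
    return np.concatenate([tdd_descriptor, cue])

# A 3-D toy descriptor tagged as coming from camera view class 1:
feat = event_feature(np.array([0.2, 0.5, 0.1]), view_type=1)
assert feat.tolist() == [0.2, 0.5, 0.1, 0.0, 1.0, 0.0]
```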
Citations: 10
Immersive augmented reality system for assisting needle positioning during ultrasound guided intervention
Pub Date : 2016-12-18 DOI: 10.1145/3009977.3010023
P. Kanithi, J. Chatterjee, D. Sheet
Ultrasound (US) guided intervention is a surgical procedure in which the clinician uses real-time imaging to track the position of the needle and correct its trajectory, accurately steering it to the lesion of interest. However, the needle is visible in the US image only when aligned in-plane with the scanning plane of the US probe. In practice, clinicians often use a mechanical needle guide, restricting the available degrees of freedom in US probe movement; alternatively, during free-hand procedures, they use multiple needle punctures to achieve this in-plane positioning. Our present work details an augmented reality (AR) system for patient-comfort-centric aid to needle intervention, through an overlaid visualization of the needle trajectory on the US frame prior to insertion. This is implemented by continuous visual tracking of the US probe and the needle in a 3D world coordinate system using fiducial markers. The tracked marker positions are used to draw the needle trajectory and tip, visualized in real time to augment the US feed. Subsequently, the continuously tracked US probe and needle, together with the navigation assistance information, would be overlaid with the visual feed from a head-mounted display (HMD) to generate a totally immersive AR experience for the clinician.
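The overlay step amounts to projecting sampled points along the tracked needle axis into the camera frame. A minimal pinhole-projection sketch, with hypothetical intrinsics and a hypothetical marker-derived tip pose standing in for the fiducial tracking output:

```python
import numpy as np

def project_points(points_3d, K):
    """Project 3-D points (already in camera coordinates) through a
    pinhole intrinsic matrix K; returns (n, 2) pixel coordinates."""
    p = (K @ points_3d.T).T
    return p[:, :2] / p[:, 2:3]

def needle_overlay(tip, direction, K, length=50.0, n=5):
    """Sample points along the tracked needle axis and project them for
    drawing the trajectory overlay. tip and direction would come from the
    fiducial marker pose (toy values here)."""
    ts = np.linspace(0.0, length, n)
    pts = tip[None, :] + ts[:, None] * direction[None, :]
    return project_points(pts, K)

# Hypothetical intrinsics for a 640x480 camera:
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
# Needle tip on the optical axis, pointing straight along +z: every sampled
# point projects to the principal point.
px = needle_overlay(np.array([0.0, 0.0, 100.0]), np.array([0.0, 0.0, 1.0]), K)
assert np.allclose(px, [[320.0, 240.0]] * 5)
```

Drawing line segments through the projected points on the US/HMD feed gives the pre-insertion trajectory overlay.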
Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing, pages 65:1-65:8.
Citations: 9
How much can a Gaussian smoother denoise?
Pub Date : 2016-12-18 DOI: 10.1145/3009977.3010027
S. Gubbi, Ashutosh Gupta, C. Seelamantula
Recently, a suite of increasingly sophisticated methods has been developed to suppress additive noise in images. Most of these methods exploit the sparsity of the underlying signal in a specific transform domain to achieve good visual or quantitative results, applying relatively complex statistical modelling techniques to separate the noise from the signal. In this paper, we demonstrate that a spatially adaptive Gaussian smoother can be a very effective solution to the image denoising problem. To derive the optimal parameter estimates for the Gaussian smoothing kernel, we derive and deploy a surrogate of the mean-squared error (MSE) risk similar to Stein's estimator for Gaussian distributed noise. However, unlike Stein's estimator or its counterparts for other noise distributions, the proposed generic risk estimator (GenRE) uses only the first- and second-order moments of the noise distribution and is agnostic to its exact form. By locally adapting the parameters of the Gaussian smoother, we obtain a denoising function whose performance (quantified by the peak signal-to-noise ratio (PSNR)) is competitive with far more sophisticated methods reported in the literature. To exploit the parallelism offered by the proposed method, we also provide a graphics processing unit (GPU) based implementation.
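The abstract does not spell out GenRE, so the sketch below instead illustrates the general idea of selecting a Gaussian smoothing width by minimizing an unbiased MSE surrogate computed from the noisy data alone — here the well-known Monte-Carlo SURE-style surrogate with a random ±1 divergence probe, on a 1-D signal with known noise standard deviation. This is an assumption-laden stand-in, not the authors' estimator, and it picks one global width rather than adapting spatially:

```python
import numpy as np

def gaussian_smooth(y, sigma):
    """1-D Gaussian smoothing via direct convolution with reflect padding."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    k /= k.sum()
    yp = np.pad(y, radius, mode="reflect")
    return np.convolve(yp, k, mode="valid")

def mc_risk(y, sigma_f, noise_std, rng, eps=1e-3):
    """Monte-Carlo estimate of a SURE-style MSE surrogate:
    ||f(y) - y||^2 / n - noise_std^2 + 2 * noise_std^2 * div f(y) / n,
    with the divergence probed by a random +/-1 vector b."""
    n = y.size
    f = gaussian_smooth(y, sigma_f)
    b = rng.choice([-1.0, 1.0], size=n)
    div = b @ (gaussian_smooth(y + eps * b, sigma_f) - f) / eps
    return np.mean((f - y) ** 2) - noise_std**2 + 2.0 * noise_std**2 * div / n

rng = np.random.default_rng(0)
clean = np.sin(np.linspace(0.0, 4.0 * np.pi, 512))
noisy = clean + 0.3 * rng.standard_normal(512)
sigmas = [0.5, 1, 2, 4, 8]
# Pick the smoothing width that minimizes the estimated risk -- no clean
# signal needed, only the noise standard deviation.
best = min(sigmas, key=lambda s: mc_risk(noisy, s, 0.3, rng))
```

The key property, shared with the paper's setting, is that the risk surrogate is evaluated from the noisy observation and the first two noise moments only.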
Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing, pages 7:1-7:8.
Citations: 0
Journal
Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing