
Latest publications: 2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)

Eyeball Movement Model for Lecturer Character in Speech-Driven Embodied Group Entrainment System
Yoshihiro Sejima, Tomio Watanabe, M. Jindai, Atsushi Osa
In our previous research, we proposed an eyeball movement model that consists of a saccade model and a group gaze model for enhancing group interaction and communication. In this study, in order to evaluate the effects of the proposed model, we develop an advanced communication system in which the proposed model is used with a speech-driven embodied group entrained communication system. The effectiveness of the proposed model is demonstrated by performing communication experiments with a sensory evaluation using the developed system.
DOI: 10.1109/ISM.2013.99 · Pages: 506-507 · Published: 2013-12-09 · Citations: 0
Visual Quality and File Size Prediction of H.264 Videos and Its Application to Video Transcoding for the Multimedia Messaging Service and Video on Demand
Didier Joset, S. Coulombe
In this paper, we address the problem of adapting video files to meet terminal file size and resolution constraints while maximizing visual quality. First, two new quality estimation models are proposed, which predict quality as a function of resolution, quantization step size, and frame rate parameters. The first model is generic and the second takes video motion into account. Then, we propose a video file size estimation model. Simulation results show a Pearson correlation coefficient (PCC) of 0.956 between the mean opinion score and our generic quality model (0.959 for the motion-conscious model). We obtain a PCC of 0.98 between actual and estimated file sizes. Using these models, we estimate the combination of parameters that yields the best video quality while meeting the target terminal's constraints. We obtain an average quality difference of 4.39% (generic model) and of 3.22% (motion-conscious model) when compared with the best theoretical transcoding possible. The proposed models can be applied to video transcoding for the Multimedia Messaging Service and for video on demand services such as YouTube and Netflix.
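The abstract validates its models by the Pearson correlation coefficient (PCC) between mean opinion scores and predicted quality. A minimal sketch of that correlation metric, using made-up score vectors (the paper's actual data is not available here):

```python
import numpy as np

def pearson_cc(x, y):
    """Pearson correlation coefficient between two score vectors."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

# Hypothetical mean opinion scores and model predictions:
mos       = [3.1, 4.2, 2.5, 3.8, 4.6]
predicted = [3.0, 4.0, 2.8, 3.7, 4.5]
print(round(pearson_cc(mos, predicted), 3))
```

A PCC near 1 (such as the paper's reported 0.956) means the model's predicted quality ranks and scales test videos almost exactly as human viewers do.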
DOI: 10.1109/ISM.2013.62 · Pages: 321-328 · Published: 2013-12-09 · Citations: 2
Evaluation of Image Browsing Interfaces for Smartphones and Tablets
Marco A. Hudelist, Klaus Schöffmann, David Ahlström
Smartphones and tablets are popular devices. As lightweight, compact devices with built-in high-quality cameras, they are ideal to carry around and to use for snapshot photography. As photos quickly accumulate on the device, finding a particular photo can become tedious with the default grid-based photo browser installed on the device. In this paper we investigate user performance in a photo browsing task on an iPad and an iPod Touch device. We present results from two user experiments comparing the standard grid interface to a pan-and-zoomable grid, a 3D globe, and a 3D ring. In particular, we are interested in how the interfaces perform with large photo collections (100 to 400 photos). The results show the most promise for the pan-and-zoom grid, and show that performance with the standard grid interface quickly deteriorates with large collections.
DOI: 10.1109/ISM.2013.11 · Pages: 1-8 · Published: 2013-12-09 · Citations: 4
Requirements for Mobile Learning Applications in Higher Education
André Klassen, Marcus Eibrink-Lunzenauer, Till Gloggler
Mobile learning has gained significant importance in the fields of e-learning and higher education in recent years. Especially for students' self-organization of learning, it is important that previous use cases can be transferred and enhanced to mobile platforms. The first part of the paper presents a requirements analysis for an adapted mobile version of the learning management system (LMS) Stud.IP. The analysis consists of an evaluation of existing approaches, focus group sessions, and student surveys, on the one hand to obtain insights on the subject and on the other to gather students' specific requirements and usage scenarios. The latter part of the paper describes the implementation of an Android-based and a web-based app for the LMS Stud.IP.
DOI: 10.1109/ISM.2013.94 · Pages: 492-497 · Published: 2013-12-09 · Citations: 18
Efficient Content-Based Multimedia Retrieval Using Novel Indexing Structure in PostgreSQL
Fausto Fleites, Shu‐Ching Chen
This demo paper presents a system based on PostgreSQL and the AH-Tree that supports Content-Based Image Retrieval (CBIR) through similarity queries. The AH-Tree is a balanced, tree-based index structure that utilizes high-level semantic information to address the well-known problems of semantic gap and user perception subjectivity. The proposed system implements the AH-Tree inside PostgreSQL's kernel by internally modifying PostgreSQL's GiST access mechanism and thus provides a DBMS with a viable and efficient content-based multimedia retrieval functionality.
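The AH-Tree accelerates similarity queries over image features; its internals are not given in the abstract. As a conceptual stand-in only, a similarity query boils down to k-nearest-neighbour search over feature vectors, sketched here as a naive linear scan (the AH-Tree's index would prune most of these distance computations):

```python
import numpy as np

def knn_query(features, query, k=3):
    """Naive k-nearest-neighbour similarity query over feature vectors.
    Linear-scan stand-in, NOT the AH-Tree itself: a tree-based index
    answers the same query while visiting far fewer candidates."""
    d = np.linalg.norm(features - query, axis=1)   # Euclidean distance to each stored vector
    order = np.argsort(d)[:k]                      # indices of the k closest vectors
    return order, d[order]

feats = np.array([[0., 0.], [1., 0.], [5., 5.], [0.9, 0.1]])
idx, dist = knn_query(feats, np.array([1., 0.]), k=2)
print(idx)  # row indices of the two most similar images
```

In the paper's system, this search runs inside PostgreSQL's kernel via the GiST access mechanism rather than in application code.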
DOI: 10.1109/ISM.2013.96 · Pages: 500-501 · Published: 2013-12-09 · Citations: 2
A JND Profile Based on Hierarchically Selective Attention for Images
Dongdong Zhang, Lijing Gao, D. Zang, Yaoru Sun, Jiujun Cheng
Most traditional just-noticeable-distortion (JND) models in the pixel domain compute the JND threshold by incorporating the spatial luminance adaptation effect and the texture contrast masking effect. Recently, with the rapid development of computable models of visual attention, researchers have started to improve JND models by considering the visual saliency of images; a foveated spatial JND model (FSJND) was proposed that incorporates traditional visual characteristics and the fovea characteristic of human eyes to enhance JND thresholds. However, the thresholds computed by the FSJND model may be overestimated for some high-resolution images. In this paper, we propose a new JND profile in the pixel domain, in which a multi-level modulation function is built to reflect the effect of hierarchically selective visual attention on JND thresholds. Contrast masking is also considered in our modulation function to obtain more accurate JND thresholds. Compared with the latest JND profiles, the proposed model can tolerate more distortion and has much better perceptual quality. The proposed JND model can be easily applied in many areas, such as compression and error protection.
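To make the luminance adaptation effect mentioned above concrete, here is the classic piecewise threshold used in Chou-and-Li-style pixel-domain JND models. This is the baseline term only, not the authors' model: their profile further modulates such thresholds with contrast masking and hierarchically selective attention.

```python
import numpy as np

def luminance_jnd(bg):
    """Luminance-adaptation JND threshold (classic pixel-domain form);
    bg is local background luminance in [0, 255]. The eye tolerates
    larger distortions in very dark and very bright regions, so the
    threshold is high near bg = 0, dips to ~3 around mid-gray, and
    rises slowly again toward bg = 255."""
    bg = np.asarray(bg, float)
    dark  = 17.0 * (1.0 - np.sqrt(bg / 127.0)) + 3.0   # dark half: steeply falling
    light = 3.0 / 128.0 * (bg - 127.0) + 3.0           # bright half: slowly rising
    return np.where(bg <= 127, dark, light)

print(luminance_jnd([0, 127, 255]))  # thresholds at black, mid-gray, white
```

A codec can quantize any pixel error below this threshold away "for free", which is why JND profiles matter for compression.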
DOI: 10.1109/ISM.2013.50 · Pages: 263-266 · Published: 2013-12-09 · Citations: 0
Improved Visibility of Single Hazy Images Captured in Inclement Weather Conditions
Bo-Hao Chen, Shih-Chia Huang
Haze removal is the process by which horizontal obscuration is eliminated from hazy images captured during inclement weather. Sandstorms present a particularly challenging condition: images captured during sandstorms often exhibit color-shift effects due to inadequate spectrum absorption. In this paper, we present a new type of haze removal approach that combines hybrid spectrum analysis with the dark channel prior in order to repair color shifts and thereby achieve effective restoration of hazy images captured during sandstorms. The restoration results and qualitative evaluation demonstrate that our proposed approach provides superior restoration results for images captured during sandstorms in comparison with the previous state-of-the-art approach.
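The dark channel prior that the paper builds on observes that haze-free outdoor images have, in most local patches, at least one color channel near zero; haze lifts that minimum. A minimal sketch of the dark channel computation itself (the paper's contribution — the hybrid spectrum analysis for sandstorm color shifts — is not reproduced here):

```python
import numpy as np

def dark_channel(img, patch=3):
    """Dark channel of an H x W x 3 image: per-pixel minimum over the
    color channels, followed by a minimum filter over a local patch.
    Larger dark-channel values indicate heavier haze at that pixel."""
    h, w, _ = img.shape
    mins = img.min(axis=2)                 # min over R, G, B at each pixel
    pad = patch // 2
    padded = np.pad(mins, pad, mode='edge')
    out = np.empty_like(mins)
    for i in range(h):
        for j in range(w):                 # brute-force minimum filter
            out[i, j] = padded[i:i + patch, j:j + patch].min()
    return out
```

Dehazing methods then estimate atmospheric light and a transmission map from this channel and invert the haze imaging model.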
DOI: 10.1109/ISM.2013.51 · Pages: 267-270 · Published: 2013-12-09 · Citations: 7
Relevance Segmentation of Laparoscopic Videos
Bernd Münzer, Klaus Schöffmann, L. Böszörményi
In recent years, it has become common to record video footage of laparoscopic surgeries. This leads to large video archives that are very hard to manage. They often contain a considerable portion of completely irrelevant scenes that waste storage capacity and hamper efficient retrieval of relevant scenes. In this paper we (1) define three classes of irrelevant segments, (2) propose visual feature extraction methods to obtain irrelevance indicators for each class, and (3) present an extensible framework to detect irrelevant segments in laparoscopic videos. The framework includes a training component that learns a prediction model using nonlinear regression with a generalized logistic function, and a segment composition algorithm that derives segment boundaries from the fuzzy frame classifications. The experimental results show that our method performs very well both for the classification of individual frames and for the detection of segment boundaries in videos, and that it enables considerable storage space savings.
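The abstract describes turning fuzzy per-frame relevance scores into segment boundaries but does not spell out the composition algorithm. A simplified sketch under assumed rules (threshold the scores, keep runs, drop runs shorter than a minimum length):

```python
def segments_from_scores(scores, threshold=0.5, min_len=3):
    """Derive (start, end) frame-index ranges of relevant segments from
    fuzzy per-frame relevance scores. Hypothetical stand-in for the
    paper's segment composition step: frames scoring >= threshold are
    relevant, and runs shorter than min_len frames are discarded as noise."""
    segs, start = [], None
    for i, s in enumerate(scores):
        if s >= threshold and start is None:
            start = i                            # a relevant run begins
        elif s < threshold and start is not None:
            if i - start >= min_len:             # keep only long-enough runs
                segs.append((start, i))
            start = None
    if start is not None and len(scores) - start >= min_len:
        segs.append((start, len(scores)))        # run continues to the end
    return segs

print(segments_from_scores([0.1, 0.9, 0.8, 0.7, 0.2, 0.6, 0.1]))  # → [(1, 4)]
```

Frames outside the returned ranges can then be dropped or stored at lower quality, which is where the storage savings come from.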
DOI: 10.1109/ISM.2013.22 · Pages: 84-91 · Published: 2013-12-09 · Citations: 40
Speeded-Up Video Summarization Based on Local Features
Javier Iparraguirre, C. Delrieux
Digital video has become a very popular medium in several contexts, with an ever-expanding horizon of applications and uses. Thus, the amount of available video data is growing almost without limit. For this reason, video summarization continues to attract a wide spectrum of research efforts. In this work we present a novel video summarization technique based on tracking local features among consecutive frames. Our approach operates in the uncompressed domain and requires only a small set of consecutive frames, and is thus able to process the video stream directly and produce results on the fly. We tested our implementation on standard available datasets and compared the results with the most recent published work in the field. The results show that our proposal produces summaries of similar quality to the best published proposals, with the additional advantage of being able to process the stream directly in the uncompressed domain.
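The core idea — cut a key frame whenever local-feature continuity between consecutive frames breaks — can be sketched with sets of hashable descriptors standing in for real local features (a hypothetical simplification; the paper tracks actual image features, not set members):

```python
def summarize(frames_features, min_overlap=0.5):
    """Pick key frames where local-feature continuity breaks.
    frames_features: one set of (hashable) feature descriptors per frame.
    A frame is kept as a key frame when fewer than min_overlap of the
    previous frame's features are tracked into it, i.e. the content
    changed abruptly. Only two consecutive frames are compared at a
    time, so the stream can be processed on the fly."""
    keys = [0]                                   # first frame always kept
    for t in range(1, len(frames_features)):
        prev, cur = frames_features[t - 1], frames_features[t]
        tracked = len(prev & cur) / max(len(prev), 1)
        if tracked < min_overlap:
            keys.append(t)
    return keys

frames = [{1, 2, 3}, {1, 2, 4}, {7, 8, 9}, {7, 8, 5}]
print(summarize(frames))  # → [0, 2]: frame 2 shares no features with frame 1
```

Because each decision needs only the current and previous frame, memory use stays constant regardless of video length.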
DOI: 10.1109/ISM.2013.70 · Pages: 370-373 · Published: 2013-12-09 · Citations: 13
A Video Text Detection and Tracking System
Tuoerhongjiang Yusufu, Yiqing Wang, Xiangzhong Fang
Faced with increasingly large-scale video databases, retrieving videos quickly and efficiently has become a crucial problem. Video text, which carries high-level semantic information, is an important source of information for this task. In this paper, we introduce a video text detection and tracking approach. With these methods we can obtain clear binary text images that can be processed directly by OCR (Optical Character Recognition) software. Our approach consists of two parts: a stroke-model-based video text detection and localization method, and a SURF (Speeded Up Robust Features)-based text region tracking method. In our detection and localization approach, we use a stroke model and morphological operations to roughly identify candidate text regions, and combine the stroke map and edge response to localize text lines in each candidate region. Several heuristics and an SVM (Support Vector Machine) are used to verify text blocks. The core part of our text tracking method is a fast approximate nearest-neighbour search algorithm over the extracted SURF features. The text-ending frame is determined based on the number of SURF feature points, while text motion estimation is based on correct matches in adjacent frames. Experimental results on a large number of different video clips show that our approach can effectively detect and track both static and scrolling texts.
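The tracking step rests on matching SURF descriptors between adjacent frames by nearest-neighbour search. A brute-force sketch with Lowe's ratio test (a standard rejection rule for ambiguous matches; the paper uses a fast approximate search instead of this exhaustive scan):

```python
import numpy as np

def match_descriptors(d1, d2, ratio=0.7):
    """Match feature descriptors between two frames. For each descriptor
    in d1, find its nearest and second-nearest neighbours in d2 and
    accept the match only if the best distance is clearly smaller than
    the second best (Lowe's ratio test). Brute-force stand-in for the
    paper's fast approximate nearest-neighbour search over SURF features."""
    matches = []
    for i, d in enumerate(d1):
        dist = np.linalg.norm(d2 - d, axis=1)   # distance to every candidate in d2
        j, k = np.argsort(dist)[:2]             # best and second-best candidates
        if dist[j] < ratio * dist[k]:
            matches.append((i, int(j)))
        # else: ambiguous match, discarded
    return matches
```

The surviving matches between adjacent frames then drive the text motion estimation described above.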
DOI: 10.1109/ISM.2013.106 · Pages: 522-529 · Published: 2013-12-09 · Citations: 19