
Latest publications: 2015 IEEE International Symposium on Multimedia (ISM)

Distortion Estimation Using Structural Similarity for Video Transmission over Wireless Networks
Pub Date: 2015-12-01 DOI: 10.1109/ISM.2015.88
Arun Sankisa, A. Katsaggelos, P. Pahalawatta
Efficient streaming of video over wireless networks requires real-time assessment of distortion due to packet loss, especially because predictive coding at the encoder can cause inter-frame propagation of errors and impact the overall quality of the transmitted video. This paper presents an algorithm to evaluate the expected receiver distortion on the source side by utilizing encoder information, transmission channel characteristics and error concealment. Specifically, distinct video transmission units, Groups of Blocks (GOBs), are iteratively built at the source by taking into account macroblock coding modes and motion-compensated error concealment for three different combinations of packet loss. The distortion of these units is then calculated using the structural similarity (SSIM) metric, and the results are stochastically combined to derive the overall expected distortion. The proposed model provides a more accurate estimate of the distortion, closely modeling quality as perceived by the human visual system. When incorporated into a content-aware utility function, preliminary experimental results show improved packet ordering and scheduling efficiency and better overall video quality at the receiver.
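The combination step lends itself to a compact illustration. The sketch below assumes per-unit reconstructions are already available for each packet-loss scenario; it scores each reconstruction against the error-free reference with SSIM (via scikit-image) and weights the resulting distortions by the scenario probabilities. Scenario generation and error concealment are stubbed out, and the function name is hypothetical.

```python
import numpy as np
from skimage.metrics import structural_similarity as ssim

def expected_distortion(reference, reconstructions, probabilities):
    """reference: HxW uint8 frame region (e.g., one GOB), error-free.
    reconstructions: one HxW region per packet-loss scenario.
    probabilities: probability of each scenario, summing to 1."""
    assert abs(sum(probabilities) - 1.0) < 1e-6
    d = 0.0
    for recon, p in zip(reconstructions, probabilities):
        # Use 1 - SSIM as a perceptual distortion score (8-bit data).
        d += p * (1.0 - ssim(reference, recon, data_range=255))
    return d
```

With three scenarios (say, no loss, current packet lost, previous packet lost), the call reduces to a three-term weighted sum.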
Citations: 2
Endoscopic Video Retrieval: A Signature-Based Approach for Linking Endoscopic Images with Video Segments
Pub Date: 2015-12-01 DOI: 10.1109/ISM.2015.21
C. Beecks, Klaus Schöffmann, M. Lux, M. S. Uysal, T. Seidl
In the field of medical endoscopy, more and more surgeons are recording and storing videos of their endoscopic procedures, such as surgeries and examinations, in long-term video archives. To support surgeons in accessing these endoscopic video archives in a content-based way, we propose a simple yet effective signature-based approach: the Signature Matching Distance based on adaptive-binning feature signatures. The proposed distance-based similarity model facilitates an adaptive representation of the visual properties of endoscopic images and allows these properties to be matched efficiently. We conduct an extensive performance analysis of the task of linking specific endoscopic images with video segments and show the high efficacy of our approach. We are able to link more than 88% of the endoscopic images to their corresponding correct video segments, which improves the current state of the art by one order of magnitude.
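As a rough illustration of signature-based matching (a simplified stand-in, not the authors' exact Signature Matching Distance), the sketch below treats a feature signature as a set of (centroid, weight) pairs obtained by adaptively clustering local image features, and sums, in both directions, each centroid's weighted distance to its nearest counterpart in the other signature.

```python
import numpy as np

def directed_match_cost(sig_a, sig_b):
    # Charge each centroid in sig_a its weight times the distance to
    # the nearest centroid in sig_b.
    return sum(w_a * min(np.linalg.norm(c_a - c_b) for c_b, _ in sig_b)
               for c_a, w_a in sig_a)

def signature_matching_distance(sig_a, sig_b):
    # Symmetrized nearest-neighbour matching cost between signatures.
    return directed_match_cost(sig_a, sig_b) + directed_match_cost(sig_b, sig_a)
```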
Citations: 21
Interactive Crowd Content Generation and Analysis Using Trajectory-Level Behavior Learning
Pub Date: 2015-12-01 DOI: 10.1109/ISM.2015.89
Sujeong Kim, Aniket Bera, Dinesh Manocha
We present an interactive approach for analyzing crowd videos and generating content for multimedia applications. Our formulation combines online tracking algorithms from computer vision, non-linear pedestrian motion models from computer graphics, and machine learning techniques to automatically compute the trajectory-level pedestrian behavior of each agent in the video. These learned behaviors are used to detect anomalous behaviors, perform crowd replication, augment crowd videos with virtual agents, and segment the motion of pedestrians. We demonstrate the performance of these tasks using indoor and outdoor crowd video benchmarks consisting of tens of human agents; moreover, our algorithm takes less than a tenth of a second per frame on a multi-core PC. The overall approach can handle dense and heterogeneous crowd behaviors and is useful for real-time crowd scene analysis applications.
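One concrete way to picture trajectory-level behavior is as a small feature vector per tracked agent, against which anomalies can be scored. The sketch below is an illustrative simplification: the feature set and the Mahalanobis-distance scoring are assumptions, not the paper's learned model.

```python
import numpy as np

def trajectory_features(traj, dt=1.0):
    """traj: Nx2 array of tracked (x, y) positions for one agent."""
    v = np.diff(traj, axis=0) / dt                   # per-step velocity
    speed = np.linalg.norm(v, axis=1)
    heading = np.unwrap(np.arctan2(v[:, 1], v[:, 0]))
    turn = np.abs(np.diff(heading))                  # direction changes
    return np.array([speed.mean(), speed.var(), turn.mean()])

def anomaly_score(feat, mean, cov):
    # Mahalanobis distance of one agent from the crowd-wide model.
    diff = feat - mean
    return float(np.sqrt(diff @ np.linalg.inv(cov) @ diff))
```

Fitting mean and cov over all agents in a scene and thresholding anomaly_score then flags outlier pedestrians.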
Citations: 15
Improvement of Image and Video Matting with Multiple Reliability Maps
Pub Date: 2015-12-01 DOI: 10.1109/ISM.2015.28
Takahiro Hayashi, Masato Ishimori, N. Ishii, K. Abe
In this paper, we propose a framework for extending existing matting methods to achieve more reliable alpha estimation. The key idea of the framework is the integration of multiple alpha maps based on their reliabilities. In the proposed framework, the given input image is converted into multiple grayscale images with various luminance appearances. Alpha maps are then generated from these grayscale images using an existing matting method. At the same time, reliability maps (single-channel images visualizing the reliabilities of the estimated alpha values) are generated. Finally, by combining the alpha maps having the highest reliabilities in each local region, one reliable alpha map is produced. The experimental results show that reliable alpha estimation can be achieved with the proposed framework.
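The final combination step can be sketched directly. The version below picks the winning alpha value per pixel rather than per local region, which is a simplification of the method described above.

```python
import numpy as np

def combine_alpha_maps(alpha_maps, reliability_maps):
    """alpha_maps, reliability_maps: arrays of shape (K, H, W), one
    alpha/reliability pair per grayscale conversion of the input."""
    alpha_maps = np.asarray(alpha_maps, dtype=np.float64)
    reliability_maps = np.asarray(reliability_maps, dtype=np.float64)
    # Index of the most reliable estimate at every pixel.
    best = np.argmax(reliability_maps, axis=0)
    # Gather the corresponding alpha values into one map.
    return np.take_along_axis(alpha_maps, best[None, ...], axis=0)[0]
```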
Citations: 0
Automatic Video Content Summarization Using Geospatial Mosaics of Aerial Imagery
Pub Date: 2015-12-01 DOI: 10.1109/ISM.2015.124
R. Viguier, Chung-Ching Lin, H. Aliakbarpour, F. Bunyak, Sharath Pankanti, G. Seetharaman, K. Palaniappan
It is estimated that less than five percent of videos are currently analyzed to any degree. In addition to petabyte-sized multimedia archives, continuing innovations in optics, imaging sensors, camera arrays, (aerial) platforms, and storage technologies indicate that, for the foreseeable future, existing and new applications will continue to generate enormous volumes of video imagery. Contextual video summarization and activity maps offer one innovative direction for tackling this Big Data problem in computer vision. The goal of this work is to develop semi-automatic exploitation algorithms and tools that increase utility, dissemination and usage potential by providing quick dynamic-overview geospatial mosaics and motion maps. We present a framework to summarize (multiple) video streams from unmanned aerial vehicles (UAVs) or drones, which have very different characteristics from the structured commercial and consumer videos analyzed in the past. Using the geospatial metadata of the video combined with fast low-level image-based algorithms, the proposed method first generates mini-mosaics that can then be combined into geo-referenced meta-mosaic imagery. These geospatial maps enable rapid assessment of hours-long videos with arbitrary spatial coverage from multiple sensors by generating quick-look imagery, composed of multiple mini-mosaics, that summarizes spatiotemporal dynamics such as coverage, dwell time and activity. The overall summarization pipeline was tested on several DARPA Video and Image Retrieval and Analysis Tool (VIRAT) datasets. We evaluate the effectiveness of the proposed video summarization framework using metrics such as compression and hours of viewing time.
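Setting the geospatial metadata aside, the purely image-based core of mini-mosaic construction can be sketched with standard OpenCV building blocks: register consecutive frames with a RANSAC homography and warp them into a common reference. This is a generic illustration of frame-to-mosaic registration, not the authors' pipeline.

```python
import cv2
import numpy as np

def pairwise_homography(img_a, img_b, n_features=2000):
    """Homography mapping points of img_a into img_b (ORB + RANSAC)."""
    orb = cv2.ORB_create(n_features)
    kp_a, des_a = orb.detectAndCompute(img_a, None)
    kp_b, des_b = orb.detectAndCompute(img_b, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des_a, des_b), key=lambda m: m.distance)[:200]
    src = np.float32([kp_a[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp_b[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return H

def mini_mosaic(frames, canvas_hw):
    """Warp grayscale frames into the first frame's coordinates."""
    mosaic = cv2.warpPerspective(frames[0], np.eye(3), canvas_hw[::-1])
    h_to_first = np.eye(3)
    for prev, cur in zip(frames, frames[1:]):
        # Chain cur->prev homographies to map cur into the first frame.
        h_to_first = h_to_first @ pairwise_homography(cur, prev)
        warped = cv2.warpPerspective(cur, h_to_first, canvas_hw[::-1])
        mosaic = np.maximum(mosaic, warped)  # crude blend, for illustration
    return mosaic
```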
Citations: 15
Joint Video and Sparse 3D Transform-Domain Collaborative Filtering for Time-of-Flight Depth Maps
Pub Date: 2015-12-01 DOI: 10.1109/ISM.2015.112
T. Hach, Tamara Seybold, H. Böttcher
This paper proposes a novel strategy for depth video denoising in RGBD camera systems. Today's depth map sequences obtained by state-of-the-art Time-of-Flight sensors suffer from high temporal noise. Any high-level RGB video rendering based on the accompanying depth map's 3D geometry, such as an augmented reality application, will exhibit severe temporal flickering artifacts. We address this limitation by decoupling depth map upscaling from the temporal denoising step. Denoising is thereby performed on raw pixels, including uncorrelated pixel-wise noise distributions. Our denoising methodology utilizes joint sparse 3D transform-domain collaborative filtering, in which we extract RGB texture information to yield a more stable and accurate, highly sparse 3D depth block representation for the subsequent shrinkage operation. We show the effectiveness of our method on real RGBD camera data and on a publicly available synthetic data set. The evaluation reveals that our method is superior to state-of-the-art methods, delivering improved flicker-free depth video streams for future applications that are especially sensitive to temporal noise and arbitrary depth artifacts.
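The shrinkage step at the heart of transform-domain collaborative filtering can be illustrated in a heavily reduced, purely temporal form: transform co-located depth samples along time, zero the small coefficients, and invert. The real method groups similar blocks guided by RGB texture and filters sparse 3D stacks; the sketch below shows only the thresholding idea.

```python
import numpy as np
from scipy.fft import dct, idct

def temporal_shrinkage_denoise(depth_stack, threshold):
    """depth_stack: (T, H, W) raw depth frames from a static window.
    Returns a denoised stack of the same shape."""
    coeffs = dct(depth_stack, axis=0, norm="ortho")  # transform along time
    coeffs[np.abs(coeffs) < threshold] = 0.0         # hard thresholding
    return idct(coeffs, axis=0, norm="ortho")
```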
Citations: 1
Location Specification and Representation in Multimedia Databases
Pub Date: 2015-12-01 DOI: 10.1109/ISM.2015.128
H. Samet
Techniques for the specification and representation of the locational component of multimedia data are reviewed. The focus is on how the locational component is specified and on how it is represented. For the specification component, we also discuss textual specifications. For the representation component, the emphasis is on a sorting approach that yields an index to the locational component, where the data includes both points and objects with a spatial extent.
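As one concrete example of a sorting approach to spatial indexing (chosen here for illustration; the talk covers a broader family of representations), a Morton (Z-order) key interleaves the bits of the coordinates so that an ordinary sort or B-tree groups spatially nearby points.

```python
def morton_key(x, y, bits=16):
    """Interleave the low `bits` bits of grid coordinates x and y."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)       # x bits on even positions
        key |= ((y >> i) & 1) << (2 * i + 1)   # y bits on odd positions
    return key

points = [(3, 5), (10, 2), (4, 4)]
points.sort(key=lambda p: morton_key(*p))  # spatially coherent order
```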
Citations: 1
Go Green with EnVI: the Energy-Video Index
Pub Date: 2015-12-01 DOI: 10.1109/ISM.2015.50
Oche Ejembi, S. Bhatti
Video is the most prevalent traffic type on the Internet today. Significant research has been done on measuring users' Quality of Experience (QoE) through different metrics. We take the position that energy use must be incorporated into quality metrics for digital video. We present our novel, energy-aware QoE metric for video, the Energy-Video Index (EnVI), and report EnVI measurements from the playback of a diverse set of online videos. We observe that 4K-UHD (2160p) video can use ~30% more energy on a client device than HD (720p), and up to ~600% more network bandwidth than FHD (1080p), without significant improvement in objective QoE measurements.
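The measurement side of such an index reduces to integrating sampled device power over the playback interval. The sketch below does that; the final quality-per-joule ratio is a hypothetical illustration, not the paper's EnVI definition.

```python
import numpy as np

def energy_joules(power_watts, timestamps_s):
    """Trapezoidal integration of power samples (W) over time (s)."""
    return float(np.trapz(power_watts, timestamps_s))

def quality_per_joule(qoe_score, power_watts, timestamps_s):
    # Hypothetical energy-aware quality ratio, not the published EnVI.
    return qoe_score / energy_joules(power_watts, timestamps_s)
```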
Citations: 3
Portable Lecture Capture that Captures the Complete Lecture
Pub Date: 2015-12-01 DOI: 10.1109/ISM.2015.22
P. Dickson, Chris Kondrat, Ryan B. Szeto, W. R. Adrion, Tung T. Pham, Tim D. Richards
Lecture recording is not a new concept, nor is high-resolution recording of multimedia presentations that include computer and whiteboard material. We describe a novel portable lecture capture system that captures not only computer content and video, as most modern lecture capture systems do, but also content from whiteboards. The whiteboard material is captured at high resolution and processed for clarity, without the electronic whiteboards required by many capture systems. Our system also processes the entire lecture in real time. The system we present is the logical next step in lecture capture technology.
Citations: 2
Evaluation of Feature Detection in HDR Based Imaging Under Changes in Illumination Conditions
Pub Date: 2015-12-01 DOI: 10.1109/ISM.2015.58
A. Rana, G. Valenzise, F. Dufaux
High dynamic range (HDR) imaging captures details in both the dark and the very bright regions of a scene, and is therefore expected to provide higher robustness to illumination changes than conventional low dynamic range (LDR) imaging in tasks such as visual feature extraction. However, it is not clear how large this gain is, or which modalities of using HDR best obtain it. In this paper we evaluate the first block of the visual feature extraction pipeline, i.e., keypoint detection, using both LDR and different HDR-based modalities when significant illumination changes are present in the scene. To this end, we captured a dataset with two scenes and a wide range of illumination conditions. On these images, we measure how the repeatability of corner and blob interest points is affected by the different LDR/HDR approaches. Our observations confirm the potential of HDR over conventional LDR acquisition. Moreover, extracting features directly from HDR pixel values is more effective than first tone mapping and then extracting features, provided that the HDR luminance information has previously been encoded to perceptually linear values.
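For a fixed camera and a static scene, the repeatability measurement itself is straightforward. A minimal sketch (using ORB as a stand-in detector; the paper evaluates corner and blob detectors) counts keypoints that reappear within a small pixel tolerance across the two illumination conditions.

```python
import numpy as np
import cv2

def repeatability(img_a, img_b, tol=2.5, n_features=1000):
    """Fraction of keypoints redetected within tol pixels; assumes the
    two images are aligned (fixed camera, static scene)."""
    orb = cv2.ORB_create(n_features)
    kp_a = orb.detect(img_a, None)
    kp_b = orb.detect(img_b, None)
    pts_a = np.array([k.pt for k in kp_a])
    pts_b = np.array([k.pt for k in kp_b])
    # Pairwise distances; a keypoint "repeats" if any detection in the
    # other image lies within tol pixels of it.
    d = np.linalg.norm(pts_a[:, None, :] - pts_b[None, :, :], axis=2)
    return (d.min(axis=1) < tol).sum() / min(len(kp_a), len(kp_b))
```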
Citations: 27