
2015 IEEE International Symposium on Multimedia (ISM): Latest Publications

Feature Level Fusion for Bimodal Facial Action Unit Recognition
Pub Date: 2015-12-01 | DOI: 10.1109/ISM.2015.116
Zibo Meng, Shizhong Han, Min Chen, Yan Tong
Recognizing facial actions from spontaneous facial displays suffers from subtle and complex facial deformations, frequent head movements, and partial occlusions. It is especially challenging when the facial activities are accompanied by speech. Instead of employing information solely from the visual channel, this paper presents a novel fusion framework that exploits information from both the visual and audio channels to recognize speech-related facial action units (AUs). In particular, features are first extracted from the visual and audio channels independently. Then, the audio features are aligned with the visual features to handle the difference in time scales and the time shift between the two signals. Finally, the aligned audio and visual features are integrated via a feature-level fusion framework and used to recognize AUs. Experimental results on a new audiovisual AU-coded dataset demonstrate that the proposed feature-level fusion framework outperforms a state-of-the-art vision-based method in recognizing speech-related AUs, especially those that are "invisible" in the visual channel during speech. The improvement is even more pronounced when the facial images are occluded, which, fortunately, does not affect the audio channel.
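The align-then-concatenate step the abstract describes can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the feature shapes, the nearest-frame alignment, and the fixed time shift `shift_s` are all assumptions.

```python
import numpy as np

def align_audio_to_video(audio_feats, audio_rate, video_rate, shift_s=0.0):
    """Resample per-frame audio features (e.g., MFCCs) to the video frame
    rate, compensating an estimated time shift between the two signals."""
    n_video = int(len(audio_feats) * video_rate / audio_rate)
    aligned = np.empty((n_video, audio_feats.shape[1]))
    for i in range(n_video):
        t = i / video_rate + shift_s                  # video-frame timestamp
        j = int(round(t * audio_rate))                # nearest audio frame
        aligned[i] = audio_feats[min(max(j, 0), len(audio_feats) - 1)]
    return aligned

def fuse(visual_feats, aligned_audio_feats):
    """Feature-level fusion: concatenate the per-frame descriptors,
    ready to feed a per-AU classifier."""
    return np.hstack([visual_feats, aligned_audio_feats])
```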
Citations: 2
Network Adaptive Textured Mesh Generation for Collaborative 3D Tele-Immersion
Pub Date: 2015-12-01 | DOI: 10.1109/ISM.2015.111
Kevin Desai, K. Bahirat, S. Raghuraman, B. Prabhakaran
3D Tele-Immersion (3DTI) has emerged as an efficient environment for virtual interaction and collaboration in a variety of fields such as rehabilitation, education, and gaming. In 3DTI, geographically distributed users are captured using multiple cameras and immersed in a single virtual environment. The quality of experience depends on the available network bandwidth, the quality of the generated 3D model, and the time taken for rendering. In a collaborative environment, achieving high-quality, high-frame-rate rendering while transmitting data to multiple sites with different bandwidths is challenging. In this paper we introduce a network-adaptive textured mesh generation scheme that transmits data of varying quality based on the available bandwidth. To reduce the volume of information transmitted, a visual-quality-based vertex selection approach is used to generate a sparse representation of the user. This sparse representation is then transmitted to the receiver side, where a sweep-line-based technique is used to generate a 3D mesh of the user. High visual quality is maintained by transmitting a high-resolution texture image compressed with a lossy compression algorithm. In our studies, users were unable to notice visual quality variations in the rendered 3D model even at 90% compression.
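As a rough illustration of the bandwidth-adaptive part, one could budget vertices per frame and keep the highest-quality ones. The byte costs, bandwidth split, and quality scores below are assumptions for the sketch, not the paper's parameters.

```python
import numpy as np

def vertex_budget(bandwidth_bps, fps=30, bytes_per_vertex=24, texture_share=0.5):
    """Vertices that fit in one frame's geometry budget, assuming a fixed
    share of the bandwidth is reserved for the compressed texture image."""
    geometry_bytes = bandwidth_bps / 8.0 / fps * (1.0 - texture_share)
    return int(geometry_bytes // bytes_per_vertex)

def select_vertices(vertices, quality_scores, budget):
    """Keep the `budget` vertices with the highest visual-quality scores,
    preserving the original vertex order for downstream meshing."""
    keep = np.argsort(quality_scores)[::-1][:budget]
    return vertices[np.sort(keep)]
```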
Citations: 4
Classquake: Measuring Students' Attentiveness in the Classroom
Pub Date: 2015-12-01 | DOI: 10.1109/ism.2015.24
Kai Michael Höver, M. Mühlhäuser
{"title":"Classquake: Measuring Students' Attentiveness in the Classroom","authors":"Kai Michael Hover, M. Muhlhauser","doi":"10.1109/ism.2015.24","DOIUrl":"https://doi.org/10.1109/ism.2015.24","url":null,"abstract":"","PeriodicalId":250353,"journal":{"name":"2015 IEEE International Symposium on Multimedia (ISM)","volume":"68 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134125097","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Foveated High Efficiency Video Coding for Low Bit Rate Transmission
Pub Date: 2015-12-01 | DOI: 10.1109/ISM.2015.37
I. Cheng, Masha Mohammadkhani, A. Basu, F. Dufaux
This work describes the design and subjective performance of Foveated High Efficiency Video Coding (FHEVC). Even though foveation has been widely used in various forms of compression since the early 1990s, we believe its use to improve HEVC is new. We consider the application of (possibly moving) foveated compression and evaluate scenarios where it can improve the perceptual quality of videos under constrained transmission resources, e.g., bandwidth. A new method to reduce artifacts during remapping is also proposed. The preliminary implementation considers a single fovea only. Experiments summarizing user evaluations are presented to validate our implementation.
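A foveation model of this general kind can be approximated by coding blocks more coarsely as they get farther from the gaze point. The quadratic falloff and the QP-offset mapping below are illustrative assumptions, not the FHEVC design; a moving fovea is handled by recomputing the map each frame.

```python
import numpy as np

def foveation_qp_offsets(height, width, fovea_yx, max_offset=10, fovea_frac=0.15):
    """Per-pixel quantization-parameter offsets that grow with eccentricity
    from the fovea. fovea_yx: (y, x) gaze point in pixels; fovea_frac is the
    fovea radius as a fraction of the image diagonal."""
    ys, xs = np.mgrid[0:height, 0:width]
    diag = np.hypot(height, width)
    ecc = np.hypot(ys - fovea_yx[0], xs - fovea_yx[1]) / (fovea_frac * diag)
    return np.clip(ecc, 0.0, 1.0) ** 2 * max_offset  # 0 at fovea, max at edges
```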
Citations: 1
Frame Synchronization of Live Video Streams Using Visible Light Communication
Pub Date: 2015-12-01 | DOI: 10.1109/ISM.2015.26
Maziar Mehrabi, S. Lafond, Le Wang
With the growth of heterogeneous social media networks and the widespread use of camera-equipped handheld devices, interactive video broadcasting services are emerging on the Internet. When a media server combines and broadcasts live-streaming video content received from heterogeneous camera-equipped devices filming a common scene from different angles, time-based alignment of the audio and video streams is required. Although many techniques and methods for video stream synchronization have been used or proposed, these solutions are not suitable for a non-centralized multi-camera system consisting of, for example, heterogeneous camera-equipped smartphones. This paper proposes a novel approach that harnesses the capabilities of Visible Light Communication (VLC) to provide a robust and efficient way to synchronize video streams, and presents the design and implementation of a VLC-based video synchronization prototype. Synchronization of the different video streams is provided by means of VLC through Light Emitting Diode (LED) lights and digital phone cameras: the necessary information is embedded as light patterns in the video content and can later be extracted by processing the video streams. The main benefit of our approach is the ability to use off-the-shelf cameras, as it does not require any modification of software or hardware components in the camera devices. Moreover, VLC can be exploited to carry other types of information, such as position, so that the receiver of the video stream has a notion of the location in which the video was recorded.
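To make the embedding idea concrete, here is one way a decoder might recover a blink pattern from a recorded LED and turn it into a frame counter. The ROI thresholding and the 16-bit counter format are assumptions for illustration, not the prototype's actual protocol.

```python
import numpy as np

def decode_led_bits(frames, roi):
    """One bit per video frame: 1 if the LED region is brighter than the
    clip's mean level. frames: grayscale 2-D arrays; roi: (y0, y1, x0, x1)."""
    y0, y1, x0, x1 = roi
    levels = np.array([f[y0:y1, x0:x1].mean() for f in frames])
    return (levels > levels.mean()).astype(int)

def counter_from_bits(bits, word_len=16):
    """Read the first full word as a frame counter broadcast by the light.
    Two recordings of the same LED can be offset until their counters agree."""
    return int("".join(str(b) for b in bits[:word_len]), 2)
```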
Citations: 8
Dynamic MCU Placement for Video Conferencing on Peer-to-Peer Network
Pub Date: 2015-12-01 | DOI: 10.1109/ISM.2015.125
Md. Amjad Hossain, J. Khan
In this paper, we investigate a novel Multipoint Video Conferencing (MVC) architecture potentially suitable for Peer-to-Peer (P2P) platforms such as Gnutella. In particular, we present an election protocol (an extension to Gnutella) in which the Multipoint Control Unit (MCU) of the MVC is dynamically migrated among peers as peers join or leave. Simulation results show that this improves overall conferencing performance compared to a system with a static MCU by minimizing total traffic, individual node hotness, and video composition delay.
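A minimal election rule consistent with those goals places the MCU at the peer with the lowest total latency to everyone else. The RTT table and the re-election trigger below are assumptions; the paper's protocol is a Gnutella extension with its own messaging.

```python
def elect_mcu(peers, rtt):
    """Pick the peer minimizing total round-trip time to all other peers.

    peers: list of peer ids; rtt[a][b]: measured RTT between peers a and b.
    Re-running this on every join/leave yields the dynamic MCU migration."""
    return min(peers, key=lambda p: sum(rtt[p][q] for q in peers if q != p))

# Example: peer "b" sits centrally, so it hosts the MCU.
rtt = {"a": {"b": 20, "c": 90}, "b": {"a": 20, "c": 30}, "c": {"a": 90, "b": 30}}
assert elect_mcu(["a", "b", "c"], rtt) == "b"
```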
Citations: 7
Towards an Efficient Algorithm to Get the Chorus of a Salsa Song
Pub Date: 2015-12-01 | DOI: 10.1109/ISM.2015.42
Camilo Arévalo, Gerardo M. Sarria M., M. Mora, Carlos A. Arce-Lopera
Salsa is a well-known musical genre and part of Latin-American cultural identity. To perform a scientific analysis of this genre, the first step is to analyze the structure of Salsa songs; moreover, the most representative part of a Salsa song is its chorus. In this paper we detail the design and implementation of an algorithm for extracting the chorus of any Salsa song.
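Chorus extraction is commonly built on repetition: the chorus is the segment that recurs most often. The sketch below follows that generic recipe with chroma features and a segment self-similarity matrix; it is an assumption about the approach (the paper details its own algorithm, which may differ), and `librosa` is assumed to be available.

```python
import numpy as np
import librosa  # assumed available for audio loading and chroma features

def find_chorus(path, seg_s=4.0):
    """Return the approximate start time (s) of the most-repeated segment."""
    y, sr = librosa.load(path, sr=22050)
    chroma = librosa.feature.chroma_cqt(y=y, sr=sr)        # shape (12, T)
    hop_s = 512 / sr                                       # default hop length
    seg = int(seg_s / hop_s)                               # frames per segment
    n = chroma.shape[1] // seg
    segs = (chroma[:, :n * seg].reshape(12, n, seg)
            .transpose(1, 0, 2).reshape(n, -1))            # one row per segment
    segs /= np.linalg.norm(segs, axis=1, keepdims=True) + 1e-9
    sim = segs @ segs.T                                    # self-similarity
    np.fill_diagonal(sim, 0.0)
    best = sim.sum(axis=1).argmax()                        # most-repeated segment
    return best * seg_s
```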
Citations: 2
Scalable Saliency-Aware Distributed Compressive Video Sensing
Pub Date: 2015-12-01 | DOI: 10.1109/ISM.2015.54
Jin Xu, S. Djahel, Yuansong Qiao
Distributed compressive video sensing (DCVS) is an emerging low-complexity video coding framework which integrates the merits of distributed video coding (DVC) and compressive sensing (CS). Because the human visual system (HVS) is the ultimate receiver of visual signals, we aim to improve the perceptual rate-distortion performance of DCVS by designing a novel scalable saliency-aware DCVS codec. First, we perform saliency estimation on the side information (SI) frame generated at the decoder side and adaptively control the size of the region of interest (ROI) according to the measurement budget by applying a saliency-guided foveation model. Subsequently, based on online estimation of the correlation noise between a non-key frame and its SI, we develop a saliency-aware block compressive sensing scheme to reconstruct the ROI of each non-key frame more accurately. The experimental results reveal that our DCVS codec outperforms legacy DCVS codecs in terms of perceptual rate-distortion performance.
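The measurement-allocation idea can be illustrated by splitting a per-frame budget across blocks in proportion to their saliency. The floor of `n_min` measurements and the seeded Gaussian sensing matrix are illustrative assumptions, not the codec's actual design.

```python
import numpy as np

def allocate_measurements(saliency, total_budget, n_min=8):
    """Give every block at least n_min CS measurements, then spread the
    remaining budget in proportion to per-block saliency scores."""
    extra = total_budget - n_min * len(saliency)
    weights = saliency / saliency.sum()
    return (n_min + np.floor(extra * weights)).astype(int)

def sense_block(block, n_meas, seed=0):
    """y = Phi @ x with a seeded random Gaussian Phi, so the decoder can
    regenerate the same matrix from the shared seed."""
    x = block.ravel().astype(float)
    phi = np.random.default_rng(seed).standard_normal((n_meas, x.size))
    return (phi / np.sqrt(n_meas)) @ x
```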
Citations: 1
A Super-Resolution Method Using Spatio-Temporal Registration of Multi-Scale Components in Consideration of Color-Sampling Patterns of UHDTV Cameras
Pub Date: 2015-12-01 | DOI: 10.1109/ISM.2015.57
Y. Matsuo, S. Sakaida
Ultra-high-definition television (UHDTV) video contains many similar objects in a single frame because its high resolution gives it high self-similarity. In addition, typical UHDTV cameras have a single CMOS sensor with a Bayer or other color-sampling pattern. We therefore propose a super-resolution method that uses single-frame registration between an original image and its multi-scale components, and performs the same registration against the multi-scale components of the past and future frames of that image. The accuracy of the registration is enhanced by compensating the registration results in consideration of the color-sampling patterns of UHDTV cameras. Experiments show that the proposed method provides an objectively better PSNR measurement and a subjectively better appearance than conventional and state-of-the-art super-resolution methods.
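A toy version of the registration ingredient, matching a patch against a search window in another scale or frame, looks like the following. The box-filter pyramid and exhaustive integer-pixel search are stand-ins for illustration, and the Bayer-aware compensation the paper describes is omitted.

```python
import numpy as np

def half_scale(img):
    """One level of a multi-scale decomposition via 2x2 box averaging."""
    a = img.astype(float)
    h, w = a.shape[0] // 2 * 2, a.shape[1] // 2 * 2
    return (a[:h:2, :w:2] + a[1:h:2, :w:2] +
            a[:h:2, 1:w:2] + a[1:h:2, 1:w:2]) / 4.0

def register_patch(patch, window):
    """Exhaustive search: return the (dy, dx) placing `patch` at the
    minimum-SSD position inside `window` (window larger than patch)."""
    ph, pw = patch.shape
    best, best_err = (0, 0), np.inf
    for dy in range(window.shape[0] - ph + 1):
        for dx in range(window.shape[1] - pw + 1):
            err = np.square(window[dy:dy + ph, dx:dx + pw] - patch).sum()
            if err < best_err:
                best, best_err = (dy, dx), err
    return best
```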
Citations: 0
Temporal and Spatial Evolution through Images
Pub Date: 2015-12-01 | DOI: 10.1109/ISM.2015.105
F. Branco, Nuno Correia, A. Rodrigues, João Gouveia, Rui Nóbrega
Image matching algorithms are used in image search, classification, and retrieval, but they are also useful for showing how urban structures evolve over time. Images have the power to illustrate and evoke past events and can be used to show the evolution of structures such as buildings and other elements of the urban landscape. This paper describes a process and a tool that provide a chronological journey through time, given a set of photographs from different time periods. The developed tool can generate visualizations of a geographic location from a set of related images taken at different periods in time. It automatically compares images and establishes relationships between them, and it also offers a semi-automated method to define relationships between parts of images.
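Pairwise matching of the kind described can be prototyped with off-the-shelf local features. The ORB-plus-ratio-test pipeline below is a generic assumption (the paper does not specify its matcher), and OpenCV is assumed to be installed.

```python
import cv2  # OpenCV, assumed available

def match_score(path_a, path_b, ratio=0.75):
    """Fraction of ORB keypoints in image A with a confident match in B,
    usable as a similarity score for linking photos of the same place."""
    orb = cv2.ORB_create(nfeatures=1000)
    img_a = cv2.imread(path_a, cv2.IMREAD_GRAYSCALE)
    img_b = cv2.imread(path_b, cv2.IMREAD_GRAYSCALE)
    kp_a, des_a = orb.detectAndCompute(img_a, None)
    kp_b, des_b = orb.detectAndCompute(img_b, None)
    if des_a is None or des_b is None:
        return 0.0
    matches = cv2.BFMatcher(cv2.NORM_HAMMING).knnMatch(des_a, des_b, k=2)
    good = [m for m in matches
            if len(m) == 2 and m[0].distance < ratio * m[1].distance]
    return len(good) / max(len(kp_a), 1)
```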
Citations: 2