Latest Publications: 2014 IEEE 16th International Workshop on Multimedia Signal Processing (MMSP)

Autonomous virtual humans and social robots in telepresence
Pub Date: 2014-11-20 DOI: 10.1109/MMSP.2014.6958836
N. Magnenat-Thalmann, Zerrin Yumak, Aryel Beck
Telepresence refers to the possibility of feeling present in a remote location through the use of technology. This can be achieved by immersing a user in a place reconstructed in 3D. The reconstructed place can be captured from the real world or can be completely virtual. Another way to realize telepresence is by using robots and virtual avatars that act as proxies for real people. When human-mediated interaction is not needed or not possible, the virtual human and the robot can rely on artificial intelligence to act and interact autonomously. This paper discusses these forms of telepresence, how they relate to and differ from each other, and how autonomy takes place in telepresence. The paper concludes with an overview of the ongoing research on autonomous virtual humans and social robots conducted in the BeingThere Centre.
Citations: 10
Grabcut-based abandoned object detection
Pub Date: 2014-11-20 DOI: 10.1109/MMSP.2014.6958806
K. Muchtar, Chih-Yang Lin, C. Yeh
This paper presents a detection-based method to segment abandoned objects from a surveillance scene. Unlike tracking-based approaches, which are often complicated and unreliable in crowded scenes, the proposed method employs background (BG) modelling and focuses only on immobile objects. The main contribution of our work is an abandoned object detection system that is robust and resists interference (shadows, illumination changes, and occlusion). In addition, we introduce an MRF model and shadow removal to our system. The MRF is a promising way to model neighbourhood information when labeling each pixel as either background or abandoned object: it represents the correlation and dependency between a pixel and its neighbours. As shown in the experimental part, incorporating the MRF model lets our method efficiently reduce false alarms. To evaluate the system's robustness, several datasets, including the CAVIAR dataset and outdoor test cases, are used in our experiments.
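A minimal sketch of this kind of pipeline, assuming Python with OpenCV and NumPy: MOG2 background modelling with its shadow label discarded, a per-pixel stillness counter as the staticness cue, and GrabCut seeded from the static mask for the final segmentation. The `static_thresh` value and the seeding scheme are illustrative choices, not the paper's tuned parameters.

```python
import cv2
import numpy as np

bg = cv2.createBackgroundSubtractorMOG2(history=500, detectShadows=True)
stillness = None  # per-pixel count of consecutive foreground frames

def process_frame(frame, static_thresh=150):
    """Return a GrabCut-refined mask of foreground that has stayed still."""
    global stillness
    fg = bg.apply(frame)
    fg = np.where(fg == 255, 1, 0).astype(np.uint8)  # drop shadow label (127)
    if stillness is None:
        stillness = np.zeros_like(fg, dtype=np.int32)
    stillness = (stillness + 1) * fg       # reset count wherever pixel is BG
    static = (stillness > static_thresh).astype(np.uint8)
    if static.sum() == 0:
        return None
    # Seed GrabCut: long-static pixels are probable FG, the rest probable BG.
    gc_mask = np.where(static == 1, cv2.GC_PR_FGD, cv2.GC_PR_BGD).astype(np.uint8)
    bgd, fgd = np.zeros((1, 65), np.float64), np.zeros((1, 65), np.float64)
    cv2.grabCut(frame, gc_mask, None, bgd, fgd, 5, cv2.GC_INIT_WITH_MASK)
    return np.isin(gc_mask, (cv2.GC_FGD, cv2.GC_PR_FGD)).astype(np.uint8)
```

Calling `process_frame` on each frame of a stream returns a refined silhouette once a region has stayed in the foreground long enough; seeding GrabCut from the stillness mask rather than a rectangle lets the graph-cut step sharpen the coarse static blob.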
Citations: 7
QoE-driven performance analysis of cloud gaming services
Pub Date: 2014-11-20 DOI: 10.1109/MMSP.2014.6958835
Zi-Yi Wen, Hsu-Feng Hsiao
With the popularity of cloud computing services and the endorsement of the video game industry, cloud gaming services have emerged promisingly. In a cloud gaming service, the contents of games can be delivered to clients through either video streaming or file streaming. Due to the strict constraint on end-to-end latency for real-time interaction in a game, designing a successful cloud gaming system that delivers satisfying quality of experience to customers remains challenging. In this paper, we develop a methodology for the subjective and objective evaluation and analysis of cloud gaming services. The methodology is nonintrusive and can therefore be used on different kinds of cloud gaming systems. Objective measurement of important QoS factors is challenging because most commercial cloud gaming systems are proprietary and closed. In addition, satisfactory QoE is one of the crucial ingredients in the success of cloud gaming services. By combining subjective and objective evaluation results, cloud gaming system developers can infer likely QoE levels from the measured QoS factors. The methodology can also be used in an expert system for choosing the list of games that customers can appreciate in a given environment, as well as for deciding the upper bound on the number of users in a system.
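As a sketch of how measured QoS factors could be mapped to inferred QoE levels, the snippet below fits a simple linear model from hypothetical session measurements to subjective MOS ratings. The paper does not prescribe this particular model; the feature set (latency, frame rate, bitrate) and all values are assumptions for illustration.

```python
import numpy as np

# Measured QoS per session: [end-to-end latency (ms), frame rate (fps), bitrate (Mbps)]
qos = np.array([[ 60, 60, 10],
                [120, 45,  6],
                [250, 30,  3],
                [400, 20,  1]], dtype=float)
mos = np.array([4.5, 3.9, 2.8, 1.7])  # subjective ratings (1-5) per session

X = np.column_stack([np.ones(len(qos)), qos])  # add intercept term
w, *_ = np.linalg.lstsq(X, mos, rcond=None)    # fit MOS ~= X @ w

def predict_qoe(latency_ms, fps, mbps):
    """Infer a QoE (MOS) level from nonintrusively measured QoS factors."""
    return float(np.array([1.0, latency_ms, fps, mbps]) @ w)

print(predict_qoe(150, 40, 5))  # inferred QoE for an unseen operating point
```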
Citations: 24
2D/3D AudioVisual content analysis & description
Pub Date: 2014-11-20 DOI: 10.1109/MMSP.2014.6958837
I. Pitas, K. Papachristou, N. Nikolaidis, M. Liuni, L. Benaroya, G. Peeters, A. Röbel, A. Linnemann, Mohan Liu, S. Gerke
In this paper, we propose a way of using the Audio-Visual Description Profile (AVDP) of the MPEG-7 standard for 2D or stereo video and multichannel audio content description. Our aim is to provide means of using AVDP in such a way that 3D video and audio content can be correctly and consistently described. Since AVDP semantics do not include ways of dealing with 3D audiovisual content, we propose a new semantic framework within AVDP and present examples of using AVDP to describe the results of analysis algorithms on stereo video and multichannel audio content.
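To make the descriptor idea concrete, here is an illustrative sketch that emits an MPEG-7-style XML fragment for a stereo video segment using Python's standard library. The element and attribute names below are simplified placeholders, not the literal AVDP schema, and the disparity extension is a hypothetical example of the kind of 3D metadata the proposed framework would carry.

```python
import xml.etree.ElementTree as ET

# Build a toy MPEG-7-like description tree (names are placeholders).
root = ET.Element("Mpeg7")
desc = ET.SubElement(root, "Description")
video = ET.SubElement(desc, "MultimediaContent", type="StereoVideo")
seg = ET.SubElement(video, "VideoSegment", id="shot_001")
ET.SubElement(seg, "MediaTime", start="00:00:05", duration="PT4S")
# Hypothetical 3D extension: per-segment disparity statistics for stereo content.
ET.SubElement(seg, "DisparityRange", min="-12", max="35")

print(ET.tostring(root, encoding="unicode"))
```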
Citations: 1
Estimating spatial layout of rooms from RGB-D videos
Pub Date: 2014-11-20 DOI: 10.1109/MMSP.2014.6958786
Anran Wang, Jiwen Lu, Jianfei Cai, G. Wang, Tat-Jen Cham
Spatial layout estimation of indoor rooms plays an important role in many visual analysis applications such as robotics and human-computer interaction. While many methods have been proposed for recovering the spatial layout of rooms in recent years, their performance is still far from satisfactory due to the heavy occlusion caused by objects that clutter the scene. In this paper, we propose a new approach to estimating the spatial layout of rooms from RGB-D videos. Unlike the still images used by most existing methods, RGB-D videos provide additional spatio-temporal and depth information, which helps improve estimation performance because more contextual information can be exploited. Given an RGB-D video, we first estimate the spatial layout of the scene in each frame and compute the camera trajectory using a simultaneous localization and mapping (SLAM) algorithm. The estimated spatial layouts of different frames are then integrated to infer a temporally consistent layout of the room throughout the whole video. Our method is evaluated on the NYU RGB-D dataset, and the experimental results show the efficacy of the proposed approach.
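The temporal-integration step can be sketched as follows, assuming NumPy: per-frame wall-plane estimates expressed in camera coordinates are mapped into the world frame using the SLAM poses and then averaged into one consistent estimate. The plane parameterization and the camera-to-world convention are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def plane_to_world(n_c, d_c, R, t):
    """Plane n_c . x + d_c = 0 in camera coords -> world coords,
    given the camera-to-world pose x_w = R @ x_c + t."""
    n_w = R @ n_c
    d_w = d_c - n_w @ t
    return n_w, d_w

def fuse_layout(per_frame_planes, poses):
    """Average one wall's plane over frames for a temporally consistent layout."""
    ns, ds = [], []
    for (n_c, d_c), (R, t) in zip(per_frame_planes, poses):
        n_w, d_w = plane_to_world(np.asarray(n_c, float), d_c, R, t)
        s = np.linalg.norm(n_w)
        ns.append(n_w / s)   # normalize before averaging
        ds.append(d_w / s)
    n = np.mean(ns, axis=0)
    n /= np.linalg.norm(n)
    return n, float(np.mean(ds))
```

Averaging in the world frame is what removes per-frame jitter: each frame votes on the same physical wall once its estimate is expressed in common coordinates.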
Citations: 1
Block-based compressive sensing of video using local sparsifying transform
Pub Date: 2014-11-20 DOI: 10.1109/MMSP.2014.6958826
Chien Van Trinh, V. Nguyen, B. Jeon
Block-based compressive sensing is attractive for sensing natural images and video because it makes large images and video tractable. However, its reconstruction performance still leaves much room for improvement. This paper proposes a new block-based compressive video sensing recovery scheme that can reconstruct video sequences with high quality. It generates initial key frames by combining augmented Lagrangian total variation with a nonlocal means filter, which is well known for preserving edges while reducing noise. Additionally, a local principal component analysis (PCA) transform is employed to enhance detail information. The non-key frames are initially predicted from their measurements and the reconstructed key frames. Furthermore, regularization with PCA-transform-aided side information iteratively seeks a better reconstruction. Simulation results demonstrate the effectiveness of the proposed scheme.
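A minimal NumPy sketch of the block-based sensing model that such a recovery scheme builds on: each B×B block is vectorized and measured with a shared random Gaussian matrix at a chosen subrate M/N. The recovery stages named in the abstract are only outlined in comments, not implemented.

```python
import numpy as np

def sense_blocks(image, B=16, subrate=0.25, seed=0):
    """Block-based CS acquisition: y = Phi @ x for each B x B block."""
    H, W = image.shape
    N = B * B
    M = int(round(subrate * N))
    rng = np.random.default_rng(seed)
    Phi = rng.standard_normal((M, N)) / np.sqrt(M)  # shared per-block sensing matrix
    measurements = []
    for i in range(0, H - B + 1, B):
        for j in range(0, W - B + 1, B):
            x = image[i:i+B, j:j+B].reshape(-1)
            measurements.append(Phi @ x)
    return Phi, np.array(measurements)

# Recovery (per the paper's outline, not implemented here):
# 1. key frames: augmented-Lagrangian TV solve, nonlocal-means filtering,
#    then local PCA thresholding to restore detail;
# 2. non-key frames: predict from measurements plus reconstructed key frames,
#    iterating with PCA-aided side information toward a better reconstruction.
```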
Citations: 2
Soccer video summarization based on cinematography and motion analysis
Pub Date: 2014-11-20 DOI: 10.1109/MMSP.2014.6958804
Ngoc Nguyen, A. Yoshitaka
Summarization of soccer videos has been widely studied due to soccer's worldwide audience and potential commercial applications. Most existing methods focus on searching for highlight events in soccer videos, such as goals and penalty kicks, and on generating a summary as a list of such events. However, besides highlight events, scenes of intense competition between the two teams and emotional moments are also interesting. In this paper, we propose a soccer summarization system that captures highlight events, scenes of intense competition, and emotional moments. Based on the flow of soccer games, we organize a video summary as follows: first, scenes of intense competition; second, what events happened; third, who was involved in the events; and finally, how players or the audience reacted to them. With this structure, the generated summary is more complete and interesting because it provides both game play and emotional moments. Our system takes broadcast video as input and divides it into multiple clips based on cinematographic features such as sports video production techniques, shot transitions, and camera motion. The system then evaluates the interest level of each clip to generate a summary. Experiments and a subjective evaluation are carried out to assess the quality of the generated summaries and the effectiveness of the proposed interest-level measure.
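The final selection step can be sketched as a greedy budgeted choice over scored clips, as below. The scores and durations are hypothetical inputs standing in for the interest-level measure, and the paper does not specify this particular selection rule.

```python
def select_summary(clips, budget_sec):
    """clips: list of (start_sec, duration_sec, interest_score).
    Greedily keep the highest-scoring clips within the duration budget."""
    ranked = sorted(clips, key=lambda c: c[2], reverse=True)
    chosen, used = [], 0.0
    for clip in ranked:
        if used + clip[1] <= budget_sec:
            chosen.append(clip)
            used += clip[1]
    return sorted(chosen, key=lambda c: c[0])  # restore timeline order

clips = [(12, 8, 0.9), (45, 15, 0.7), (80, 10, 0.95), (130, 20, 0.4)]
print(select_summary(clips, budget_sec=30))
```

Re-sorting the chosen clips by start time preserves the narrative flow (competition, event, participants, reaction) that the summary structure relies on.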
Citations: 13
Gaze direction estimation from static images
Pub Date: 2014-11-20 DOI: 10.1109/MMSP.2014.6958803
Krystian Radlak, M. Kawulok, B. Smolka, Natalia Radlak
This study presents a novel multilevel algorithm for gaze direction recognition from static images. The proposed solution consists of three stages: (i) eye pupil localization using a multistage ellipse detector combined with a support vector machine verifier, (ii) eye bounding box localization calculated using a hybrid projection function, and (iii) gaze direction classification using support vector machines and random forests. The proposed method has been tested on the Eye-Chimera database with very promising results. Extensive tests show that eye bounding box localization allows us to achieve highly accurate results in terms of both eye localization and gaze direction classification.
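A minimal sketch of stage (ii), assuming NumPy: a hybrid projection profile that mixes the mean-intensity (integral) projection with the variance projection over image rows, thresholded to find the vertical extent of the eye. The mixing weight and threshold are illustrative choices, not the paper's tuned values.

```python
import numpy as np

def hybrid_projection(gray, alpha=0.6, axis=1):
    """Mix integral and variance projections into one response profile."""
    mean_proj = gray.mean(axis=axis).astype(float)  # integral projection
    var_proj = gray.var(axis=axis).astype(float)    # variance projection
    inv_mean = mean_proj.max() - mean_proj          # dark rows score high
    norm = lambda v: (v - v.min()) / (np.ptp(v) + 1e-9)
    return (1 - alpha) * norm(inv_mean) + alpha * norm(var_proj)

def eye_rows(gray, frac=0.5):
    """Rows whose hybrid response exceeds frac of the peak: the eye band."""
    h = hybrid_projection(gray, axis=1)
    idx = np.flatnonzero(h >= frac * h.max())
    return idx.min(), idx.max()
```

The variance term responds to the strong light/dark alternation (sclera, iris, lids) that pure intensity projections miss, which is the usual motivation for the hybrid form.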
Citations: 9
Shot type characterization in 2D and 3D video content
Pub Date: 2014-11-20 DOI: 10.1109/MMSP.2014.6958788
Ioannis Tsingalis, A. Tefas, N. Nikolaidis, I. Pitas
Due to the enormous increase of video and image content on the web in recent decades, automatic video annotation has become a necessity. Successful annotation of video and image content facilitates successful indexing and retrieval in search databases. In this work we study a variety of possible shot type characterizations that can be assigned to a single video frame or still image. Possible ways to propagate these characterizations to a video segment (or to an entire shot) are also discussed. A method for the detection of Over-the-Shoulder shots in 3D (stereo) video is also proposed.
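One simple way to propagate frame-level characterizations to an entire shot, of the kind discussed above, is a majority vote over the frames; the sketch below assumes hypothetical label strings.

```python
from collections import Counter

def shot_label(frame_labels):
    """Assign the shot the most frequent per-frame characterization."""
    (label, _), = Counter(frame_labels).most_common(1)
    return label

print(shot_label(["close-up", "close-up", "medium", "close-up"]))  # close-up
```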
Citations: 10
A hybrid approach to animating the murals with Dunhuang style
Pub Date: 2014-11-20 DOI: 10.1109/MMSP.2014.6958789
Bingwen Jin, Linglong Feng, Gang Liu, Huaqing Luo, Wei-dong Geng
In order to animate the valuable murals of the Dunhuang Mogao Grottoes, we propose a hybrid approach to creating animation in the artistic style of the murals. Its key point is the fusion of 2D and 3D animation assets, for which a hybrid model is constructed from a 2.5D model, a 3D model, and registration information. The 2.5D model, created from 2D multi-view drawings, is composed of 2.5D strokes. For each 2.5D stroke, we let the user draw corresponding strokes on the surface of the 3D model in multiple views. The method then automatically generates registration information, which enables 3D animation assets to animate the 2.5D model. Finally, animated line drawings are produced from the 2.5D and 3D models respectively and blended under the control of per-stroke weights. The user can manually modify the weights to obtain the desired animation style.
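A minimal sketch of the final per-stroke blending, assuming NumPy and pre-rendered grayscale stroke images of equal size (hypothetical inputs): each stroke is mixed as w·(2.5D rendering) + (1−w)·(3D rendering) and composited into the frame.

```python
import numpy as np

def blend_strokes(strokes_25d, strokes_3d, weights):
    """Per-stroke blend of corresponding 2.5D and 3D line-drawing renderings."""
    out = np.zeros_like(strokes_25d[0], dtype=float)
    for s25, s3, w in zip(strokes_25d, strokes_3d, weights):
        out = np.maximum(out, w * s25 + (1 - w) * s3)  # composite stroke by stroke
    return out
```

Keeping one weight per stroke, rather than one global weight, is what lets the user pull individual lines toward the flat mural look or the 3D-driven motion.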
Citations: 0