
Proceedings of the 21st ACM international conference on Multimedia: Latest Publications

SwarmVision: autonomous aesthetic multi-camera interaction
Pub Date : 2013-10-21 DOI: 10.1145/2502081.2502192
G. Legrady, D. Bazo, Marco Pinter
A platform of exploratory networked robotic cameras was created, utilizing an aesthetic approach to experimentation. Initiated by research in autonomous swarm robotic camera behavior, SwarmVision is an installation consisting of multiple Pan-Tilt-Zoom cameras on rails positioned above spectators in an exhibition space, where each camera behaves autonomously based on its own rules of computer vision and control. Each of the cameras is programmed to detect visual information of interest based on a different algorithm, and each negotiates with the other two, influencing what subject matter to study in a collective way. The emergent behaviors of the system illustrate an ongoing process of scene
Citations: 0
Beyond bag of words: image representation in sub-semantic space
Pub Date : 2013-10-21 DOI: 10.1145/2502081.2502132
Chunjie Zhang, Shuhui Wang, Chao Liang, J. Liu, Qingming Huang, Haojie Li, Q. Tian
Due to the semantic gap, low-level features are not able to semantically represent images well. Besides, traditional semantics-related image representations may not cope with large inter-class variations and are not very robust to noise. To solve these problems, in this paper we propose a novel image representation method in the sub-semantic space. First, exemplar classifiers are trained by separating each training image from the others and serve as a weak semantic similarity measure. Then a graph is constructed by combining the visual similarity and weak semantic similarity of these training images. We partition this graph into visually and semantically similar sub-sets. Each sub-set of images is then used to train classifiers in order to separate this sub-set from the others. The learned sub-set classifiers are then used to construct a sub-semantic-space-based representation of images. This sub-semantic space is not only more semantically meaningful but also more reliable and resistant to noise. Finally, we categorize images using this sub-semantic-space-based representation on several public datasets to demonstrate the effectiveness of the proposed method.
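The exemplar-classifier step described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: a ridge least-squares fit stands in for whatever classifier the paper actually trains, and the function name and regularization constant are assumptions.

```python
import numpy as np

def exemplar_classifiers(X, reg=1.0):
    """One linear classifier per training image: separate image i (label +1)
    from all other images (label -1). X holds one feature row per image.
    A ridge least-squares fit is used here as a simple stand-in classifier."""
    n, d = X.shape
    A = X.T @ X + reg * np.eye(d)       # shared regularized Gram matrix
    W = np.zeros((n, d))
    for i in range(n):
        y = -np.ones(n)
        y[i] = 1.0                      # image i is the sole positive example
        W[i] = np.linalg.solve(A, X.T @ y)
    return W  # W[i] @ x scores the weak semantic similarity of x to exemplar i
```

The scores `W[i] @ x` would then feed the weak-semantic-similarity edges of the graph that the method partitions into sub-sets.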
Citations: 4
Locality preserving verification for image search
Pub Date : 2013-10-21 DOI: 10.1145/2502081.2502140
Shanmin Pang, Jianru Xue, Nanning Zheng, Q. Tian
Establishing correct correspondences between two images has a wide range of applications, such as 2D and 3D registration, structure from motion, and image retrieval. In this paper, we propose a new matching method based on spatial constraints. The proposed method has linear time complexity and is efficient when applied to image retrieval. The main assumption behind our method is that the local geometric structure between a feature point and its neighbors is not easily affected by geometric or photometric transformations, and thus should be preserved in the corresponding images. We model this local geometric structure by the linear coefficients that reconstruct the point from its neighbors. The method is flexible, as it can not only estimate the number of correct matches between two images efficiently, but also determine the correctness of each match accurately. Furthermore, it is simple and easy to implement. When applied to re-ranking images in an image search engine, it outperforms the state-of-the-art techniques.
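The "linear coefficients that reconstruct the point from its neighbors" can be computed in closed form, in the style of locally linear embedding. The sketch below is an assumption about that step, not the paper's code; the regularization scheme is one common choice.

```python
import numpy as np

def reconstruction_weights(point, neighbors, reg=1e-3):
    """Solve for coefficients w that reconstruct `point` from its k neighbors:
    minimize ||point - neighbors.T @ w||^2 subject to sum(w) = 1 (LLE-style).
    `neighbors` is a (k, d) array; `point` is a (d,) vector."""
    G = neighbors - point                        # neighbors shifted to the point
    C = G @ G.T                                  # local Gram matrix (k x k)
    C = C + reg * np.trace(C) * np.eye(len(C))   # regularize for stability
    w = np.linalg.solve(C, np.ones(len(C)))
    return w / w.sum()                           # enforce the sum-to-one constraint
```

Comparing these weight vectors between putatively matching points in two images gives the geometric verification signal.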
Citations: 2
Order preserving hashing for approximate nearest neighbor search
Pub Date : 2013-10-21 DOI: 10.1145/2502081.2502100
Jianfeng Wang, Jingdong Wang, Nenghai Yu, Shipeng Li
In this paper, we propose a novel method to learn similarity-preserving hash functions for approximate nearest neighbor (NN) search. The key idea is to learn hash functions by maximizing the alignment between the similarity orders computed in the original space and those in the Hamming space. The problem of mapping the NN points into different hash codes is cast as a classification problem in which the points are categorized into several groups according to their Hamming distances to the query. The hash functions are optimized from the classifiers pooled over the training points. Experimental results demonstrate the superiority of our approach over existing state-of-the-art hashing techniques.
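The grouping-by-Hamming-distance framing can be illustrated with a few lines of Python. This sketch only shows the bucketing step on fixed binary codes (stored as Python ints); learning the hash functions themselves is the paper's contribution and is not reproduced here.

```python
def hamming(a, b):
    """Hamming distance between two binary codes stored as ints."""
    return bin(a ^ b).count("1")

def group_by_hamming(query_code, codes, num_groups):
    """Bucket database codes by Hamming distance to the query, mirroring the
    paper's framing of NN assignment as classification into distance groups.
    Distances beyond the last group are clamped into it."""
    groups = [[] for _ in range(num_groups)]
    for i, c in enumerate(codes):
        d = min(hamming(query_code, c), num_groups - 1)
        groups[d].append(i)
    return groups
```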
Citations: 111
CollARt: a tool for creating 3D photo collages using mobile augmented reality
Pub Date : 2013-10-21 DOI: 10.1145/2502081.2502154
A. Marzo, Oscar Ardaiz
A collage is an artistic composition made by assembling different parts to create a new whole. This procedure can also be applied to assembling three-dimensional objects. In this paper we present CollARt, a mobile augmented reality application that permits users to create 3D photo collages. Virtual pieces are textured with pictures taken with the camera and can be blended with real objects. A preliminary user study (N=12) revealed that participants were able to create interesting works of art. The evaluation also suggested that the possibility of itinerantly mixing virtual pieces with the real world increases creativity.
Citations: 2
Con-text: text detection using background connectivity for fine-grained object classification
Pub Date : 2013-10-21 DOI: 10.1145/2502081.2502197
Sezer Karaoglu, J. V. Gemert, T. Gevers
This paper focuses on fine-grained classification by detecting photographed text in images. We introduce a text detection method that does not try to detect all possible foreground text regions but instead aims to reconstruct the scene background to eliminate non-text regions. Object cues such as color, contrast, and objectness are used in combination with a random forest classifier to detect background pixels in the scene. Results on two publicly available datasets, ICDAR03 and a fine-grained Building subcategory of ImageNet, show the effectiveness of the proposed method.
Citations: 20
Augmented and interactive video playback based on global camera pose
Pub Date : 2013-10-21 DOI: 10.1145/2502081.2502269
Junsheng Fu, Lixin Fan, Yu You, Kimmo Roimela
This paper proposes a video playback system that allows users to expand the field of view to surrounding environments not visible in the original video frame, arbitrarily change the viewing angle, and see superimposed point-of-interest (POI) data in an augmented reality manner during video playback. The processing consists of two main steps: first, the client uploads a video to the GeoVideo Engine, which extracts the geo-metadata and returns it to the client; second, the client requests POIs from the server and then renders the video with the POIs.
Citations: 1
AirTouch panel: a re-anchorable virtual touch panel
Pub Date : 2013-10-21 DOI: 10.1145/2502081.2502164
Shih-Yao Lin, Chuen-Kai Shie, Shen-Chi Chen, Y. Hung
To achieve maximum mobility, device-less approaches to home appliance remote control have received increasing attention in recent years. In this paper, we propose a screen-less virtual touch panel, called AirTouch Panel, which can be positioned at any place and in various orientations around the user. The proposed virtual touch panel offers the potential to remotely control home appliances such as televisions and air conditioners. The system allows users to anchor the panel in a comfortable pose; if users want to change the panel's position or orientation, they only need to re-anchor it, and the panel will be reset. Our main contribution is the design of a re-anchorable virtual panel for digital home remote control. Most importantly, we explore the design of such an imaginary interface through two user studies, in which we analyze task completion time, satisfaction rate, and the number of miss-clicks. We are interested in feasibility issues such as the proper click gesture, panel size, and button size. Moreover, based on the AirTouch Panel, we also developed an intelligent TV to demonstrate its usability for controlling home appliances.
Citations: 13
Virtual director technology for social video communication and live event broadcast production
Pub Date : 2013-10-21 DOI: 10.1145/2502081.2502213
Rene Kaiser
This thesis investigates several aspects of Virtual Director technology, i.e. software capable of intelligent real-time selection of live media streams. It addresses several research questions in this interdisciplinary field with respect to how a generic Virtual Director framework can be constructed, and how its behavior can be modeled and formalized to realize professional applications with many parallel users within real-time constraints. Prototypes have been built for group videoconferencing and live event broadcast. The engine executes cinematic principles aiming to enhance the user experience. In group videoconferencing, a Virtual Director supports communication goals by selecting from multiple available streams, i.e. automating cuts between shots according to the communication situation. In event broadcast, it enables personalization by framing, animating, and cutting virtual camera views as crops from a high-resolution panorama. While the technical approach and framework have been evaluated in lab experiments, further evaluation involving potential users and cinematic professionals is ongoing.
Citations: 2
Fitted spectral hashing
Pub Date : 2013-10-21 DOI: 10.1145/2502081.2502169
Yu Wang, Sheng Tang, Yalin Zhang, Jintao Li, DanYi Chen
Spectral hashing (SpH) is an efficient and simple binary hashing method which assumes that data are sampled from a multidimensional uniform distribution. However, this assumption is too restrictive in practice. In this paper we propose an improved method, Fitted Spectral Hashing, that relaxes this distribution assumption. Our work is based on the fact that one-dimensional data of any distribution can be mapped to a uniform distribution without changing the local neighbor relations among data items. We have found that this mapping on each PCA direction follows a regular pattern and can be fitted well by an S-curve (sigmoid) function; with more parameters, a Fourier function also fits the data well. With the sigmoid and Fourier fits, we propose two binary hashing methods. Experiments show that our methods are efficient and outperform state-of-the-art methods.
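The underlying fact, that any one-dimensional distribution can be mapped to a uniform one without disturbing neighbor order, is the empirical-CDF (rank) transform. A minimal sketch of that mapping (the sigmoid/Fourier fitting itself is not reproduced here, and the function name is an assumption):

```python
import numpy as np

def rank_to_uniform(x):
    """Map 1-D samples to [0, 1] by their empirical CDF (rank transform).
    The mapping is monotone, so the local neighbor order of the samples
    is preserved, which is the property Fitted Spectral Hashing relies on."""
    order = np.argsort(x)
    u = np.empty(len(x), dtype=float)
    u[order] = np.arange(len(x)) / (len(x) - 1)
    return u
```

Fitted Spectral Hashing then approximates this empirical mapping on each PCA direction with a parametric sigmoid or Fourier fit, so it can be applied to unseen queries.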
Citations: 1