
Latest publications: 2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)

RSVP: Ridiculously Scalable Video Playback on Clustered Tiled Displays
J. Kimball, K. Ponto, T. Wypych, F. Kuester
This paper introduces a distributed approach for playback of video content at resolutions of 4K (digital cinema) and well beyond. This approach is designed for scalable, high-resolution, multi-tile display environments, which are controlled by a cluster of machines, with each node driving one or multiple displays. A preparatory tiling pass separates the original video into a user-definable n-by-m array of equally sized video tiles, each of which is individually compressed. By only reading and rendering the video tiles that correspond to a given node's viewpoint, the computation power required for video playback can be distributed over multiple machines, resulting in a highly scalable video playback system. This approach exploits the computational parallelism of the display cluster while only using minimal network resources in order to maintain software-level synchronization of the video playback. While network constraints limit the maximum resolution of other high-resolution video playback approaches, this algorithm is able to scale to video at resolutions of tens of millions of pixels and beyond. Furthermore, the system allows for flexible control of the video characteristics, allowing content to be interactively reorganized while maintaining smooth playback. This approach scales well for concurrent playback of multiple videos and does not require any specialized video decoding hardware to achieve ultra-high resolution video playback.
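The core of the tiling idea is that each render node decodes only the tile streams overlapping its own viewport. A minimal sketch of that selection step, assuming a simple pixel-coordinate viewport and a uniform n-by-m tile grid (function and parameter names are illustrative, not the authors' implementation):

```python
# Hypothetical sketch of per-node tile selection. The grid size, viewport
# representation, and names are assumptions for illustration only.

def visible_tiles(viewport, frame_w, frame_h, n_cols, n_rows):
    """Return (col, row) indices of tiles intersecting the node's viewport.

    viewport is (x, y, w, h) in frame pixel coordinates.
    """
    tile_w = frame_w / n_cols
    tile_h = frame_h / n_rows
    x, y, w, h = viewport
    first_col = max(0, int(x // tile_w))
    last_col = min(n_cols - 1, int((x + w - 1) // tile_w))
    first_row = max(0, int(y // tile_h))
    last_row = min(n_rows - 1, int((y + h - 1) // tile_h))
    return [(c, r)
            for r in range(first_row, last_row + 1)
            for c in range(first_col, last_col + 1)]

# A node driving the top-left quadrant of a 4K frame, with the video split
# into a 4x4 tile grid, only needs to decode 4 of the 16 tile streams.
tiles = visible_tiles((0, 0, 1920, 1080), 3840, 2160, 4, 4)
```

Because each node runs this independently, adding display nodes adds decode capacity in proportion, which is what makes the playback scalable.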
DOI: 10.1109/ISM.2013.12 | Published 2013-12-09 | pp. 9-16
Citations: 2
Action Recognition Using Effective Mask Patterns Selected from a Classificational Viewpoint
Takumi Hayashi, K. Hotta
This paper presents action recognition using effective mask patterns selected from a classificational viewpoint. The cubic higher-order local auto-correlation (CHLAC) feature is robust to position changes of human actions in a video, and its effectiveness for action recognition has already been shown. However, the mask patterns for extracting CHLAC features are fixed. In other words, the mask patterns are independent of action classes, and the features extracted from those mask patterns are not specialized for each action. Thus, we propose the automatic creation of specialized mask patterns for each action. Our approach consists of two steps. First, mask patterns are created by clustering local spatio-temporal regions in each action. However, this step also produces unnecessary mask patterns, such as duplicate patterns and patterns consisting entirely of 0s or 1s. We therefore select the effective mask patterns for classification using feature selection techniques. Through experiments on the KTH dataset, the effectiveness of our method is shown.
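The clean-up step between the two stages — discarding duplicate masks and all-0/all-1 masks, which carry no discriminative information — can be sketched as follows (mask shapes and values are illustrative assumptions, not the paper's data):

```python
# Hedged sketch of the mask-pattern filtering described above. Masks are
# represented as flattened tuples of 0/1 values for illustration.

def filter_mask_patterns(masks):
    """Drop duplicate masks and masks that are all 0s or all 1s."""
    seen = set()
    kept = []
    for m in masks:
        if all(v == 0 for v in m) or all(v == 1 for v in m):
            continue            # uninformative pattern
        if m in seen:
            continue            # duplicate pattern
        seen.add(m)
        kept.append(m)
    return kept

candidates = [(0, 0, 0), (1, 1, 1), (1, 0, 1), (1, 0, 1), (0, 1, 0)]
kept = filter_mask_patterns(candidates)
```

The surviving masks would then be ranked by a feature selection criterion, as the abstract describes, to keep only the patterns that help classification.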
DOI: 10.1109/ISM.2013.31 | Published 2013-12-09 | pp. 140-146
Citations: 0
Relational Social Image Search
P. Aarabi
This paper proposes a method of finding the relationship between objects based on their spatial arrangement in a set of tagged images. Based on the relative coordinates of each object tag, we compute a joint Relativity between each tag pair. We then propose an efficient image search method using the joint Relativity graphs and provide simple examples where the proposed Relational Social Image (RSI) search produces more relevant and intuitive results than simple search.
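The abstract does not define the joint "Relativity" measure, so the following is only a hedged illustration of the general idea: aggregating the relative positions of a tag pair across all images in which both tags appear (all names and the mean-offset statistic are assumptions):

```python
# Illustrative sketch: accumulate the mean relative offset between each pair
# of tags over a set of tagged images. This stands in for the paper's joint
# "Relativity", whose exact definition is not given in the abstract.

from collections import defaultdict

def pairwise_offsets(images):
    """images: list of dicts mapping tag -> (x, y) coordinates."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for tags in images:
        names = sorted(tags)
        for i, a in enumerate(names):
            for b in names[i + 1:]:
                dx = tags[b][0] - tags[a][0]
                dy = tags[b][1] - tags[a][1]
                acc = sums[(a, b)]
                acc[0] += dx
                acc[1] += dy
                acc[2] += 1
    return {pair: (sx / n, sy / n) for pair, (sx, sy, n) in sums.items()}

images = [{"rider": (10, 5), "horse": (10, 20)},
          {"rider": (50, 40), "horse": (50, 60)}]
offsets = pairwise_offsets(images)
# offsets[("horse", "rider")] is the mean position of "rider" relative
# to "horse" across both images.
```

A consistent offset (here, "rider" reliably above "horse") is the kind of spatial relationship such a graph could exploit at search time.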
DOI: 10.1109/ISM.2013.105 | Published 2013-12-09 | pp. 520-521
Citations: 4
Biological Image Temporal Stage Classification via Multi-layer Model Collaboration
Tao Meng, M. Shyu
In current biological image analysis, temporal stage information, such as the developmental stage in Drosophila in situ hybridization images, is important for biological knowledge discovery. Such information is usually gained through visual inspection by experts. However, as high-throughput imaging technology becomes increasingly popular, the manual effort required to annotate, label, and organize the images for efficient image retrieval has increased tremendously, making manual data processing infeasible. In this paper, a novel multi-layer classification framework is proposed to discover the temporal information of biological images automatically. Rather than solving the problem directly, the proposed framework uses the idea of "divide and conquer" to create middle-level classes, which are relatively easy to annotate, and to train the proposed subspace-based classifiers on the subsets of data belonging to these categories. Next, the results from these classifiers are integrated to improve the final classification performance. In order to appropriately integrate the outputs from different classifiers, a multi-class closed-form quadratic cost function is defined as the optimization target, and the parameters are estimated using the gradient descent algorithm. Our proposed framework is tested on three biological image data sets and compared with other state-of-the-art algorithms. The experimental results demonstrate that the proposed middle-level classes and the proper integration of the results from the corresponding classifiers are promising for mining the temporal stage information of biological images.
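The integration step — learning weights for the classifier outputs by gradient descent on a quadratic cost — can be sketched as below. The specific cost, learning rate, and names are illustrative assumptions, not the authors' exact formulation:

```python
# Hedged sketch: fuse several classifiers' score arrays with weights learned
# by gradient descent on a squared-error (quadratic) cost.

import numpy as np

def learn_fusion_weights(scores, targets, lr=0.1, steps=500):
    """scores: (n_classifiers, n_samples, n_classes) classifier outputs.
    targets: (n_samples, n_classes) one-hot labels.
    Minimizes ||sum_k w_k * scores_k - targets||^2 over the weights w."""
    k = scores.shape[0]
    w = np.full(k, 1.0 / k)
    for _ in range(steps):
        fused = np.tensordot(w, scores, axes=1)   # (n_samples, n_classes)
        resid = fused - targets
        grad = np.array([2.0 * np.sum(resid * scores[i]) for i in range(k)])
        w -= lr * grad / targets.size
    return w

# Toy example: classifier 0 matches the labels exactly, classifier 1 is
# uninformative, so the learned weights should favor classifier 0.
t = np.array([[1.0, 0.0], [0.0, 1.0]])
s = np.stack([t, np.full((2, 2), 0.5)])
w = learn_fusion_weights(s, t)
```

Because the cost is quadratic in the weights, plain gradient descent with a small step converges to the unique optimum, which is presumably why a closed-form quadratic cost was chosen.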
DOI: 10.1109/ISM.2013.15 | Published 2013-12-09 | pp. 30-37
Citations: 9
Mobile Scene Flow Synthesis
V. Ly, C. Kambhamettu
Scene flow is the motion of the 3D world; it is used in obstacle avoidance, slow-motion interpolation, surveillance, the study of human behavior, and much more. A mobile implementation of scene flow can greatly increase the flexibility of scene flow applications. Furthermore, combining multiple scene flows into one panoramic scene flow can aid in tasks such as covering dead spots in surveillance, studying human motion from multiple views, or simply obtaining a larger motion view of a scene. In this paper, a robust algorithm for building panoramic scene flow from a mobile device is described. Since scene flow is estimated from a mobile device, observer motion is compensated for using least-squares fitting over the entire scene. Furthermore, noise is reduced and outliers are eliminated from the 3D motion field using motion model fitting. The results demonstrate the effectiveness of the suggested algorithm for constructing a scene flow panorama from moving sources.
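The observer-motion compensation step can be illustrated with a deliberately simplified motion model. The real system presumably fits a fuller model (e.g., rotation plus translation); this translation-only variant, where the least-squares fit reduces to the mean flow vector, is an assumption made for the sketch:

```python
# Hedged sketch: fit a single global translation to the 3D motion field by
# least squares and subtract it, leaving only independent object motion.
# Translation-only compensation is an illustrative simplification.

import numpy as np

def compensate_observer_motion(flow):
    """flow: (n, 3) array of 3D motion vectors, one per scene point.
    The least-squares translation fit is the mean flow vector."""
    ego = flow.mean(axis=0)
    return flow - ego, ego

# Camera drifting by roughly (1, 0, 0.5); the last point carries extra
# independent motion that survives compensation as a large residual.
flow = np.array([[1.0, 0.0, 0.5],
                 [1.0, 0.0, 0.5],
                 [1.0, 0.0, 0.5],
                 [3.0, 0.0, 0.5]])
residual, ego = compensate_observer_motion(flow)
```

Points with large residuals after compensation are exactly the outliers the abstract's motion-model fitting would then prune or treat as moving objects.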
DOI: 10.1109/ISM.2013.85 | Published 2013-12-09 | pp. 439-444
Citations: 0
Similarity-Based Browsing of Image Search Results
David Edmundson, G. Schaefer, M. E. Celebi
In this demo paper, we present an image browsing system that is suitable for online visualisation and browsing of search results from Google Images. Our approach is based on the Huffman tables available in the JPEG headers of Google Images thumbnails. Since these are adapted to the images, we employ them directly as image features. We then generate a visualisation of the search results by projection onto a 2-dimensional visualisation space based on principal component analysis derived from the Huffman entries. Images are dynamically placed into a grid structure and organised in a tree-like hierarchy for visual browsing. Since we utilise information only from the JPEG header, the requirements in terms of bandwidth are low, while no explicit feature calculation needs to be performed, thus allowing for interactive browsing of online image search results.
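The projection onto the 2-D visualisation space is standard PCA over the per-image feature vectors. The sketch below assumes the feature vectors are already extracted (parsing the Huffman tables out of real JPEG headers is omitted, and the sample vectors are illustrative):

```python
# Hedged sketch of the PCA layout step: project (n_images, n_dims) feature
# vectors (standing in for Huffman-table entries read from JPEG headers)
# onto their top two principal components.

import numpy as np

def pca_2d(features):
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

# Two pairs of similar "images": each pair should land close together in
# the 2-D browsing space.
feats = np.array([[10.0, 0.0, 1.0],
                  [11.0, 0.5, 1.0],
                  [30.0, 9.0, 2.0],
                  [31.0, 9.5, 2.0]])
coords = pca_2d(feats)   # (4, 2) coordinates for the visual grid
```

Since optimized Huffman tables are adapted to each image's statistics, images with similar content tend to get similar tables, which is what makes this header-only feature usable for similarity layout without decoding any pixels.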
DOI: 10.1109/ISM.2013.97 | Published 2013-12-09 | pp. 502-503
Citations: 3
Big Data and NSA Surveillance -- Survey of Technology and Legal Issues
Chanmin Park, Taehyung Wang
As data storage and computing technology advances, decisions based on real-world evidence have become possible, and a new concept called "Big Data" has emerged. The leaders in Big Data technology development are mainly the national security and medical fields, due to their urgent need for decisions based on big data. The recent revelation of collection and analysis activities by the National Security Agency (NSA), covering massive amounts of data that touch on people's privacy, triggered much controversy over concerns about infringement of privacy and violation of constitutional rights. This, in turn, became an opportunity to produce various analyses and opinions about the current level of big data technology, its future possibilities, the types of work that big data has enabled, and the technological and legal means for preventing expected abuse. The purposes of this paper are to survey materials about legal issues regarding the NSA's activity with big data, and to organize feedback collected from various parts of society.
DOI: 10.1109/ISM.2013.103 | Published 2013-12-09 | pp. 516-517
Citations: 3
Sparse Representation-Based Human Action Recognition Using an Action Region-Aware Dictionary
Hyun-seok Min, W. D. Neve, Yong Man Ro
Automatic human action recognition is a core functionality of systems for video surveillance and human-object interaction. Conventional vision-based systems for human action recognition require the use of segmentation in order to achieve an acceptable level of recognition effectiveness. However, generic techniques for automatic segmentation are not yet available. Therefore, in this paper, we propose a novel sparse representation-based method for human action recognition, taking advantage of the observation that, although the location and size of the action region in a test video clip is unknown, the construction of a dictionary can leverage information about the location and size of action regions in training video clips. That way, we are able to implicitly segment action and context information in a test video clip, thus improving the effectiveness of classification, and to develop a context-adaptive classification strategy. As shown by comparative experimental results obtained for the UCF Sports Action data set, the proposed method facilitates effective human action recognition, even when testing does not rely on explicit segmentation.
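The general sparse-representation classification pattern behind such methods is residual-based: reconstruct a test feature from each class's sub-dictionary and pick the class with the smallest reconstruction error. As a hedged sketch, the snippet below substitutes ordinary least squares for the l1-regularised sparse solver such methods actually use, and all data and names are illustrative:

```python
# Illustrative residual-based classification in the sparse-representation
# spirit. Real SRC solves an l1-regularised coding problem; plain least
# squares is used here only as a stand-in.

import numpy as np

def classify_by_residual(dictionaries, x):
    """dictionaries: {label: (n_dims, n_atoms) array of training columns}.
    Returns the label whose atoms reconstruct x with the smallest residual."""
    best_label, best_resid = None, np.inf
    for label, d in dictionaries.items():
        coef, *_ = np.linalg.lstsq(d, x, rcond=None)
        resid = np.linalg.norm(d @ coef - x)
        if resid < best_resid:
            best_label, best_resid = label, resid
    return best_label

dicts = {
    "run":  np.array([[1.0, 0.9], [0.1, 0.2], [0.0, 0.1]]),
    "walk": np.array([[0.1, 0.2], [1.0, 0.9], [0.0, 0.1]]),
}
label = classify_by_residual(dicts, np.array([1.0, 0.1, 0.0]))
```

The paper's contribution layers onto this pattern: building the dictionary so its atoms carry action-region location and size information from the training clips, which is what makes the implicit segmentation possible.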
DOI: 10.1109/ISM.2013.30 | Published 2013-12-09 | pp. 133-139
Citations: 4
QUEST: Towards a Multi-modal CBIR Framework Combining Query-by-Example, Query-by-Sketch, and Text Search
Ihab Al Kabary, Ivan Giangreco, H. Schuldt, Fabrice Matulic, M. Norrie
The enormous growth of digital image collections urgently necessitates effective, efficient, and in particular highly flexible approaches to image retrieval. Different search paradigms such as text search, query-by-example, or query-by-sketch need to be seamlessly combined and integrated to support different information needs and to allow users to start (and subsequently refine) queries with any type of object. In this paper, we present QUEST (Query by Example, Sketch and Text), a novel flexible multi-modal content-based image retrieval (CBIR) framework. QUEST seamlessly integrates and blends multiple modes of image retrieval, thereby accumulating the strengths of each individual mode. Moreover, it provides several implementations of the different query modes and allows users to select, combine and even superimpose the mode(s) most appropriate for each search task. The combination of search paradigms is itself done in a very flexible way: either sequentially, where one query mode starts with the result set of the previous one (i.e., for incrementally refining and/or extending a query), or by supporting different paradigms at the same time (e.g., creating an artificial query image by superimposing a query image with a sketch, thereby directly integrating query-by-example and query-by-sketch). We present the overall architecture of QUEST and the dynamic combination and integration of the query modes it supports. Furthermore, we provide initial evaluation results that show the effectiveness and the gain in efficiency that can be achieved with the combination of different search modes in QUEST.
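The sequential combination mode — each query paradigm consuming the result set of the previous one — composes naturally as a pipeline of filters and re-rankers. A minimal sketch, where the mode implementations, scores, and image data are all illustrative assumptions rather than QUEST's actual API:

```python
# Hedged sketch of sequential multi-modal search: each mode is a callable
# that takes and returns a list of image ids, so modes chain freely.

def sequential_search(collection, modes):
    results = list(collection)          # start from the whole collection
    for mode in modes:
        results = mode(results)
    return results

images = {
    "img1": {"tags": {"boat", "sea"}, "sketch_score": 0.9},
    "img2": {"tags": {"boat"},        "sketch_score": 0.2},
    "img3": {"tags": {"car"},         "sketch_score": 0.8},
}

# A text mode filters by keyword; a sketch mode re-ranks what remains.
text_mode = lambda ids: [i for i in ids if "boat" in images[i]["tags"]]
sketch_mode = lambda ids: sorted(
    ids, key=lambda i: images[i]["sketch_score"], reverse=True)

ranked = sequential_search(images, [text_mode, sketch_mode])
```

Treating every mode as the same list-to-list interface is one way to get the paper's "start a query with any type of object" property: any mode can come first, and any refinement order is valid.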
DOI: 10.1109/ISM.2013.84 | Published 2013-12-09 | pp. 433-438
Citations: 4
Towards Context-Aware Recommendations of Multimedia in an Ambient Intelligence Environment
Mohammed F. Alhamid, Majdi Rawashdeh, Abdulmotaleb El Saddik
Given today's mobile and smart devices, and the ability to access different multimedia contents in real-time, it is difficult for users to find the right multimedia content from such a large number of choices. Users also consume diverse multimedia based on many contexts, with different personal preferences and settings. For these reasons, there is a need to reinforce recommendation process with context-adaptive information that can be used to select the right multimedia content and deliver the recommendations in preferred mechanisms. This paper proposes a framework to establish a bridge between the multimedia content, the user and joint preferences, contextual information including the physiological parameters, and the Ambient Intelligent (AmI) environment, using multi-modal recommendation interfaces.
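The framework described above re-ranks multimedia using both stored user preferences and contextual information such as physiological parameters. A minimal sketch of that scoring idea follows; the function names, the mood-matching heuristic, and the sample data are assumptions for illustration only, not the paper's actual method.

```python
# Illustrative context-aware recommendation: base user preference is boosted
# when an item's mood matches the context inferred from ambient/physiological
# signals (e.g. a high heart rate mapped to an "excited" mood).

def recommend(items, preferences, context, top_k=1):
    """Score each item by genre preference, boosted on a context-mood match."""
    def score(item):
        base = preferences.get(item["genre"], 0.0)
        boost = 1.5 if item["mood"] == context["mood"] else 1.0
        return base * boost
    return sorted(items, key=score, reverse=True)[:top_k]

items = [
    {"title": "calm playlist", "genre": "music", "mood": "relaxed"},
    {"title": "action movie",  "genre": "film",  "mood": "excited"},
    {"title": "workout mix",   "genre": "music", "mood": "excited"},
]
preferences = {"music": 0.8, "film": 0.6}

# Context derived from sensors in the AmI environment (assumed here).
context = {"mood": "excited"}
print([i["title"] for i in recommend(items, preferences, context)])
# -> ['workout mix']
```

In this toy scoring, "workout mix" wins because it combines the user's strongest genre preference with a context match, which is the kind of joint preference-plus-context ranking the abstract argues for.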
{"title":"Towards Context-Aware Recommendations of Multimedia in an Ambient Intelligence Environment","authors":"Mohammed F. Alhamid, Majdi Rawashdeh, Abdulmotaleb El Saddik","doi":"10.1109/ISM.2013.80","DOIUrl":"https://doi.org/10.1109/ISM.2013.80","url":null,"abstract":"Given today's mobile and smart devices, and the ability to access different multimedia contents in real-time, it is difficult for users to find the right multimedia content from such a large number of choices. Users also consume diverse multimedia based on many contexts, with different personal preferences and settings. For these reasons, there is a need to reinforce recommendation process with context-adaptive information that can be used to select the right multimedia content and deliver the recommendations in preferred mechanisms. This paper proposes a framework to establish a bridge between the multimedia content, the user and joint preferences, contextual information including the physiological parameters, and the Ambient Intelligent (AmI) environment, using multi-modal recommendation interfaces.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"1 1","pages":"409-414"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85343348","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Journal
2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)