Video-mediated group communication is quickly moving from the office to the home, where network conditions may fluctuate. If we are to provide a software component that can monitor the Quality of Experience (QoE) in real time, we have to carry out extensive experiments under varying (but controllable) conditions. Unfortunately, no available tools provide the required fine-grained level of control. This paper reports on our efforts implementing such a test bed. The test bed gives the experiment conductor full control over the complete media pipeline and the ability to modify network and media conditions in real time. Additionally, it provides facilities for easily developing experiments with custom layouts, task integration, and assessment of subjective ratings through questionnaires. We have already used the test bed in a number of evaluations, reported in this paper to discuss the benefits and drawbacks of our solution. The test bed has proven to be a flexible and effective canvas for better understanding QoE in video-mediated group communication.
{"title":"A Quality of Experience Testbed for Video-Mediated Group Communication","authors":"Marwin Schmitt, S. Gunkel, Pablo César","doi":"10.1109/ISM.2013.102","DOIUrl":"https://doi.org/10.1109/ISM.2013.102","url":null,"abstract":"Video-Mediated group communication is quickly moving from the office to the home, where network conditions might fluctuate. If we are to provide a software component that can, in real-time, monitor the Quality of Experience (QoE), we would have to carry out extensive experiments under different varying (but controllable) conditions. Unfortunately, there are no tools available that provide us the required fined-grained level of control. This paper reports on our efforts implementing such a test bed. The test bed provides the experiment conductor full control over the complete media pipeline, and the possibility of modifying in real-time network and media conditions. Additionally, it has facilities to easily develop an experiment with custom layouts, task integration, and assessment of subjective ratings through questionnaires. We have already used the test bed in a number of evaluations, reported in this paper for discussing the benefits and drawbacks of our solution. The test bed have been proven to be a flexible and effective canvas for better understanding QoE on video-mediated group communication.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"19 1","pages":"514-515"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90243245","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Elisardo González-Agulla, J. Alba-Castro, Hector Canto, Vicente Goyanes
This paper describes a fully automated Real-Time Lecturer-Tracking (RTLT) module and its seamless integration into a Matterhorn-based Lecture Capturing System (LCS). The main purpose of the RTLT module is to obtain a lecturer's portrait image for creating an integrated slides-plus-lecturer single stream, ready to distribute and consume on portable devices, where the displayed content must be optimized. The module robustly tracks any number of presenters in real time using a set of visual cues and delivers frame-rate metadata to plug into a Virtual Cinematographer module. The resulting GaliTracker RTLT module can broadcast live in conjunction with the LCS, GaliCaster, or run off-line as a video-production engine inserted into the Matterhorn workflow.
{"title":"GaliTracker: Real-Time Lecturer-Tracking for Lecture Capturing","authors":"Elisardo González-Agulla, J. Alba-Castro, Hector Canto, Vicente Goyanes","doi":"10.1109/ISM.2013.89","DOIUrl":"https://doi.org/10.1109/ISM.2013.89","url":null,"abstract":"This paper describes a fully automated Real-Time Lecturer-Tracking module (RTLT) and the seamless integration into a Matter horn-based Lecture Capturing System (LCS). The main purpose of the RTLT module is obtaining a lecturer's portrait image for creating an integrated slides lecturer single-stream ready to distribute and consume in portable devices, where displayed contents must be optimized. The module robustly tracks any number of presenters in real-time using a set of visual cues and delivers frame-rate metadata to plug into a Virtual Cinematographer module. The so-called Gal tracker RTLT module allows broadcasting live in conjunction with the LCS, Gal caster, or processing off-line as a video-production engine inserted into the Matter horn workflow.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"128 1","pages":"462-467"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88706063","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The term "shitsukan" refers to the perception of materials and surface qualities of natural and manmade objects. The main goal of our study is to extend the current content-based image retrieval (CBIR) systems to develop a shitsukan-based image retrieval (SBIR) system. This paper focuses on Japanese onomatopoeias as a feature for SBIR and verifies their suitability based on psychophysical experiments. In this study, we conducted two different experiments. In the first experiment, subjects assigned suitable onomatopoeias to 50 test images. In the second experiment, they selected perceptually similar images for each test image from the other 49 test images. Then, we investigated the relationship between the assigned onomatopoeias and the selected similar images. The results indicate that perceptually similar images were assigned to the same onomatopoeias with correlation, and that onomatopoeias were effective as a feature for SBIR.
{"title":"Investigation of Japanese Onomatopoeias as Features for SHITSUKAN-Based Image Retrieval","authors":"Yuxing Wu, K. Hirai, T. Horiuchi","doi":"10.1109/ISM.2013.75","DOIUrl":"https://doi.org/10.1109/ISM.2013.75","url":null,"abstract":"The term \"shitsukan\" refers to the perception of materials and surface qualities of natural and manmade objects. The main goal of our study is to extend the current content-based image retrieval (CBIR) systems to develop a shitsukan-based image retrieval (SBIR) system. This paper focuses on Japanese onomatopoeias as a feature for SBIR and verifies their suitability based on psychophysical experiments. In this study, we conducted two different experiments. In the first experiment, subjects assigned suitable onomatopoeias to 50 test images. In the second experiment, they selected perceptually similar images for each test image from the other 49 test images. Then, we investigated the relationship between the assigned onomatopoeias and the selected similar images. The results indicate that perceptually similar images were assigned to the same onomatopoeias with correlation, and that onomatopoeias were effective as a feature for SBIR.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"50 1","pages":"399-400"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73682368","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper, we address the challenges of applying three-dimensional virtual worlds to learning. Despite numerous positive conclusions, this technology is far from becoming mainstream in education. The most common problems with applying it in everyday teaching and learning are its steep learning curve and its demand for computational and network resources. To address these problems, we developed a stream processors texture generation model for displaying educational content in 3D virtual worlds. The model conducts image-processing tasks on stream processors in order to reduce the load on the CPU. It allows designing convenient and sophisticated tools for collaborative work with graphics inside a 3D environment. Such tools simplify the use of a 3D virtual environment and therefore mitigate the steep learning curve. We present the methods for generating images based on the suggested model, and the design and implementation of a set of tools for collaborative work with 2D graphical content in the vAcademia virtual world. In addition, we evaluate the suggested model with a series of tests applied to the whole system and to specific algorithms, and we present initial results of a user evaluation.
{"title":"Stream Processors Texture Generation Model for 3D Virtual Worlds: Learning Tools in vAcademia","authors":"A. Smorkalov, Mikhail Fominykh, M. Morozov","doi":"10.1109/ISM.2013.13","DOIUrl":"https://doi.org/10.1109/ISM.2013.13","url":null,"abstract":"In this paper, we address the challenges of applying three-dimensional virtual worlds for learning. Despite the numerous positive conclusions, this technology is far from becoming mainstream in education. The most common problems with applying it in everyday teaching and learning are steep learning curve and demand for computational and network resources. In order to address these problems, we developed a stream processors texture generation model for displaying educational content in 3D virtual worlds. The model suggests conducting image-processing tasks on stream processors in order to reduce the load on CPU. It allows designing convenient and sophisticated tools for collaborative work with graphics inside a 3D environment. Such tools simplify the use of a 3D virtual environment, and therefore, improve the negative learning curve effect. We present the methods of generating images based on the suggested model, the design and implementation of a set of tools for collaborative work with 2D graphical content in vAcademia virtual world. In addition, we provide the evaluation of the suggested model based on a series of tests which we applied to the whole system and specific algorithms. We also present the initial result of user evaluation.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"94 1","pages":"17-24"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80335124","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Michael J. Henry, Shawn D. Hampton, A. Endert, Ian Roberts, D. Payne
Faceted browsing is a common technique for exploring collections whose data can be grouped into a number of pre-defined categories, most often generated from textual metadata. Historically, faceted browsing has been applied to a single data type, such as text or image data. However, typical collections contain multiple data types, such as information from web pages that contain text, images, and video. Additionally, when browsing a collection of images and videos, facets are often created from metadata, which may be incomplete, inaccurate, or missing altogether, rather than from the actual visual content of those images and videos. In this work we address these limitations by presenting MultiFacet, a faceted browsing interface that supports multiple data types. MultiFacet constructs facets for the images and videos in a collection from their visual content using computer vision techniques. These visual facets can then be browsed in conjunction with text facets within a single interface to reveal relationships and phenomena within multimedia collections. Additionally, we present a use case based on real-world data, demonstrating the utility of this approach for browsing a large multimedia collection.
{"title":"MultiFacet: A Faceted Interface for Browsing Large Multimedia Collections","authors":"Michael J. Henry, Shawn D. Hampton, A. Endert, Ian Roberts, D. Payne","doi":"10.1109/ISM.2013.66","DOIUrl":"https://doi.org/10.1109/ISM.2013.66","url":null,"abstract":"Faceted browsing is a common technique for exploring collections where the data can be grouped into a number of pre-defined categories, most often generated from textual metadata. Historically, faceted browsing has been applied to a single data type such as text or image data. However, typical collections contain multiple data types, such as information from web pages that contain text, images, and video. Additionally, when browsing a collection of images and video, facets are often created based on the metadata which may be incomplete, inaccurate, or missing altogether instead of the actual visual content contained within those images and video. In this work we address these limitations by presenting MultiFacet, a faceted browsing interface that supports multiple data types. MultiFacet constructs facets for images and video in a collection from the visual content using computer vision techniques. These visual facets can then be browsed in conjunction with text facets within a single interface to reveal relationships and phenomena within multimedia collections. Additionally, we present a use case based on real-world data, demonstrating the utility of this approach towards browsing a large multimedia data collection.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"34 1","pages":"347-350"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81190132","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The usage and number of available video conferencing (VC) applications are rising as the high-bandwidth, low-latency networks on which they depend become increasingly prevalent. Since VC applications support real-time human interaction, performance problems that impair interactivity are social issues. Currently, performance measurements cannot easily be obtained due to the proprietary nature of VC applications. However, such measurements would be useful because they enable researchers to easily evaluate the performance impact of architectural and design decisions, quantitatively compare VC applications, and determine service level agreement (SLA) compliance. In this paper, we present a tool called AvCloak that is capable of measuring several key performance metrics in proprietary VC applications: mouth-to-ear latency and jitter, capture-to-display latency and jitter, and audio-visual synchronization skew. AvCloak takes these measurements by wrapping ("cloaking") the VC application's audio/video inputs and outputs and transmitting timestamp data through them. At the sender side, AvCloak synthesizes media data encoding timestamps and feeds them to the VC application's media inputs, while at the receiver side, AvCloak decodes timestamps from the VC application's media outputs. Since AvCloak interacts with the target VC application only through its media inputs and outputs, it treats the target application as a black box and is thus applicable to arbitrary VC applications. We provide extensive analyses to measure AvCloak's overhead and show how to improve measurement accuracy using two popular VC applications: Skype and Google+ Hangouts.
{"title":"AvCloak: A Tool for Black Box Latency Measurements in Video Conferencing Applications","authors":"Andrew Kryczka, A. Arefin, K. Nahrstedt","doi":"10.1109/ISM.2013.52","DOIUrl":"https://doi.org/10.1109/ISM.2013.52","url":null,"abstract":"The usage and number of available video conferencing (VC) applications are rising as the high-bandwidth, low latency networks on which they depend become increasingly prevalent. Since VC applications support real-time human interaction, problems with performance that impair interactivity are social issues. Currently, performance measurements cannot easily be obtained due to the proprietary nature of VC applications, however, such measurements would be useful because they enable researchers to easily evaluate the performance impact of architectural and design decisions, quantitatively compare VC applications, and determine service level agreement (SLA) compliance. In this paper, we present a tool called Av Cloak that is capable of measuring several key performance metrics in proprietary VC applications: mouth-to-ear latency and jitter, capture-to-display latency and jitter, and audio-visual synchronization skew. AvCloak takes these measurements by wrapping (\"cloaking\") the VC application's audio/video inputs/outputs and transmitting timestamp data through them. At the sender side, AvCloak synthesizes media data encoding timestamps and feeds them to the VC application's media inputs, while at the receiver side, AvCloak decodes timestamps from the VC application's media outputs. Since AvCloak interacts with the target VC application only through its media inputs and outputs, it treats the target application as a black box and is thus applicable to arbitrary VC applications. We provide extensive analyses to measure AvCloak's overhead and show how to improve accuracy in measurements using two popular VC applications: Skype and Google+ Hangouts.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"69 1","pages":"271-278"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81483165","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M. Alshamrani, H. Cruickshank, Zhili Sun, Vahid Fami, B. Elmasri, Emad Danish
The unstable nature of MANETs over different types of wireless topologies and mobility models affects the Quality of Service (QoS) of real-time applications such as Voice over IP (VoIP). One of the most efficient signaling systems for VoIP applications is the Session Initiation Protocol (SIP), which is mainly used to initiate, manage, and terminate VoIP calls over different types of IP-based networks. As part of the upgrade to Next Generation Networks, MANETs will adopt IPv6 for different types of applications and devices. Therefore, SIP signaling over IPv6 MANETs needs to be investigated with respect to QoS performance metrics such as bandwidth, packet loss, delay, and jitter. In this paper, SIP signaling is evaluated for SIP-based VoIP calls using the GSM voice codec over MANETs with Static, Uniform, and Random mobility models. The evaluation considers AODV as a reactive routing protocol and OLSR as a proactive routing protocol, over both IPv4 and IPv6. The study examines call setup time, number of active calls, number of rejected calls, and call duration. The results show that, in general, IPv4 performs better across the different mobility models, while IPv6 exhibits longer delays and poor performance under Random mobility models.
{"title":"Signaling Performance for SIP over IPv6 Mobile Ad-Hoc Network (MANET)","authors":"M. Alshamrani, H. Cruickshank, Zhili Sun, Vahid Fami, B. Elmasri, Emad Danish","doi":"10.1109/ISM.2013.44","DOIUrl":"https://doi.org/10.1109/ISM.2013.44","url":null,"abstract":"The unstable nature of MANETs over different types of wireless topologies and mobility models affects the Quality of Service (QoS) for real time applications such as Voice over IP (VoIP). One of the most efficient signaling systems for VoIP applications is the Session Initiation Protocol (SIP) which is mainly used to initiate, manage, and terminate VoIP calls over different types of IP based network systems. As a part of upgrading to Next Generation Network, MANETs will be considering IPv6 for different types of applications and devices. Therefore, SIP signaling over IPv6 MANETs needs to be investigated with different QoS performance metrics such as bandwidth, packet loss, delay and jitter. In this paper, an evaluation of SIP signaling is conducted for SIP based VoIP calls using GSM voice codec system over MANETs with Static, Uniform, and Random mobility models. This evaluation considered AODV as a reactive routing protocol and OLSR as a proactive routing protocol over both IPv4 as well as IPv6. The evaluation study of SIP signaling examined call setup time, number of active calls, number of rejected calls and calls duration. The results of this study show that, in general, IPv4 has better performance over different types of mobility models, while IPv6 upholds longer delays and poor performance over Random mobility models.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"17 1","pages":"231-236"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90225919","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper, we present a histogram-based real-time object tracking system using distributed smart cameras. Each smart camera module consists of a camera and an embedded device that is capable of performing the task of object tracking entirely by itself. The module recognizes and tracks the object in real time. The processed video stream containing the marked object is then transmitted to a central server for display. The embedded device runs the novel block-division-based CAMShift algorithm proposed in this paper. We show that this technique reduces the number of computations required and is hence better suited to embedded platforms. The solution is implemented using a central server and multiple camera modules with non-overlapping fields of view in indoor settings. We validate the improvement in performance by comparing our experimental results with existing solutions.
{"title":"Block Division Based CAMShift Algorithm for Real-Time Object Tracking Using Distributed Smart Cameras","authors":"Manjunath Kulkarni, Paras Wadekar, Haresh Dagale","doi":"10.1109/ISM.2013.56","DOIUrl":"https://doi.org/10.1109/ISM.2013.56","url":null,"abstract":"In this paper, we present a histogram based real-time object tracking system using distributed smart cameras. Each such smart camera module consists of a camera and an embedded device that is capable of performing the task of object tracking entirely by itself. The module recognizes and tracks the object in real time. The processed video stream containing the marked object is then transmitted to a central server for display. The embedded device runs a novel block division based CAMShift algorithm proposed in this paper. We show that this technique reduces the number of computations required and hence is more suitable for embedded platforms. The solution is implemented using a central server and multiple camera modules with non-overlapping fields of view in indoor settings. We validate the improvement in the performance by comparing the experimental results with existing solutions.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"6 1","pages":"292-296"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80080498","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper presents a fast similarity retrieval method for vector images. To reduce the computational cost of similarity matching, the proposed method uses pre-calculated similarity matching results, obtained in advance by matching DB images against previously selected images called representative queries. At runtime, the proposed method matches only the actual query (the user-input query) against the representative queries. By comparing these similarities with the precalculated ones, the proposed method quickly estimates the actual similarities of the DB images to the actual query. Experimental results show that the proposed method greatly reduces retrieval time without much deterioration of retrieval accuracy.
{"title":"Fast Similarity Retrieval of Vector Images Using Representative Queries","authors":"Takahiro Hayashi, A. Sato","doi":"10.1109/ISM.2013.95","DOIUrl":"https://doi.org/10.1109/ISM.2013.95","url":null,"abstract":"This paper presents a fast similarity retrieval method for vector images. To reduce the computational cost of similarity matching, the proposed method uses pre-calculation results of similarity matching, which are obtained in advance by matching DB images with previously selected images called representative queries. At runtime the proposed method just matches the actual query (the user-inputted query) and the representative queries. Comparing the similarities with the precalculated similarities, the proposed method quickly estimates the actual similarities of DB images to the actual query. Experimental results have shown that the retrieval time is greatly reduced by the proposed method without much deterioration of retrieval accuracy.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"119 1","pages":"498-499"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77484254","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A. Smailagic, D. Siewiorek, A. Rudnicky, Sandeep Nallan Chakravarthula, Anshuman Kar, Nivedita Jagdale, Saksham Gautam, Rohit Vijayaraghavan, S. Jagtap
The paper presents an audio-based emotion recognition system that is able to classify emotions as anger, fear, happiness, neutrality, sadness, or disgust in real time. We use a virtual coach as an application example of how emotion recognition can be used to modulate an intelligent system's behavior. We introduce a novel minimum-error feature removal mechanism to reduce bandwidth and increase the accuracy of our emotion recognition system. A two-stage hierarchical classification approach is used along with a One-Against-All (OAA) framework. We obtained an average accuracy of 82.07% using the OAA approach, and 87.70% with the two-stage hierarchical approach, by pruning the feature set and using Support Vector Machines (SVMs) for classification.
{"title":"Emotion Recognition Modulating the Behavior of Intelligent Systems","authors":"A. Smailagic, D. Siewiorek, A. Rudnicky, Sandeep Nallan Chakravarthula, Anshuman Kar, Nivedita Jagdale, Saksham Gautam, Rohit Vijayaraghavan, S. Jagtap","doi":"10.1109/ISM.2013.72","DOIUrl":"https://doi.org/10.1109/ISM.2013.72","url":null,"abstract":"The paper presents an audio-based emotion recognition system that is able to classify emotions as anger, fear, happy, neutral, sadness or disgust in real time. We use the virtual coach as an application example of how emotion recognition can be used to modulate intelligent systems' behavior. A novel minimum-error feature removal mechanism to reduce bandwidth and increase accuracy of our emotion recognition system has been introduced. A two-stage hierarchical classification approach along with a One-Against-All (OAA) framework are used. We obtained an average accuracy of 82.07% using the OAA approach, and 87.70% with a two-stage hierarchical approach, by pruning the feature set and using Support Vector Machines (SVMs) for classification.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"8 1","pages":"378-383"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88586954","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}