
Ninth IEEE International Symposium on Multimedia (ISM 2007): Latest Papers

Layered Clustering for Solar Powered Wireless Visual Sensor Networks
Pub Date : 2007-12-10 DOI: 10.1109/ISM.2007.35
Xiaoming Fan, W. Shaw, I. Lee
Visual-based wireless sensor networks have been deployed in several fields, such as environmental monitoring, military applications, and robotics. Because node specifications are limited, bandwidth and energy become critical issues for sensor nodes. In this paper, we employ a solar cell recharging model and a layered clustering model to cope with restricted energy consumption while taking visual quality into account. The system lifetime can be prolonged by a rechargeable solar cell that is recharged by a solar panel in the daytime. In addition, we analyze simulation results for energy consumption and total transmitted packets while varying the aggregation rate and gate energy (GE). As the aggregation rate decreases, the cluster head in the inner layer can support more visual nodes and reserve more bandwidth. A lower GE can reduce packet loss during the system charging process. The analysis and experimental results obtained in this paper show that, by combining layered clustering and solar recharging, the performance of a wireless visual sensor network can be enhanced under the constraints of restricted node capacity and video distortion.
Citations: 12
Moving Region Detection by Transportation Problem Solving
Pub Date : 2007-12-10 DOI: 10.1109/ISM.2007.28
T. Yokoyama, S. Furukawa, Toshinori Watanabe
In this paper, we propose a novel moving region detection method from the viewpoint of solving the transportation problem. This method extracts the relations between regions as a solution to the transformation problem between pixels belonging to adjacent frames. Moving regions are detected by utilizing the properties of these relations. This method does not require any models such as prior knowledge or particular assumptions about moving objects or backgrounds in a video. Since the method adaptively detects moving regions from input frame data, it can deal with the fluctuations of moving objects or backgrounds. We demonstrate the effectiveness of the proposed method through several experiments conducted using actual videos.
Citations: 2
Feature-Based Full-Frame Image Stabilization
Pub Date : 2007-12-10 DOI: 10.1109/ISM.2007.44
Chih-Yuan Chung, Homer H. Chen
Digital image stabilization usually discards boundary pixels and outputs a smaller video. In this paper, we present a new digital image stabilization algorithm that preserves the frame size of output video by pixel filling. The proposed algorithm eliminates the accumulation error by directly estimating the global motions in a transformation chain with reference to a fixed frame. A feature matching method is adopted to save the computational cost of the global motion estimation and to handle large motions. The experimental results show that the proposed algorithm produces stabilized full-frame video sequences with better frame alignment.
Citations: 8
Spatial-Temporal Error Detection Scheme for Video Transmission over Noisy Channels
Pub Date : 2007-12-10 DOI: 10.1109/ISM.2007.40
Guan-Lin Wu, Shao-Yi Chien
Error detection plays an important role in an error-robust video decoder. In this paper, a spatial-temporal error detection scheme for a video decoder is proposed. By considering the inherent spatial and temporal similarities in video sequences, the visually corrected macroblocks in the decoded frames are detected by employing a set of error detection procedures, where one cross-boundary similarity index and one cross-frame similarity index are defined for spatial and temporal error detection, respectively. An adaptive threshold scheme is also proposed to make the error detection method suitable for different video sequences. After integration with an H.264 decoder that uses error concealment techniques, a video quality improvement of 0.5-2.4 dB in PSNR is achieved. This method can also be integrated with other video codecs to improve the decoded video quality over noisy channels.
Citations: 8
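To put the 0.5-2.4 dB PSNR gain quoted in the Wu and Chien abstract above in context, here is a minimal sketch of how PSNR is conventionally computed for 8-bit samples. The function name `psnr` and the flat-list pixel representation are assumptions made for illustration; they do not come from the paper.

```python
import math


def psnr(reference, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    `peak` is the maximum sample value (255 for 8-bit video); hypothetical helper.
    """
    if len(reference) != len(decoded) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error between the reference and decoded samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals have unbounded PSNR
    return 10 * math.log10(peak ** 2 / mse)
```

Because PSNR is logarithmic in the mean squared error, a gain of 0.5 dB corresponds to roughly an 11% reduction in MSE and a gain of 2.4 dB to roughly a 42% reduction.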
Magic Mirror
Pub Date : 2007-12-10 DOI: 10.1109/ISM.2007.11
Jun-Ren Ding, Chien-Lin Huang, Ji-Kun Lin, J. Yang, Chung-Hsien Wu
This investigation describes a novel design and implementation of an interactive multimedia mirror system, called "Magic Mirror." The Magic Mirror can be easily implemented on existing personal computers or hand-held devices with normal peripherals and regular reflective glass by integrating image/speech processing, Internet connectivity, and 3D and multimedia software. The integrated Magic Mirror, which includes speech recognition, speech synthesis, face detection/modification/recognition, 3D virtual genius, hidden LCD mirror, and camera, performs simple syndication to capture information about peripherals and network connections. The user can easily activate personal multimedia services using verbal commands. The Magic Mirror can function like a good friend who listens to the user's questions and automatically responds to these requests, providing relaxation and consolation. Moreover, the Magic Mirror can detect a user's feelings based on speech and image recognition features and select appropriate music and speech to alter the user's mood.
Citations: 41
The Design of a Multi-party VoIP Conferencing System over the Internet
Pub Date : 2007-12-10 DOI: 10.1109/ISM.2007.48
B. Sat, Zixia Huang, B. Wah
In this paper, we present the design of a VoIP conferencing system that enables voice communication among multiple users over the Internet. After studying the conversational dynamics in multi-party conferencing, we identify user-observable metrics that affect the perception of conversational quality and their trade-offs. Based on these dynamics and on the behavior of delays, jitter, and losses in Internet traces collected on PlanetLab, we design the transmission topology and the schemes for loss concealment and play-out scheduling. Last, we compare the performance of our system and Skype (version 3.5.0.214) using repeatable experiments that simulate human participants and network conditions in a multi-party conferencing scenario.
Citations: 27
Motion Retrieval Based on Energy Morphing
Pub Date : 2007-12-10 DOI: 10.1109/ISM.2007.15
G. Tam, Qingzheng Zheng, M. Corbyn, Rynson W. H. Lau
Matching and retrieval of motion sequences has become an important research area in recent years, due to the increasing availability and popularity of motion capture data. The main challenge in matching two motion sequences is the diversity of the captured motions, including variable length, local shifting, and local and global scaling. Most existing methods employ Dynamic Time Warping (DTW) or Uniform Scaling to handle these problems. In this paper, we propose a novel content-based method for matching human motion capture data. We convert the matching problem for motion capture data into a transportation problem. To solve this problem efficiently, we employ Earth Mover's Distance (EMD) as the matching framework. To penalize any stray matching, we provide a ground distance that works similarly to the Sakoe-Chiba band of DTW. The empirical results obtained are encouraging.
Citations: 8
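The abstract above refers to the Sakoe-Chiba band of Dynamic Time Warping, which its EMD ground distance is said to resemble. As a rough illustration of that baseline only (this is not the paper's EMD method), here is a minimal banded DTW over 1-D sequences. The function name `dtw_banded`, the absolute-difference local cost, and the automatic widening of the band for sequences of unequal length are all assumptions made for this sketch.

```python
import math


def dtw_banded(a, b, band):
    """DTW cost between 1-D sequences a and b, restricted to |i - j| <= band.

    Hypothetical helper: cells outside the Sakoe-Chiba band stay at infinity,
    so a warping path can never stray far from the diagonal.
    """
    n, m = len(a), len(b)
    # The band must be at least |n - m| wide, or no path can reach the end cell.
    band = max(band, abs(n - m))
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        lo = max(1, i - band)
        hi = min(m, i + band)
        for j in range(lo, hi + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Standard DTW recurrence: extend the cheapest of the three predecessors.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

With `band=0` the alignment is forced onto the diagonal, while a wider band tolerates local shifting: for `[0, 1, 0, 0]` against `[0, 0, 1, 0]`, the cost falls from 2 at `band=0` to 0 at `band=1`.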
Implementation and Evaluation of Late Data Choice for TCP in Linux
Pub Date : 2007-12-10 DOI: 10.1109/ISM.2007.18
E. Birkedal, C. Griwodz, P. Halvorsen
Real-time delivery of time-dependent data over the Internet is challenging. UDP has often been used to transport data in a timely manner, but its lack of congestion control is often criticized. This criticism is a reason that the vast majority of applications today use TCP. The downside of this is that TCP has problems with the timely delivery of data. A transport protocol that adds congestion control to an otherwise UDP-like behaviour is DCCP. For this protocol, late data choice (LDC) [8] has been proposed to allow adaptive applications control over data packets up to the actual transmission time. We find, however, that application developers appreciate other TCP features as well, such as its reliability. We have therefore implemented and tested the LDC ideas for TCP. It allows the application to modify or drop packets that have been handed to TCP until they are actually transmitted to the network. This is achieved with a shared packet ring and indexes to hold the current status. Our experiments show that we can send more useful data with LDC than without in a streaming scenario. We can therefore claim that we achieve a better utilization of the throughput, giving us a higher goodput with LDC than without.
Citations: 3
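The abstract above says that late data choice is realized with "a shared packet ring and indexes to hold the current status". The following is a user-space toy that illustrates the general idea under stated assumptions; it is not the kernel implementation evaluated in the paper. The class `PacketRing`, its method names, and the two-index layout are hypothetical.

```python
_DROPPED = object()  # marker for a slot the application has withdrawn


class PacketRing:
    """Toy fixed-size ring shared by an application and a sender (hypothetical)."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.slots = [None] * capacity
        self.next_to_send = 0  # sequence number of the oldest unsent packet
        self.next_free = 0     # sequence number the next enqueued packet gets

    def _pending(self, seq):
        # A packet can still be changed only while it sits between the two indexes.
        return self.next_to_send <= seq < self.next_free

    def enqueue(self, payload):
        """Hand a packet to the ring; returns its sequence number, or None if full."""
        if self.next_free - self.next_to_send >= self.capacity:
            return None
        seq = self.next_free
        self.slots[seq % self.capacity] = payload
        self.next_free += 1
        return seq

    def replace(self, seq, payload):
        """Swap in newer data for an unsent, undropped packet; False if too late."""
        if not self._pending(seq) or self.slots[seq % self.capacity] is _DROPPED:
            return False
        self.slots[seq % self.capacity] = payload
        return True

    def drop(self, seq):
        """Withdraw an unsent packet; False if it has already been transmitted."""
        if not self._pending(seq):
            return False
        self.slots[seq % self.capacity] = _DROPPED
        return True

    def transmit_next(self):
        """Sender side: return the next live payload, skipping dropped slots."""
        while self.next_to_send < self.next_free:
            payload = self.slots[self.next_to_send % self.capacity]
            self.next_to_send += 1
            if payload is not _DROPPED:
                return payload
        return None
```

In a streaming scenario, an application could enqueue several frames, then call `drop` on a frame that has become stale or `replace` to substitute a lower-quality version, and the sender would pick up whatever is current at the moment `transmit_next` runs. Once a packet has been transmitted, both calls return `False`.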
Multi-stream Asynchrony Modeling for Audio-Visual Speech Recognition
Pub Date : 2007-12-10 DOI: 10.1109/ISM.2007.21
Guoyun Lv, D. Jiang, R. Zhao, Yunshu Hou
In this paper, two multi-stream asynchrony Dynamic Bayesian Network models (the MS-ADBN model and the MM-ADBN model) are proposed for audio-visual speech recognition (AVSR). The proposed models, which have different topology structures, loosen the asynchrony of the audio and visual streams to the word level. In the MS-ADBN model, in both the audio stream and the visual stream, each word is composed of its corresponding phones, and each phone is associated with an observation vector. The MM-ADBN model augments the MS-ADBN model: a level of hidden nodes, the state level, is added between the phone level and the observation node level to describe the dynamic process of phones. Essentially, the MS-ADBN model is a word model, while the MM-ADBN model is a phone model. Speech recognition experiments are carried out on a digit audio-visual (A-V) database as well as on a continuous A-V database. The results demonstrate that describing the asynchrony between the audio and visual streams is important for an AVSR system, and that the MM-ADBN model performs best on the task of continuous A-V speech recognition.
Citations: 11
MuMiVA: A Multimedia Delivery Platform Using Format-Agnostic, XML-Driven Content Adaptation
Pub Date : 2007-12-10 DOI: 10.1109/ISM.2007.13
D. V. Deursen, S. D. Bruyne, W. V. Lancker, W. D. Neve, D. D. Schrijver, H. Hellwagner, R. Walle
Due to the increasing heterogeneity in the current multimedia landscape, the delivery of multimedia content has become an important issue today. This heterogeneity is not only reflected by a plethora of different usage environments, but also by the presence of multiple (scalable) coding formats. Therefore, format-independent adaptation engines have to be used within a multimedia delivery platform, which are able to adapt the multimedia content according to a certain usage environment, independent of the underlying coding format of the content. By relying on automatically created textual descriptions of the high-level syntax of binary media resources, a format-independent adaptation engine can be built. MPEG-21 generic bitstream syntax schema (gBS schema) is a tool that is part of the MPEG-21 multimedia framework. It enables the use of generic bitstream syntax descriptions (gBSDs), i.e., textual descriptions in XML, to steer the adaptation of a binary media resource, using format-independent adaptation logic. In this paper, we address the design and performance evaluation of a multimedia delivery platform that relies on gBS schema-driven adaptation engines. Our platform is called MuMiVA; it is a fully integrated, extensible platform for multimedia delivery in heterogeneous usage environments, using streaming technologies. To demonstrate the flexibility of our multimedia delivery platform, we discuss the functioning of two different applications (i.e., exploitation of temporal scalability and shot selection) applied to two different coding formats (i.e., MPEG-4 visual and H.264/AVC).
Citations: 8