
Latest publications from the 2006 IEEE International Conference on Multimedia and Expo

Subjective Evaluations of an Experimental Gesturephone
Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262524
Mohd Nazri Ramliy, N. Arif, R. Komiya
This paper presents the findings of our subjective evaluations on the integration of gestures into telecommunication. The experimental setup for tracking and imitating human arm gestures is described. Our research investigates the possibility of transferring this often overlooked medium of daily communication to telecommunication using robotics. Based on the subjective evaluation, a maximum allowable delay for an imperceptible gesture reconstruction in the lateral setup is suggested.
Citations: 0
Motion Segmentation of 3D Video using Modified Shape Distribution
Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262929
T. Yamasaki, K. Aizawa
In this paper, temporal segmentation of 3D video based on motion analysis is presented. 3D video is a sequence of 3D models made for a real-world dynamic object. A modified shape distribution algorithm is proposed to realize stable shape feature representation. In our approach, representative points are generated by clustering vertices based on their spatial distribution, instead of randomly sampling vertices as in the original shape distribution algorithm. Motion segmentation is conducted by analyzing local minima in the degree of motion calculated in the feature vector space. The segmentation algorithm developed in this paper does not require any predefined threshold values but relies on relative relationships among local minima and local maxima of the motion; robust segmentation is therefore achieved. Experiments using 3D video of traditional dances yielded encouraging results, with average precision and recall rates of 93% and 88%, respectively.
Citations: 11
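The threshold-free idea described in the abstract above — keeping a local minimum of the motion-degree curve only when it is deep relative to the neighbouring maxima — can be sketched as follows. This is a toy illustration, not the authors' implementation; the curve values and the `rel_depth` ratio are invented.

```python
def segment_boundaries(motion, rel_depth=0.5):
    """Find candidate segment boundaries as local minima of a motion-degree
    curve. A minimum is kept only if it dips well below the highest values
    on both sides (a relative relation, no absolute threshold)."""
    boundaries = []
    for i in range(1, len(motion) - 1):
        if motion[i] < motion[i - 1] and motion[i] <= motion[i + 1]:
            left = max(motion[:i])        # highest motion degree before i
            right = max(motion[i + 1:])   # highest motion degree after i
            if motion[i] < rel_depth * min(left, right):
                boundaries.append(i)
    return boundaries

# toy motion-degree curve: two motion bursts separated by a near-pause
curve = [0.1, 0.8, 0.9, 0.7, 0.05, 0.6, 0.85, 0.75, 0.1]
print(segment_boundaries(curve))  # → [4], the near-pause between the bursts
```

A plateau of uniformly high motion produces no boundary, since no minimum falls far enough below its shoulders.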
Analysis of Multi-User Congestion Control for Video Streaming Over Wireless Networks
Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262639
Xiaoqing Zhu, B. Girod
When multiple video sources are live-encoded and transmitted over a common wireless network, each stream needs to adapt its encoding parameters to wireless channel fluctuations so as to avoid congesting the network. We present a stochastic system model for analyzing multi-user congestion control for live video coding and streaming over a wireless network. Variations in video content complexity and wireless channel conditions are modeled as independent Markov processes, which jointly determine the bottleneck queue size of each stream. Interaction among multiple users is captured by a simple model of random traffic contention. Using the model, we investigate two distributed congestion control policies: an approach based on stochastic dynamic programming (SDP) and a greedy heuristic. Compared to fixed-quality coding with no congestion control, simulation results show performance gains of 0.5-1.3 dB in average video quality for the optimized schemes.
Citations: 12
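A minimal sketch of the greedy heuristic mentioned above: a sender nudges its encoding rate down when the bottleneck queue exceeds a target and up otherwise, never exceeding the current channel rate. The three-state channel, step size, and queue target are invented values, and the sketch omits the SDP policy and the multi-user contention model entirely.

```python
import random

def greedy_policy(rate, queue, channel, q_target=4.0, step=0.2):
    """One step of a greedy congestion-control heuristic (toy parameters):
    back off when the queue exceeds the target, otherwise probe upward,
    capped at what the channel can currently drain."""
    if queue > q_target:
        return max(0.2, rate - step)
    return min(channel, rate + step)

random.seed(0)
rate, queue = 1.0, 0.0
for _ in range(50):
    channel = random.choice([0.6, 1.0, 1.4])  # fluctuating channel (toy 3 states)
    queue = max(0.0, queue + rate - channel)  # queue grows by encoded minus drained
    rate = greedy_policy(rate, queue, channel)
print(round(queue, 2), round(rate, 2))
```

Over the run the rate stays within the band the policy defines and the queue remains bounded, which is the qualitative behaviour congestion control aims for.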
A Robust Method for TV Logo Tracking in Video Streams
Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262712
Jinqiao Wang, Ling-yu Duan, Zhenglong Li, J. Liu, Hanqing Lu, Jesse S. Jin
Most broadcast stations rely on TV logos to claim video content ownership or to visually distinguish the broadcast from an interrupting commercial block. Detecting and tracking a TV logo is of interest to TV commercial skipping applications and to logo-based broadcasting surveillance (an abnormal signal is accompanied by logo absence). Pixel-wise difference computing within predetermined logo regions cannot handle semi-transparent TV logos well, owing to the blending of the logo itself with changing background images. Edge-based template matching is weak for semi-transparent logos when incomplete edges appear. In this paper we present a more robust approach to detect and track TV logos in video streams on the basis of multispectral image gradients. Instead of single-frame-based detection, our approach makes use of the temporal correlation of multiple consecutive frames. Since it is difficult to manually delineate logos of irregular shape, an adaptive threshold is applied to the gradient image in subpixel space to extract the logo mask. TV logo tracking is finally carried out by matching the masked region with a known template. An extensive comparison experiment has shown that our proposed algorithm outperforms traditional methods such as frame difference and single-frame-based edge matching. Our experimental dataset comes from part of the TRECVID2005 news corpus and from several Chinese TV channels with challenging TV logos.
Citations: 45
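The intuition of exploiting temporal correlation — a static logo's edges persist across frames while background edges move and average out — can be illustrated with a toy gradient-accumulation mask. This is a hypothetical sketch, not the paper's subpixel multispectral method; the mean-plus-std adaptive threshold and the synthetic frames are invented.

```python
import random

def logo_mask(frames):
    """Average the horizontal gradient magnitude over consecutive frames,
    then keep pixels whose mean gradient exceeds an adaptive (mean + std)
    threshold. Moving background edges average out; overlaid-logo edges
    persist and survive the threshold."""
    h, w = len(frames[0]), len(frames[0][0])
    acc = [[0.0] * (w - 1) for _ in range(h)]
    for f in frames:
        for y in range(h):
            for x in range(w - 1):
                acc[y][x] += abs(f[y][x + 1] - f[y][x])
    n = len(frames)
    vals = [v / n for row in acc for v in row]
    mean = sum(vals) / len(vals)
    std = (sum((v - mean) ** 2 for v in vals) / len(vals)) ** 0.5
    thr = mean + std  # adaptive: derived from the data, not preset
    return [[1 if acc[y][x] / n > thr else 0 for x in range(w - 1)]
            for y in range(h)]

random.seed(1)
frames = []
for _ in range(8):
    f = [[random.randint(0, 50) for _ in range(6)] for _ in range(4)]
    f[1][2] = 255  # one persistent bright "logo" pixel creates stable edges
    frames.append(f)
mask = logo_mask(frames)
print(mask[1])  # the two gradient cells flanking the logo pixel are set
```

The random background gradients average to a small value, so only the two edges adjacent to the fixed bright pixel clear the adaptive threshold.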
Identify Sports Video Shots with "Happy" or "Sad" Emotions
Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262641
Jinjun Wang, Chng Eng Siong, Changsheng Xu, Hanqing Lu, Xiaofeng Tong
Semantic video content extraction and selection are critical steps in sports video analysis and editing. The identification of video segments can proceed from various semantic perspectives, e.g. a certain event, player, or emotional state. In this paper, we examine the possibility of automatically identifying shots with "happy" or "sad" emotion in broadcast sports video. Our proposed model first performs sports highlight extraction to obtain candidate shots that possibly contain emotion information, and then classifies these shots into either "happy" or "sad" emotion groups using a hidden Markov model based method. The final experimental results are satisfactory.
Citations: 9
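A hidden Markov model based classifier of the kind described above can be sketched by scoring a shot's feature sequence under one HMM per emotion class and choosing the higher likelihood. The two toy models below (a single binary observation feature, invented probabilities) are purely illustrative, not the paper's models or features.

```python
import math

def log_forward(obs, start, trans, emit):
    """Forward algorithm: log P(obs | HMM), summing over all state paths."""
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][o]
                 for j in range(n)]
    return math.log(sum(alpha))

# toy two-state models over a binary feature; "happy" favours observation 1
trans_m = [[0.7, 0.3], [0.3, 0.7]]
happy = ([0.5, 0.5], trans_m, [[0.2, 0.8], [0.3, 0.7]])
sad   = ([0.5, 0.5], trans_m, [[0.8, 0.2], [0.7, 0.3]])

shot = [1, 1, 0, 1]  # mostly "happy-looking" observations
label = "happy" if log_forward(shot, *happy) > log_forward(shot, *sad) else "sad"
print(label)  # → happy
```

For sequences this short the linear-domain forward recursion is fine; longer feature sequences would call for log-domain accumulation to avoid underflow.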
More: A Mobile Open Rich Media Environment
Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262612
V. Setlur, T. Çapin, Suresh Chitturi, Ramakrishna Vedantham, M. Ingrassia
'Rich media' is a term that implies the integration of all of the advances we have made in the mobile space delivering music, speech, text, graphics, and video. This is true, but it is more than the sum of its parts. Rich media is the ability to deliver these modalities, to interact with them, and to do so in a way that allows compelling mobile services to be constructed, delivered, and used in an effective and economic manner. In this paper, we introduce a system called the mobile open rich-media environment ('MORE') that helps realize such mobile rich media services, combining technologies from the W3C, OMA, 3GPP, and IETF standards. The components of the system cover formatting, packaging, transporting, rendering, and interacting with rich media files and streams.
Citations: 14
Minimum Distortion Look-Up Table Based Data Hiding
Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262786
Xiaofeng Wang, Xiao-Ping Zhang
In this paper, we present a novel data hiding scheme based on minimum distortion look-up table (LUT) embedding that achieves good distortion-robustness performance. We first analyze the distortion introduced by LUT embedding and formulate its relationship with the run constraints of the LUT. Subsequently, a Viterbi algorithm is presented to find the minimum distortion LUT. Theoretical analysis and numerical results show that the new LUT design achieves not only less distortion but also more robustness than traditional LUT-based data embedding schemes.
Citations: 0
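The Viterbi search mentioned above is a standard dynamic program; a generic minimum-cost version is sketched below, with toy per-stage costs and transition costs standing in for the embedding distortions and run constraints of the paper (both tables are invented).

```python
def viterbi_min_cost(costs, trans):
    """Pick one state per stage so that summed state costs plus transition
    costs are minimal: forward pass keeps the best cost into each state,
    backward pass recovers the optimal path."""
    n_stages, n_states = len(costs), len(costs[0])
    best = list(costs[0])
    back = []
    for t in range(1, n_stages):
        ptr, cur = [], []
        for j in range(n_states):
            c, i = min((best[i] + trans[i][j], i) for i in range(n_states))
            cur.append(c + costs[t][j])
            ptr.append(i)
        best = cur
        back.append(ptr)
    j = min(range(n_states), key=lambda s: best[s])
    path = [j]
    for ptr in reversed(back):   # follow back-pointers to the start
        j = ptr[j]
        path.append(j)
    return path[::-1], min(best)

costs = [[0, 2], [2, 0], [0, 3]]  # toy per-stage "distortion" of each entry
trans = [[0, 1], [1, 0]]          # toy penalty for switching entries
path, cost = viterbi_min_cost(costs, trans)
print(path, cost)  # → [0, 0, 0] 2
```

Staying in state 0 throughout costs 0+0+2+0+0 = 2, beating the alternating path whose transition penalties outweigh its cheaper middle stage.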
Scalable Image Retrieval from Distributed Images Database
Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262899
T. Tillo, Marco Grangetto, G. Olmo
In order to store and retrieve images from large databases, we propose a framework, based on multiple description coding paradigms, that disseminates images over distributed servers. Consequently, decentralized download can be performed, reducing link overload and hotspot areas without penalizing download speed. Moreover, the tradeoff between system reliability and storage requirements can be tuned through description redundancy, providing high flexibility in terms of storage resources, reliability of access, and performance. The scalability of the proposed framework derives from the intrinsic progressivity of the multiple description schemes. Moreover, we demonstrate that the system can work properly despite server crashes.
Citations: 3
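The multiple-description idea can be illustrated in one dimension: split the samples into two descriptions so that either one alone yields a usable coarse reconstruction while both together restore the signal exactly. This odd/even split is the simplest possible scheme, not the paper's image coder, and it assumes an even-length signal.

```python
def make_descriptions(samples):
    """Split a signal into two balanced descriptions: even- and odd-indexed
    samples. Losing either server loses only every second sample."""
    return samples[0::2], samples[1::2]

def reconstruct(d_even, d_odd=None):
    """Interleave both descriptions when available; otherwise fall back to a
    coarse zero-order-hold reconstruction from the surviving one."""
    if d_odd is not None:
        out = []
        for a, b in zip(d_even, d_odd):
            out += [a, b]
        return out
    return [v for v in d_even for _ in (0, 1)]  # repeat each surviving sample

sig = [1, 4, 2, 8, 5, 7]
even, odd = make_descriptions(sig)
print(reconstruct(even, odd))  # → [1, 4, 2, 8, 5, 7]  (exact)
print(reconstruct(even))       # → [1, 1, 2, 2, 5, 5]  (coarse, one server down)
```

Tuning redundancy, as the abstract describes, amounts to how much of each description is duplicated in the other; here the redundancy is zero, so a lost description halves the resolution.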
Music Signal Synthesis using Sinusoid Models and Sliding-Window Esprit
Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262798
Anders Gunnarsson, I. Gu
This paper proposes a music signal synthesis scheme based on sinusoid modeling and sliding-window ESPRIT. Despite widely used audio coding standards, effectively synthesizing music using sinusoid models, which are well suited to harmonic-rich music signals, remains an open issue. In the proposed scheme, music signals are modeled by a sum of damped sinusoids in noise. A sliding-window ESPRIT algorithm is applied. A continuity constraint is then imposed for tracking the time trajectories of sinusoids in music and for removing spurious spectral peaks, in order to adapt to the changing number of sinusoid components in dynamic music. Simulations have been performed on several music signals with a range of complexities, including recordings of banjo, flute, and mixed instruments. Listening tests and spectrograms strongly indicate that the proposed method is very robust for music synthesis with good quality.
Citations: 3
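The signal model above — a sum of exponentially damped sinusoids — is easy to state in code: synthesis is just evaluating the model, while the ESPRIT analysis step that estimates the parameters from audio is omitted here. All partial parameters below are invented for illustration.

```python
import math

def synth(partials, n, fs=8000):
    """Synthesise n samples of the model s(t) = sum_k a_k * exp(-d_k t) *
    cos(2*pi*f_k t + p_k). Each partial is (amplitude, damping, freq_hz, phase)."""
    out = []
    for k in range(n):
        t = k / fs
        out.append(sum(a * math.exp(-d * t) * math.cos(2 * math.pi * f * t + p)
                       for a, d, f, p in partials))
    return out

# a toy "plucked" note: a 440 Hz fundamental plus one faster-decaying harmonic
note = synth([(1.0, 6.0, 440.0, 0.0), (0.4, 12.0, 880.0, 0.0)], 1600)
print(len(note), round(note[0], 2))  # → 1600 1.4  (starts at 1.0 + 0.4)
```

By 0.2 s the damping terms have shrunk the envelope to roughly a third of its initial value, giving the decaying character the model is meant to capture.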
Efficient Hand Gesture Rendering and Decoding using a Simple Gesture Library
Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262916
Jason Smith, L. Yin
Recent work in hand gesture rendering and decoding has treated the two fields as separate and distinct. As rendering work evolves, it emphasizes exact movement replication, including richer muscle and skeletal parameterization. Work in gesture decoding is largely centered on trained systems, which require large amounts of time in front of a camera rendering a gesture in order to decode movement. This paper presents a new scheme which more tightly couples the gesture rendering and decoding processes. While this scheme is simpler than existing techniques, the rendering remains natural looking, and decoding a new gesture does not require extensive training.
Citations: 0