
Latest publications in Found. Trends Signal Process.

Source Coding: Part I of Fundamentals of Source and Video Coding
Pub Date : 2011-01-05 DOI: 10.1561/2000000010
T. Wiegand, H. Schwarz
Digital media technologies have become an integral part of the way we create, communicate, and consume information. At the core of these technologies are source coding methods that are described in this monograph. Based on the fundamentals of information and rate-distortion theory, the most relevant techniques used in source coding algorithms are described: entropy coding, quantization, and predictive and transform coding. The emphasis is on algorithms that are also used in video coding, which will be explained in the other part of this two-part monograph.
Citations: 70
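
To give a concrete feel for the quantization and entropy-coding interplay this monograph covers, here is a minimal sketch of ours (not code from the text; function names and parameters are illustrative): a uniform quantizer applied to Gaussian samples, with the first-order entropy of the quantizer output standing in for the rate an ideal entropy coder could approach. Coarsening the step lowers the rate and raises the distortion, which is the rate-distortion trade-off in miniature.

    import numpy as np

    def uniform_quantize(x, step):
        # Midtread uniform quantizer: round each sample to the nearest
        # reconstruction level k * step.
        return step * np.round(np.asarray(x, dtype=float) / step)

    def empirical_entropy(symbols):
        # First-order entropy in bits per symbol: the rate an ideal
        # entropy coder would need for a memoryless symbol stream.
        _, counts = np.unique(symbols, return_counts=True)
        p = counts / counts.sum()
        return float(-(p * np.log2(p)).sum())

    rng = np.random.default_rng(0)
    x = rng.normal(size=10_000)
    for step in (0.25, 0.5, 1.0):
        q = uniform_quantize(x, step)
        rate = empirical_entropy(q)                # bits per sample
        distortion = float(np.mean((x - q) ** 2))  # mean squared error
        print(f"step={step}: rate ~ {rate:.2f} bits/sample, MSE = {distortion:.4f}")
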
A History of Realtime Digital Speech on Packet Networks: Part II of Linear Predictive Coding and the Internet Protocol
Pub Date : 2010-04-01 DOI: 10.1561/2000000036
R. Gray
In December 1974 the first realtime conversation on the ARPAnet took place between Culler-Harrison Incorporated in Goleta, California, and MIT Lincoln Laboratory in Lexington, Massachusetts. This was the first successful application of realtime digital speech communication over a packet network and an early milestone in the explosion of realtime signal processing of speech, audio, images, and video that we all take for granted today. It could be considered the first voice over Internet Protocol (VoIP), except that the Internet Protocol (IP) had not yet been established. In fact, the interest in realtime signal processing had an indirect, but major, impact on the development of IP. This is the story of the development of linear predictive coded (LPC) speech and how it came to be used in the first successful packet speech experiments. Several related stories are recounted as well. This is the second part of a two-part monograph on linear predictive coding (LPC) and the Internet Protocol (IP). The first part presented an introduction to this history and a tutorial on linear prediction and its applications to speech, providing background and context to the technical history of the second part.
Citations: 3
A Survey of Linear Predictive Coding: Part I of Linear Predictive Coding and the Internet Protocol
Pub Date : 2010-03-01 DOI: 10.1561/2000000029
R. Gray
Linear prediction has long played an important role in speech processing, especially in the development during the late 1960s of the first low bit rate speech compression/coding systems. The approach, which eventually became known as linear predictive coding (LPC), coincidentally came to fruition at the right time to be adopted as the speech compression technique in the first successful realtime packet speech communication through the nascent ARPAnet in December 1974 — the ancestor of voice over the Internet Protocol (IP) and, more generally, of realtime signal processing through the Internet. This first part of a two-part monograph on LPC and IP provides a tutorial overview of linear prediction and its application to speech coding. A variety of viewpoints provides background and context for the second part, which comprises a technical and personal history of LPC, its use in the first packet speech demonstrations, and many related stories of the early applications of LPC and the prehistory of the Internet.
Citations: 7
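
As a pointer to what the linear prediction of the title computes in practice, here is a small sketch of ours using the autocorrelation method and the Levinson-Durbin recursion (one common formulation, not necessarily the monograph's; names are illustrative): each sample is approximated by a weighted sum of a fixed number of past samples, with weights chosen to minimize the prediction-error energy.

    import numpy as np

    def lpc(x, order):
        # Autocorrelation method + Levinson-Durbin recursion.
        # Returns a = [1, a1, ..., a_order] and the residual energy;
        # the predicted sample is x_hat[n] = -(a1*x[n-1] + ... + a_order*x[n-order]).
        x = np.asarray(x, dtype=float)
        r = np.array([x[:len(x) - k] @ x[k:] for k in range(order + 1)])
        a = np.zeros(order + 1)
        a[0] = 1.0
        err = r[0]
        for i in range(1, order + 1):
            acc = r[i] + a[1:i] @ r[1:i][::-1]    # correlation with current predictor
            k = -acc / err                        # reflection coefficient
            a[1:i] = a[1:i] + k * a[1:i][::-1]    # update lower-order coefficients
            a[i] = k
            err *= 1.0 - k * k                    # prediction error shrinks each step
        return a, err

    # A resonant signal is well predicted by a low-order model, which is
    # why LPC yields large compression gains on voiced speech.
    n = np.arange(4000)
    sig = np.sin(0.07 * np.pi * n) + 0.05 * np.random.default_rng(1).normal(size=n.size)
    a, err = lpc(sig, order=10)
    print(a.round(3), err / (sig @ sig))
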
Statistical Methods and Models for Video-Based Tracking, Modeling, and Recognition
Pub Date : 2010-01-06 DOI: 10.1561/2000000007
R. Chellappa, Aswin C. Sankaranarayanan, A. Veeraraghavan, P. Turaga
Computer vision systems attempt to understand a scene and its components from mostly visual information. The geometry exhibited by the real world, the influence of material properties on scattering of incident light, and the process of imaging introduce constraints and properties that are key to interpreting scenes and recognizing objects, their structure and kinematics. In the presence of noisy observations and other uncertainties, computer vision algorithms make use of statistical methods for robust inference. In this monograph, we highlight the role of geometric constraints in statistical estimation methods, and how the interplay between geometry and statistics leads to the choice and design of algorithms for video-based tracking, modeling and recognition of objects. In particular, we illustrate the role of imaging, illumination, and motion constraints in classical vision problems such as tracking, structure from motion, metrology, activity analysis and recognition, and present appropriate statistical methods used in each of these problems.
Citations: 15
Multiple Reference Motion Compensation: A Tutorial Introduction and Survey
Pub Date : 2009-08-22 DOI: 10.1561/2000000019
A. Leontaris, P. Cosman, A. Tourapis
Motion compensation exploits temporal correlation in a video sequence to yield high compression efficiency. Multiple reference frame motion compensation is an extension of motion compensation that exploits temporal correlation over a longer time scale. Devised mainly for increasing compression efficiency, it exhibits useful properties such as enhanced error resilience and error concealment. In this survey, we explore different aspects of multiple reference frame motion compensation, including multihypothesis prediction, global motion prediction, improved error resilience and concealment for multiple references, and algorithms for fast motion estimation in the context of multiple reference frame video encoders.
Citations: 5
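
To make the idea concrete, here is a toy sketch of ours (not taken from the survey; names and parameters are illustrative) of what a multiple-reference search adds over single-reference motion compensation: the encoder searches the same window in several previously decoded frames and keeps whichever reference frame and displacement minimize the matching cost.

    import numpy as np

    def multi_ref_block_match(cur, refs, by, bx, bsize=8, search=4):
        # Exhaustive SAD search over a +/-search window in every reference
        # frame; real encoders use fast search strategies and
        # rate-distortion-aware costs instead of plain SAD.
        block = cur[by:by + bsize, bx:bx + bsize].astype(np.int64)
        best = (None, (0, 0), np.inf)   # (reference index, motion vector, SAD)
        for r, ref in enumerate(refs):
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    y, x = by + dy, bx + dx
                    if y < 0 or x < 0 or y + bsize > ref.shape[0] or x + bsize > ref.shape[1]:
                        continue
                    cand = ref[y:y + bsize, x:x + bsize].astype(np.int64)
                    sad = int(np.abs(block - cand).sum())
                    if sad < best[2]:
                        best = (r, (dy, dx), sad)
        return best   # which reference frame to predict from, and where

    rng = np.random.default_rng(2)
    frames = rng.integers(0, 256, size=(3, 64, 64), dtype=np.uint8)
    print(multi_ref_block_match(frames[-1], frames[:-1], by=24, bx=24))
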
An Introduction to Frames
Pub Date : 2008-10-13 DOI: 10.1561/2000000006
J. Kovacevic, A. Chebira
This survey gives an introduction to redundant signal representations called frames. These representations have recently emerged as yet another powerful tool in the signal processing toolbox and have become popular through use in numerous applications. Our aim is to familiarize a general audience with the area, while at the same time giving a snapshot of the current state-of-the-art.
Citations: 186
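
For context, the standard defining property of a frame (stated in our notation, not necessarily the survey's): a family of vectors φ_k in a Hilbert space H is a frame if there exist constants 0 < A ≤ B < ∞ such that

    A\,\|x\|^2 \;\le\; \sum_k \left|\langle x, \varphi_k\rangle\right|^2 \;\le\; B\,\|x\|^2 \qquad \text{for all } x \in \mathcal{H},

where A = B characterizes a tight frame. The representation is redundant because the number of frame vectors may exceed the dimension of the space.
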
Rethinking Biased Estimation: Improving Maximum Likelihood and the Cramér-Rao Bound
Pub Date : 2008-07-04 DOI: 10.1561/2000000008
Yonina C. Eldar
One of the prime goals of statistical estimation theory is the development of performance bounds when estimating parameters of interest in a given model, as well as constructing estimators that achieve these limits. When the parameters to be estimated are deterministic, a popular approach is to bound the mean-squared error (MSE) achievable within the class of unbiased estimators. Although it is well-known that lower MSE can be obtained by allowing for a bias, in applications it is typically unclear how to choose an appropriate bias. In this survey we introduce MSE bounds that are lower than the unbiased Cramer–Rao bound (CRB) for all values of the unknowns. We then present a general framework for constructing biased estimators with smaller MSE than the standard maximum-likelihood (ML) approach, regardless of the true unknown values. Specializing the results to the linear Gaussian model, we derive a class of estimators that dominate least-squares in terms of MSE. We also introduce methods for choosing regularization parameters in penalized ML estimators that outperform standard techniques such as cross validation.
Citations: 78
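
The reason a bias can pay off is the standard decomposition of the mean-squared error (a textbook identity recalled here for context, not a result specific to this survey):

    \mathrm{MSE}(\hat{\theta}) \;=\; \mathbb{E}\,\bigl\|\hat{\theta}-\theta\bigr\|^2 \;=\; \operatorname{tr}\bigl(\operatorname{Cov}(\hat{\theta})\bigr) \;+\; \bigl\|\mathbb{E}[\hat{\theta}]-\theta\bigr\|^2 .

An estimator that accepts a small bias in exchange for a larger reduction in variance, for instance shrinking the ML estimate toward zero by a factor between 0 and 1, can have lower MSE than the unbiased estimate for some parameter values; the survey's contribution is constructing bounds and estimators whose MSE is smaller for all values of the unknowns.
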
Set Partition Coding: Part II of Set Partition Coding and Image Wavelet Coding Systems
Pub Date : 2008-03-01 DOI: 10.1561/2000000014
W. Pearlman, A. Said
This monograph describes current-day wavelet transform image coding systems. As in the first part, steps of the algorithms are explained thoroughly and set apart. An image coding system consists of several stages: transformation, quantization, set partition or adaptive entropy coding or both, decoding including rate control, inverse transformation, de-quantization, and optional processing (see Figure 1.6). Wavelet transform systems can provide many desirable properties besides high efficiency, such as scalability in quality, scalability in resolution, and region-of-interest access to the coded bitstream. These properties are built into the JPEG2000 standard, so its coding will be fully described. Since JPEG2000 codes subblocks of subbands, other methods, such as SBHP (Subband Block Hierarchical Partitioning) [3] and EZBC (Embedded Zero Block Coder) [8], that code subbands or their subblocks independently are also described. The emphasis in this part is on the use of the basic algorithms presented in the previous part in ways that achieve these desirable bitstream properties. In this vein, we describe a modification of the tree-based coding in SPIHT (Set Partitioning In Hierarchical Trees) [15], whose output bitstream can be decoded partially corresponding to a designated region of interest and is simultaneously quality and resolution scalable. This monograph is extracted and adapted from the forthcoming textbook entitled Digital Signal Compression: Principles and Practice by William A. Pearlman and Amir Said, Cambridge University Press, 2009.
Citations: 19
Set Partition Coding: Part I of Set Partition Coding and Image Wavelet Coding Systems
Pub Date : 2008-02-28 DOI: 10.1561/2000000013
W. Pearlman, A. Said
The purpose of this two-part monograph is to present a tutorial on set partition coding, with emphasis and examples on image wavelet transform coding systems, and describe their use in modern image coding systems. Set partition coding is a procedure that recursively splits groups of integer data or transform elements guided by a sequence of threshold tests, producing groups of elements whose magnitudes are between two known thresholds, therefore, setting the maximum number of bits required for their binary representation. It produces groups of elements whose magnitudes are less than a certain known threshold. Therefore, the number of bits for representing an element in a particular group is no more than the base-2 logarithm of its threshold rounded up to the nearest integer. SPIHT (Set Partitioning in Hierarchical Trees) and SPECK (Set Partitioning Embedded blocK) are popular state-of-the-art image coders that use set partition coding as the primary entropy coding method. JPEG2000 and EZW (Embedded Zerotree Wavelet) use it in an auxiliary manner. Part I elucidates the fundamentals of set partition coding and explains the setting of thresholds and the block and tree modes of partitioning. Algorithms are presented for the techniques of AGP (Amplitude and Group Partitioning), SPIHT, SPECK, and EZW. Numerical examples are worked out in detail for the latter three techniques. Part II describes various wavelet image coding systems that use set partitioning primarily, such as SBHP (Subband Block Hierarchical Partitioning), SPIHT, and EZBC (Embedded Zero-Block Coder). The basic JPEG2000 coder is also described. The coding procedures and the specific methods are presented both logically and in algorithmic form, where possible. Besides the obvious objective of obtaining small file sizes, much emphasis is placed on achieving low computational complexity and desirable output bitstream attributes, such as embeddedness, scalability in resolution, and random access decodability. This monograph is extracted and adapted from the forthcoming textbook entitled Digital Signal Compression: Principles and Practice by William A. Pearlman and Amir Said, Cambridge University Press, 2009.
Citations: 43
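
As a toy illustration of ours of the bit-count arithmetic described in this abstract (not an excerpt from the monograph): with dyadic thresholds, a coefficient whose magnitude falls between 2^(b-1) and 2^b needs at most b magnitude bits, so the threshold test that assigns it to a group also fixes the bit budget for coding that group.

    def significance_groups(coeffs):
        # Group integer coefficients by the dyadic interval
        # 2**(b-1) <= |c| < 2**b; every member of group b can be coded
        # with b magnitude bits (the sign is handled separately), i.e.
        # b = ceil(log2(threshold)) for the upper threshold 2**b.
        groups = {}
        for c in coeffs:
            b = abs(int(c)).bit_length()   # 0 for zero coefficients
            groups.setdefault(b, []).append(c)
        return groups

    # |5| = 0b101 lies between thresholds 4 and 8, so it joins group b = 3;
    # |-12| = 0b1100 joins group b = 4 and needs 4 magnitude bits.
    print(significance_groups([0, 5, -12, 3, 130]))
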
Introduction to Digital Speech Processing
Pub Date : 2007-11-30 DOI: 10.1561/2000000001
L. Rabiner, R. Schafer
Since even before the time of Alexander Graham Bell's revolutionary invention, engineers and scientists have studied the phenomenon of speech communication with an eye on creating more efficient and effective systems of human-to-human and human-to-machine communication. Starting in the 1960s, digital signal processing (DSP) assumed a central role in speech studies, and today DSP is the key to realizing the fruits of the knowledge that has been gained through decades of research. Concomitant advances in integrated circuit technology and computer architecture have aligned to create a technological environment with virtually limitless opportunities for innovation in speech communication applications. In this text, we highlight the central role of DSP techniques in modern speech communication research and applications. We present a comprehensive overview of digital speech processing that ranges from the basic nature of the speech signal, through a variety of methods of representing speech in digital form, to applications in voice communication and automatic synthesis and recognition of speech. The breadth of this subject does not allow us to discuss any aspect of speech processing to great depth; hence our goal is to provide a useful introduction to the wide range of important concepts that comprise the field of digital speech processing. A more comprehensive treatment will appear in the forthcoming book, Theory and Application of Digital Speech Processing [101].
Citations: 29