2008 IEEE International Conference on Acoustics, Speech and Signal Processing: Latest Articles

Rate-optimal MIMO transmission with mean and covariance feedback at low SNR
Pub Date : 2009-02-24 DOI: 10.1109/TVT.2009.2015670
R. Gohary, W. Mesbah, T. Davidson
We consider a multiple-input multiple-output (MIMO) wireless communication scenario in which the channel follows a general spatially correlated complex Gaussian distribution with non-zero mean. We derive an explicit characterization of the optimal input covariance from an ergodic rate perspective for systems that operate at low SNRs. This characterization is in terms of the eigendecomposition of a matrix that depends on the mean and the covariance of the channel, and typically results in a beamforming strategy along the principal eigenvector of that matrix. Simulation results show the potential impact of (jointly) exploiting the mean and the covariance of the channel on the ergodic achievable rate at both low and moderate-to-high SNRs.
Citations: 3
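The beamforming result in the MIMO entry above can be illustrated with a small sketch. It is not the paper's derivation: it assumes a Kronecker-correlated channel H = H_mean + R_rx^(1/2) W R_tx^(1/2) with W i.i.d. CN(0,1), and uses only the first-order low-SNR expansion, under which the ergodic rate is proportional to tr(E[H^H H] Q), so the rate-maximizing Q under a power constraint beamforms along the principal eigenvector of E[H^H H] = H_mean^H H_mean + tr(R_rx) R_tx. The matrix the paper actually characterizes may differ, and all function names here are hypothetical.

```python
import math

def matmul(a, b):
    """Plain matrix product for nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def hermitian(a):
    """Conjugate transpose."""
    return [[a[i][j].conjugate() for i in range(len(a))]
            for j in range(len(a[0]))]

def low_snr_matrix(h_mean, r_tx, r_rx):
    """E[H^H H] under the assumed Kronecker model (an illustration only)."""
    gram = matmul(hermitian(h_mean), h_mean)
    trace_rx = sum(r_rx[i][i] for i in range(len(r_rx))).real
    n = len(r_tx)
    return [[gram[i][j] + trace_rx * r_tx[i][j] for j in range(n)]
            for i in range(n)]

def principal_eigenvector(m, iters=500):
    """Power iteration for a Hermitian positive semidefinite matrix.

    The start vector is slightly asymmetric; it must not be orthogonal
    to the principal eigenvector for the iteration to converge to it.
    """
    n = len(m)
    v = [complex(1 + 0.1 * i, 0) for i in range(n)]
    for _ in range(iters):
        w = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(abs(x) ** 2 for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient of the unit-norm vector gives the eigenvalue.
    eigval = sum(v[i].conjugate() * sum(m[i][j] * v[j] for j in range(n))
                 for i in range(n)).real
    return eigval, v

def beamforming_covariance(v, power):
    """Rank-one input covariance Q = power * v v^H (trace equals power)."""
    n = len(v)
    return [[power * v[i] * v[j].conjugate() for j in range(n)]
            for i in range(n)]

# Toy 2x2 example: mean along the first antenna, correlated transmit side.
m = low_snr_matrix([[1, 0], [0, 0]], [[1, 0.5], [0.5, 1]], [[1, 0], [0, 1]])
lam, v = principal_eigenvector(m)
q = beamforming_covariance(v, power=2.0)
```

In this toy case the matrix is [[3, 1], [1, 2]], whose largest eigenvalue is (5 + sqrt(5)) / 2. Because the mean and the covariance both contribute to that matrix, the beam direction generally differs from what either statistic alone would suggest.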
Complexity adaptive H.264 encoding using multiple reference frames
Pub Date : 2008-10-06 DOI: 10.1109/ICASSP.2008.4517787
Sui-Yuk Lam, O. Au, P. Wong
The state-of-the-art H.264/AVC video coding standard achieves significant improvements in coding efficiency by introducing many new coding techniques. However, the computational complexity is inevitably increased in both the encoding and decoding processes. Many previous works, such as fast motion estimation and fast mode decision algorithms, aim to reduce encoder complexity while maintaining coding efficiency. In this paper, we propose a new encoding approach that accounts for decoding complexity. Simulation results show that the decoding complexity can be reduced by up to 15% in terms of motion compensation operations, the most complex part of the decoder, while R-D performance degrades by only about 0.1 dB.
Citations: 1
Multisensor very low bit rate speech coding using segment quantization
Pub Date : 2008-05-12 DOI: 10.1109/ICASSP.2008.4518530
A. McCree, K. Brady, T. Quatieri
We present two approaches to noise-robust very low bit rate speech coding using wideband MELP analysis/synthesis. Both methods exploit multiple acoustic and non-acoustic input sensors, using our previously presented dynamic waveform fusion algorithm to simultaneously perform waveform fusion, noise suppression, and cross-channel noise cancellation. One coder uses a 600 bps scalable phonetic vocoder, with a phonetic speech recognizer followed by joint predictive vector quantization of the error in wideband MELP parameters. The second coder operates at 300 bps with fixed 80 ms segments, using novel variable-rate multistage matrix quantization techniques. Formal test results show that both coders achieve intelligibility equivalent to that of the 2.4 kbps NATO standard MELPe coder in harsh acoustic noise environments, at much lower bit rates, with only modest quality loss.
Citations: 12
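The bit rates quoted in the speech coding entry above imply a very small bit budget per segment. The arithmetic below is derived only from the figures in the abstract (300 bps, 80 ms segments, 600 bps, 2.4 kbps); the function names are hypothetical, and the abstract does not say how the bits are allocated within a segment.

```python
def bits_per_segment(bit_rate_bps, segment_ms):
    """Average number of bits available for one segment of the given length."""
    return bit_rate_bps * segment_ms / 1000.0

def rate_ratio(reference_bps, coder_bps):
    """How many times lower the coder's rate is than the reference rate."""
    return reference_bps / coder_bps

print(bits_per_segment(300, 80))   # 24.0 bits for each 80 ms segment
print(rate_ratio(2400, 300))       # 8.0 times below the 2.4 kbps reference
print(rate_ratio(2400, 600))       # 4.0 times below the 2.4 kbps reference
```

Since the abstract describes the 300 bps quantizer as variable-rate, 24 bits is best read as an average per segment rather than a fixed allocation.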
Improved image authentication using closed-form compensation and spread-spectrum watermarking
Pub Date : 2008-05-12 DOI: 10.1109/ICASSP.2008.4517976
S. Ababneh, R. Ansari, A. Khokhar
This paper presents an image authentication scheme based on compensated watermarking, employing a Lagrangian-based closed-form solution to compensate for signature perturbation due to the embedding operation. The proposed scheme uses a spread-spectrum watermarking technique and a blind detector, making it attractive for applications that may not have the original image available at the time of authentication. Existing compensated signature embedding frameworks use an iterative mechanism to reach a desired compensation. The iterative approach is time-consuming and less effective than the closed-form approach proposed in this paper, which performs an accurate compensation in one step while meeting a minimum-distortion criterion (least mean square image distortion) to guarantee image fidelity. Simulation results are presented to show the proposed scheme's efficiency and accuracy.
Citations: 4
Robust sensor estimation using temporal information
Pub Date : 2008-05-12 DOI: 10.1109/ICASSP.2008.4518050
Chao Yuan, Claus Neubauer
We propose a dynamic Bayesian framework for sensor estimation, a critical step of many machine condition monitoring systems. The temporal behavior of normal sensor data is described by a stationary switching autoregressive (SSAR) model that possesses two advantages over traditional switching autoregressive (SAR) models. First, the SSAR model removes time dependency of signals during mode switching and fits sensor data better. Second, the SSAR model is stationary in that, at each time, sensor data have the same distribution, which represents the normal operating range of a system; this ensures that estimates are accurate and are not distracted by deviations. During monitoring, the deviation covariance is estimated adaptively, which effectively handles variable levels of deviations. Tests on gas turbine data are presented.
Citations: 2
A probabilistic union approach to robust face recognition with partial distortion and occlusion
Pub Date : 2008-05-12 DOI: 10.1109/ICASSP.2008.4517779
Jie Lin, J. Ming, D. Crookes
This paper presents a new approach to face recognition where the images are subject to unknown, partial distortion/occlusion. The new approach is a probabilistic decision-based neural network (PDBNN), built on a statistical method called the posterior union model (PUM). PUM is an approach for ignoring severely mismatched local features and focusing the recognition mainly on the matched local features. It thereby improves the robustness while assuming no prior information about the corruption. We call the new approach the posterior union decision-based neural network (PUDBNN). The new PUDBNN has been evaluated on two face image databases, XM2VTS and ORL, using testing images subjected to various types of partial distortion and occlusion. The new system has demonstrated improved performance over other systems.
Citations: 11
A novel approach to part-of-speech tagging based on latent analogy
Pub Date : 2008-05-12 DOI: 10.1109/ICASSP.2008.4518702
J. Bellegarda
Part-of-speech tagging is a necessary pre-processing step for many natural language tasks. Recent statistical approaches, such as conditional random fields, rely on well-chosen feature functions to ensure that important characteristics of the empirical training distribution are reflected in the trained model. In practice, however, it is not always clear how best to select these feature functions in order to obtain a suitably robust model. This paper proposes an alternative strategy based on the principle of latent analogy. For each sentence under consideration, we construct a neighborhood of globally relevant training sentences through an appropriate data-driven mapping of the input surface form. Tagging then proceeds via locally optimal sequence alignment and maximum likelihood position scoring. Empirical evidence shows that this solution is competitive with state-of-the-art Markovian techniques.
Citations: 0
I/Q imbalance mitigation for STBC MIMO-OFDM communication systems
Pub Date : 2008-05-12 DOI: 10.1109/ICASSP.2008.4518304
Mingzheng Cao, H. Ge
In this work, we study the performance degradation caused by in-phase/quadrature (I/Q) imbalance in space-time block coded (STBC) multiple-input multiple-output (MIMO) orthogonal frequency division multiplexing (OFDM) communication systems. The 2-Tx Alamouti scheme, the 4-Tx quasi-orthogonal STBC (QOSTBC) scheme, and the 4-Tx rotated QOSTBC (RQOSTBC) scheme with I/Q imbalance are examined in detail. Our study shows that I/Q imbalance causes severe distortion in STBC MIMO-OFDM systems. By exploiting the structure of the received signal, low-complexity solutions are developed to mitigate the resultant distortion successfully.
Citations: 11
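To make the distortion mechanism in the I/Q imbalance entry above concrete, here is a minimal sketch of a commonly used baseband model, y = mu * x + nu * conj(x), together with its exact inverse when mu and nu are known. The parametrization of mu and nu by a gain mismatch and a phase mismatch is one textbook convention and is an assumption here, as are all function names; the paper's low-complexity STBC-specific solutions are not reproduced.

```python
import cmath

def iq_coefficients(gain, phase_rad):
    """Assumed convention: gain = amplitude mismatch, phase_rad = phase mismatch."""
    mu = (1 + gain * cmath.exp(-1j * phase_rad)) / 2
    nu = (1 - gain * cmath.exp(1j * phase_rad)) / 2
    return mu, nu

def apply_iq_imbalance(x, mu, nu):
    """Distorted sample: the wanted term plus a conjugate (image) term."""
    return mu * x + nu * x.conjugate()

def compensate(y, mu, nu):
    """Invert the model, assuming mu and nu are known and |mu| != |nu|."""
    return (mu.conjugate() * y - nu * y.conjugate()) / (abs(mu) ** 2 - abs(nu) ** 2)

mu, nu = iq_coefficients(gain=1.1, phase_rad=0.1)
symbol = complex(1, -1)              # one unnormalized QPSK symbol
received = apply_iq_imbalance(symbol, mu, nu)
recovered = compensate(received, mu, nu)
print(abs(received - symbol))        # non-zero: the symbol is distorted
print(abs(recovered - symbol))       # close to zero up to rounding
```

With perfect matching (gain 1, phase 0) the coefficients reduce to mu = 1 and nu = 0, so the conjugate term vanishes. In an OFDM system the conjugate term appears in the frequency domain as interference from the mirror subcarrier, which is why the imbalance interacts with the code structure across subcarriers.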
Query by humming of MIDI and audio using locality sensitive hashing
Pub Date : 2008-05-12 DOI: 10.1109/ICASSP.2008.4518093
M. Ryynänen, Anssi Klapuri
This paper proposes a query-by-humming method based on locality sensitive hashing (LSH). The method constructs an index of melodic fragments by extracting pitch vectors from a database of melodies. In retrieval, the method automatically transcribes a sung query into notes and then extracts pitch vectors in the same way as in index construction. For each query pitch vector, the method searches for similar melodic fragments in the database to obtain a list of candidate melodies. This is performed efficiently by using LSH. The candidate melodies are ranked by their distance to the entire query and returned to the user. In our experiments, the method achieved a mean reciprocal rank of 0.885 for 2797 queries when searching a database of 6030 MIDI melodies. To retrieve audio signals, we apply an automatic melody transcription method to construct the melody database directly from music recordings and report the corresponding retrieval results.
Citations: 108
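The index-then-rank pipeline in the query-by-humming entry above can be sketched as follows. This is an illustration under stated assumptions: it uses the random-hyperplane variant of LSH for brevity (the abstract does not say which hash family the paper uses), treats pitch vectors as short lists of semitone offsets, and ranks candidates by Euclidean distance. The class and function names are hypothetical. The mean reciprocal rank helper follows the standard definition of the metric quoted in the abstract.

```python
import math
import random
from collections import defaultdict

class PitchLSH:
    """Several hash tables, each keyed by the signs of random projections."""

    def __init__(self, dim, n_bits=8, n_tables=4, seed=0):
        rng = random.Random(seed)
        self.planes = [[[rng.gauss(0, 1) for _ in range(dim)]
                        for _ in range(n_bits)] for _ in range(n_tables)]
        self.tables = [defaultdict(set) for _ in range(n_tables)]

    def _key(self, table_idx, vec):
        # One bit per hyperplane: which side of the plane the vector lies on.
        return tuple(sum(p * v for p, v in zip(plane, vec)) >= 0
                     for plane in self.planes[table_idx])

    def add(self, melody_id, vec):
        for t, table in enumerate(self.tables):
            table[self._key(t, vec)].add(melody_id)

    def candidates(self, vec):
        """Union of the buckets the query falls into across all tables."""
        found = set()
        for t, table in enumerate(self.tables):
            found |= table.get(self._key(t, vec), set())
        return found

def rank_by_distance(query, vectors_by_id):
    """Order melody ids by Euclidean distance to the query, nearest first."""
    return sorted(vectors_by_id, key=lambda k: math.dist(query, vectors_by_id[k]))

def mean_reciprocal_rank(ranks):
    """ranks[i] is the 1-based rank of the correct melody for query i."""
    return sum(1.0 / r for r in ranks) / len(ranks)

database = {"tune_a": [0, 2, 4, 5, 7], "tune_b": [0, -1, -3, -5, -8]}
index = PitchLSH(dim=5)
for melody_id, pitch_vector in database.items():
    index.add(melody_id, pitch_vector)

query = [0, 2, 4, 5, 7]
shortlist = {k: database[k] for k in index.candidates(query)}
print(rank_by_distance(query, shortlist))
```

Hashing only narrows the search to a shortlist; the final ordering still comes from an explicit distance computation, which mirrors the two-stage structure described in the abstract. A mean reciprocal rank of 0.885 means the correct melody is, on average, at or very near the top of the returned list.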
Localization of chemical sources using stochastic differential equations
Pub Date : 2008-05-12 DOI: 10.1109/ICASSP.2008.4518174
Ashraf Atalla, A. Jeremic
Localization of chemical sources and prediction of their spread is an important issue in many applications. We propose a computationally efficient framework for localizing low-intensity chemical sources using stochastic differential equations. The main advantage of this technique is that it accounts for random effects such as Brownian motion, which are not accounted for in commonly used classical techniques based on Fick's law of diffusion. We model the dispersion using the Fokker-Planck equation and derive the corresponding inverse model. We then derive the maximum likelihood estimator of source intensity, location, and release time. We demonstrate the applicability of our results using numerical examples.
Citations: 7