
2020 28th European Signal Processing Conference (EUSIPCO): Latest Articles

Robust Acoustic Scene Classification to Multiple Devices Using Maximum Classifier Discrepancy and Knowledge Distillation
Pub Date : 2021-01-24 DOI: 10.23919/Eusipco47968.2020.9287734
Saori Takeyama, Tatsuya Komatsu, Koichi Miyazaki, M. Togami, Shunsuke Ono
This paper proposes an acoustic scene classification (ASC) method that is robust across multiple recording devices, using maximum classifier discrepancy (MCD) and knowledge distillation (KD). The proposed method employs domain adaptation (DA) to train multiple ASC models, each dedicated to one device, and combines these device-specific models into a single multi-domain ASC model using KD. For domain adaptation, the proposed method uses MCD to align class distributions, which conventional DA methods for ASC have ignored. The multi-device robust ASC model is obtained by KD, which combines the MCD-trained device-specific ASC models, each of which may perform worse on non-target devices. Our experiments show that the proposed MCD-based device-specific model improved ASC accuracy by at most 12.22% for target samples, and the proposed KD-based device-general model improved ASC accuracy by 2.13% on average across all devices.
Pages: 36-40
Citations: 9
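The abstract above does not state which distillation loss it uses. As a minimal sketch, assuming the generic temperature-softened form of knowledge distillation, and assuming (hypothetically) that the device-specific teachers are combined by averaging their soft outputs, the student's training signal could look like the following. The names `softmax` and `distillation_loss` and the temperature value are illustrative and are not taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with a temperature that softens the output."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits_list, temperature=2.0):
    """KL divergence from the averaged teacher soft targets to the student.

    Averaging the teachers is an assumption of this sketch, not a detail
    confirmed by the abstract.
    """
    teacher_probs = [softmax(t, temperature) for t in teacher_logits_list]
    n_teachers = len(teacher_probs)
    target = [
        sum(p[k] for p in teacher_probs) / n_teachers
        for k in range(len(student_logits))
    ]
    student = softmax(student_logits, temperature)
    return sum(t * math.log(t / s) for t, s in zip(target, student) if t > 0)


# One teacher per device; the student is pulled towards their average.
loss = distillation_loss([0.2, 1.5, 0.1], [[0.0, 2.0, 0.0], [0.5, 1.0, 0.2]])
```

The loss is zero when the student reproduces the averaged teacher distribution exactly and positive otherwise, which is what lets a single device-general student absorb several device-specific models.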
Faster independent low-rank matrix analysis with pairwise updates of demixing vectors
Pub Date : 2021-01-24 DOI: 10.23919/Eusipco47968.2020.9287508
Taishi Nakashima, Robin Scheibler, Yukoh Wakabayashi, Nobutaka Ono
In this paper, we present an algorithm for independent low-rank matrix analysis (ILRMA) of three or more sources that is faster than conventional ILRMA. In conventional ILRMA, demixing vectors are updated one by one by the iterative projection (IP) method. The update rules of IP are derived from a system of quadratic equations obtained by differentiating the objective function of ILRMA with respect to the demixing vectors. This system of quadratic equations is called hybrid exact-approximate joint diagonalization (HEAD), and no closed-form solution is known yet for three or more sources. Recently, a method that can update two demixing vectors simultaneously has been proposed for independent vector analysis. The method is derived by reducing HEAD for two sources to a generalized eigenvalue problem and solving that problem. Furthermore, the pairwise updates have recently been extended to the case of three or more sources. However, the efficacy of the pairwise updates for ILRMA has not yet been investigated. Therefore, in this work, we apply the pairwise updates of demixing vectors to ILRMA. By replacing the update rules of demixing vectors with the proposed pairwise updates, we accelerate the convergence of ILRMA. The experimental results show that the proposed method yields faster convergence and better performance than conventional ILRMA.
Pages: 301-305
Citations: 5
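For orientation, the iterative projection update mentioned in the ILRMA abstract above is usually written as follows in the ILRMA literature. This is a sketch drawn from that general literature rather than from the abstract itself, and the notation is an assumption of this note: $W_i = [w_{i,1}, \dots, w_{i,N}]^{\mathsf{H}}$ is the demixing matrix at frequency bin $i$, $w_{i,n}$ its $n$-th demixing vector, $x_{ij}$ the observed mixture in frame $j$, $r_{ij,n}$ the low-rank variance model of source $n$, $J$ the number of frames, and $e_n$ the $n$-th unit vector.

```latex
V_{i,n} = \frac{1}{J} \sum_{j=1}^{J} \frac{1}{r_{ij,n}} \, x_{ij} x_{ij}^{\mathsf{H}},
\qquad
w_{i,n} \leftarrow \left( W_i V_{i,n} \right)^{-1} e_n,
\qquad
w_{i,n} \leftarrow \frac{w_{i,n}}{\sqrt{w_{i,n}^{\mathsf{H}} V_{i,n} \, w_{i,n}}}
```

Under these assumptions, the HEAD conditions read $w_{i,m}^{\mathsf{H}} V_{i,n} w_{i,n} = \delta_{mn}$ for all $m, n$. One IP step makes them hold for a single $n$ while the other demixing vectors stay fixed, whereas the pairwise scheme described in the abstract updates two vectors at once.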
A Provably Accurate Algorithm for Recovering Compactly Supported Smooth Functions from Spectrogram Measurements
Pub Date : 2021-01-24 DOI: 10.23919/Eusipco47968.2020.9287698
Michael Perlmutter, N. Sissouno, A. Viswanathan, M. Iwen
We present an algorithm which is closely related to direct phase retrieval methods that have been shown to work well empirically [1], [2] and prove that it is guaranteed to recover (up to a global phase) a large class of compactly supported smooth functions from their spectrogram measurements. As a result, we take a first step toward developing a new class of practical phaseless imaging algorithms capable of producing provably accurate images of a given sample after it is masked by just a few shifts of a fixed periodic grating.
Pages: 970-974
Citations: 0
Memory Requirement Reduction of Deep Neural Networks for Field Programmable Gate Arrays Using Low-Bit Quantization of Parameters
Pub Date : 2021-01-24 DOI: 10.23919/Eusipco47968.2020.9287739
Niccoló Nicodemo, Gaurav Naithani, K. Drossos, T. Virtanen, R. Saletti
Effective employment of deep neural networks (DNNs) in mobile devices and embedded systems, like field programmable gate arrays, is hampered by requirements for memory and computational power. In this paper we propose a method that employs a non-uniform fixed-point quantization and a virtual bit shift (VBS) to improve the accuracy of the quantization of the DNN weights. We evaluate our method in a speech enhancement application, where a fully connected DNN is used to predict the clean speech spectrum from the input noisy speech spectrum. A DNN is optimized, its memory requirement is calculated, and its performance is evaluated using the short-time objective intelligibility (STOI) metric. The application of the low-bit quantization leads to a 50% reduction of the DNN memory requirement while the STOI performance drops only by 2.7%.
Pages: 466-470
Citations: 3
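The quantization abstract above does not spell out its non-uniform scheme or the virtual bit shift, so neither is reproduced here. As a minimal sketch of the underlying idea only, assuming plain uniform signed fixed-point rounding with saturation, the following shows how a weight is mapped to a low-bit word and how the memory footprint scales with the word length. The function names and the word lengths in the usage lines are hypothetical.

```python
def quantize_fixed_point(value, word_bits, frac_bits):
    """Round `value` to a signed fixed-point grid and saturate on overflow.

    word_bits: total bits per stored weight (including the sign bit).
    frac_bits: bits to the right of the binary point.
    """
    scale = 1 << frac_bits
    code_min = -(1 << (word_bits - 1))
    code_max = (1 << (word_bits - 1)) - 1
    code = round(value * scale)            # nearest representable integer code
    code = max(code_min, min(code_max, code))
    return code / scale


def memory_bits(num_weights, word_bits):
    """Storage needed for the weights alone, in bits."""
    return num_weights * word_bits


weights = [0.3, -0.72, 1.05]
quantized = [quantize_fixed_point(w, word_bits=8, frac_bits=6) for w in weights]
saving = 1 - memory_bits(len(weights), 8) / memory_bits(len(weights), 16)
```

Halving the word length halves the weight storage; the accuracy cost depends on how well the chosen grid covers the weight distribution, which is the part the paper's non-uniform quantization and VBS are meant to improve.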
PRNU-leaks: facts and remedies
Pub Date : 2021-01-24 DOI: 10.23919/Eusipco47968.2020.9287451
F. Pérez-González, Samuel Fernández-Menduiña
We address the problem of information leakage from estimates of the PhotoResponse Non-Uniformity (PRNU) fingerprints of a sensor. This leakage may compromise privacy in forensic scenarios, as it may reveal information from the images used in the PRNU estimation. We propose a new way to compute the information-theoretic leakage that is based on embedding synthetic PRNUs, and present affordable approximations and bounds. We also propose a new compact measure for the performance in membership inference tests. Finally, we analyze two potential countermeasures against leakage: binarization, which was already used in PRNU-storage contexts, and equalization, which is novel and offers better performance. Theoretical results are validated with experiments carried out on a real-world image dataset.
Pages: 720-724
Citations: 6
CNN-based Note Onset Detection using Synthetic Data Augmentation
Pub Date : 2021-01-24 DOI: 10.23919/Eusipco47968.2020.9287621
Mina Mounir, P. Karsmakers, T. Waterschoot
Detecting the onset of notes in music excerpts is a fundamental problem in many music signal processing tasks, including analysis, synthesis, and information retrieval. When addressing the note onset detection (NOD) problem using a data-driven methodology, a major challenge is the availability and quality of labeled datasets used for both model training/tuning and evaluation. As most of the available datasets are manually annotated, the amount of annotated music excerpts is limited and the annotation strategy and quality vary across datasets. To counter both problems, in this paper we propose to use semi-synthetic datasets where the music excerpts are mixes of isolated note recordings. The advantage resides in the annotations being automatically generated while mixing the notes, as isolated note onsets are straightforward to detect using a simple energy measure. A semi-synthetic dataset is used in this work for augmenting a real piano dataset when training a convolutional neural network (CNN) with three novel model training strategies. Training the CNN on a semi-synthetic dataset and retraining only the CNN classification layers on a real dataset results in higher average F1 scores with lower variance.
Pages: 171-175
Citations: 2
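The note onset abstract above says that onsets of isolated notes can be found with "a simple energy measure" but gives no details. One possible reading, offered as a sketch under that assumption, is to compute short-time frame energies and take the first frame that reaches a fixed fraction of the peak energy. The frame length, the relative threshold, and the function name `detect_onset` are hypothetical choices, not values from the paper.

```python
def detect_onset(samples, frame_len=64, rel_threshold=0.1):
    """Return the start index of the first frame whose energy reaches
    rel_threshold times the largest frame energy, or None for silence."""
    energies = [
        sum(x * x for x in samples[i:i + frame_len])
        for i in range(0, len(samples), frame_len)
    ]
    peak = max(energies, default=0.0)
    if peak == 0.0:
        return None
    for index, energy in enumerate(energies):
        if energy >= rel_threshold * peak:
            return index * frame_len
    return None


# A toy "isolated note": silence followed by a constant-amplitude segment.
note = [0.0] * 256 + [1.0] * 256
onset_sample = detect_onset(note)
```

When such notes are mixed into an excerpt at known offsets, the onset annotation of the mix is the per-note onset plus the offset, which is why the labels come for free.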
Generic Compression of Off-The-Air Radio Frequency Signals with Grouped-Bin FFT Quantisation
Pub Date : 2021-01-24 DOI: 10.23919/Eusipco47968.2020.9287457
D. Muir, L. Crockett, R. Stewart
This paper studies the capabilities of a proposed lossy, grouped-bin FFT quantisation compression method for targeting Off-The-Air (OTA) Radio Frequency (RF) signals. The bins within a 512-point Fast Fourier Transform (FFT) are split into groups of adjacent bins, and these groups are each quantised separately. Additional compression can be achieved by setting groups which are not deemed to contain significant information to zero, based on a pre-defined minimum magnitude threshold. In this paper, we propose two alternative methods for quantising the remaining groups. The first of these, Grouped-bin FFT Threshold Quantisation (GFTQ), involves allocating quantisation wordlengths based on several pre-defined magnitude thresholds. The second, Grouped-bin FFT Error Quantisation (GFEQ), involves incrementing the quantisation wordlength for each group until the calculated quantisation error falls below a minimum error threshold. Both algorithms were tested for a variety of signal types, including Digital Private Mobile Radio 446 MHz (dPMR446), which was considered as a case study. While GFTQ allowed for higher Compression Ratios (CR), the compression process resulted in added quantisation noise. The GFEQ algorithm achieved lower CRs, but also lower noise levels across all test signals.
Pages: 1767-1771
Citations: 1
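The grouped-bin abstract above describes GFEQ as raising the wordlength of each group until the quantisation error drops below a threshold, after zeroing groups that fall under a minimum magnitude. The sketch below illustrates that loop under several assumptions that are not in the abstract: bins are treated as real values (a complex FFT would need the real and imaginary parts handled separately), the quantiser is uniform and scaled to each group's peak, and the error measure is the mean squared error. All function names and parameter values are hypothetical.

```python
def quantise(values, bits, full_scale):
    """Uniformly quantise values in [-full_scale, full_scale] to `bits` bits."""
    levels = (1 << (bits - 1)) - 1           # largest positive integer code
    step = full_scale / levels
    return [max(-levels, min(levels, round(v / step))) * step for v in values]


def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def gfeq_group(values, err_threshold, min_bits=2, max_bits=16):
    """Grow the wordlength for one group until the error is small enough."""
    full_scale = max(abs(v) for v in values)
    if full_scale == 0.0:
        return 0, [0.0] * len(values)
    quantised = list(values)
    for bits in range(min_bits, max_bits + 1):
        quantised = quantise(values, bits, full_scale)
        if mean_squared_error(values, quantised) <= err_threshold:
            return bits, quantised
    return max_bits, quantised               # threshold not reached; cap reached


def compress_bins(bins, group_size, min_magnitude, err_threshold):
    """Return one (wordlength, quantised values) pair per group of bins."""
    result = []
    for start in range(0, len(bins), group_size):
        group = bins[start:start + group_size]
        if max(abs(v) for v in group) < min_magnitude:
            result.append((0, [0.0] * len(group)))   # insignificant group zeroed
        else:
            result.append(gfeq_group(group, err_threshold))
    return result


bins = [0.001, -0.002, 0.0, 0.001, 1.0, -0.5, 0.25, 0.75]
groups = compress_bins(bins, group_size=4, min_magnitude=0.01, err_threshold=1e-4)
```

A tighter error threshold can only keep or increase the chosen wordlength, which matches the trade-off reported in the abstract: GFEQ gives lower compression ratios than the threshold-based GFTQ but also lower noise.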
Computational Approach to Track Beats in Improvisational Music Performance
Pub Date : 2021-01-24 DOI: 10.23919/Eusipco47968.2020.9287444
Xianghui Xie, Jared Houghtaling, K. Foubert, T. Waterschoot
Beat tracking, or identifying the temporal locations of beats in a musical recording, has a variety of applications that range from music information retrieval to machine listening. Algorithms designed to monitor the tempo of a musical recording have thus far been optimized for music with relatively stable rhythms, repetitive structures, and consistent melodies; these algorithms typically struggle to follow the free-form nature of improvisational music. Here, we present a multi-agent improvisation beat tracker (MAIBT) that addresses the challenges posed by improvisations and compare its performance with other state-of-the-art methods on a unique data set collected during improvisational music therapy sessions. This algorithm is designed for MIDI files and proceeds in four stages: (1) preprocessing to remove notes that are timid and overlapping, (2) clustering of the remaining notes and subsequent ranking of the clusters, (3) agent initialization and performance-based selection, and (4) artificial beat insertion and deletion to fill remaining beat gaps and create a comprehensive beat sequence. This particular method performs better than other generic beat-tracking approaches for music that lacks regularity; it is thus well suited to applications where unpredictability and inaccuracy are predominant, such as in music therapy improvisation.
Pages: 166-170
Citations: 0
One-Class based learning for Hybrid Spectrum Sensing in Cognitive Radio
Pub Date : 2021-01-24 DOI: 10.23919/Eusipco47968.2020.9287326
M. Jaber, A. Nasser, N. Charara, A. Mansour, K. Yao
The main aim of Spectrum Sensing (SS) in a Cognitive Radio system is to distinguish between the binary hypotheses H0: the Primary User (PU) is absent, and H1: the PU is active. In this paper, a Machine Learning (ML)-based hybrid SS scheme is proposed. The scattering of the Test Statistics (TSs) of two detectors is used in the learning and prediction phases. As the SS decision is binary, the proposed scheme requires learning only the boundaries of the H0 class in order to decide on the PU status: active or idle. Thus, a set of data generated under the H0 hypothesis is used to train the detection system. Accordingly, unlike the existing ML-based schemes in the literature, no PU statistical parameters are required. In order to discriminate between the H0 class and everything else, we use a one-class classification approach inspired by the Isolation Forest algorithm. Extensive simulations are carried out to investigate the efficiency of this hybrid SS scheme and the impact of the novelty detection model parameters on the detection performance. These simulations corroborate the efficiency of the proposed one-class learning of the hybrid SS system.
Pages: 1683-1686
Citations: 2
Audio-Visual Speech Classification based on Absent Class Detection
Pub Date : 2021-01-24 DOI: 10.23919/Eusipco47968.2020.9287615
G. D. Sad, J. Gómez
This paper introduces a novel method for audio-visual speech recognition that aims to minimize intra-class errors. It is built on Complementary Models, which are obtained through a novel training procedure. These models aim to detect the absence of a class, in contrast to traditional models, which aim to detect its presence. The proposed method is a cascade: traditional models are employed in the first stage, and the complementary models then make the final decision on the recognition results. Experimental results in all the scenarios evaluated (different input modalities, three databases, four classifiers, and noisy acoustic conditions) show that the proposed scheme achieves good performance. It also achieves better results than other methods reported in the literature on two public databases.
Pages: 336-340
Citations: 1