2022 30th European Signal Processing Conference (EUSIPCO): Latest Articles

FRISPEE: FRI-Based Single Image Super-Resolution with Deep Recursive Residual Network
Pub Date : 2022-08-29 DOI: 10.23919/eusipco55093.2022.9909646
Renke Wang, Jun-Jie Huang, P. Dragotti
In this paper, we propose a novel single image super-resolution algorithm that integrates a model-based approach with self-learning deep networks. The proposed method can be adapted to low-resolution (LR) images obtained with real acquisition devices where the point spread function is Gaussian-like. By modelling natural image lines as piecewise smooth functions and approximating the blurring kernel with B-splines, an intermediate high-resolution (HR) image can first be obtained based on Finite Rate of Innovation (FRI) theory. A self-supervised deep recursive residual network is then applied to further enhance the reconstruction quality. Simulation results show that our algorithm outperforms other self-learning algorithms and achieves state-of-the-art performance.
Citations: 0
A Projected Newton-type Algorithm for Rank-revealing Nonnegative Block-Term Tensor Decomposition
Pub Date : 2022-08-29 DOI: 10.23919/eusipco55093.2022.9909799
Eleftherios Kofidis, Paris V. Giampouras, A. Rontogiannis
The block-term tensor decomposition (BTD) model has been receiving increasing attention as a quite flexible way to capture the structure of 3-dimensional data that can be naturally viewed as the superposition of $R$ block terms of multilinear rank $(L_{r}, L_{r}, 1)$, $r=1,2,\ldots,R$. Versions with nonnegativity constraints, especially relevant in applications like blind source separation problems, have only recently been proposed, and they all share the need for a-priori knowledge of the number of block terms, $R$, and their individual ranks, $L_{r}$. Clearly, the latter requirement may severely limit their practical applicability. Building upon earlier work of ours on unconstrained BTD model selection and computation, we develop for the first time in this paper a method for nonnegative BTD approximation that is also rank-revealing. The idea is to impose column sparsity jointly on the factors and successively estimate the ranks as the numbers of factor columns of non-negligible magnitude. This is effected with the aid of nonnegative alternating iteratively reweighted least squares, implemented via projected Newton updates for increased convergence rate and accuracy. Simulation results are reported that demonstrate the effectiveness of our method in accurately estimating both the ranks and the factors of the nonnegative least squares BTD approximation.
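The rank-estimation idea in the abstract above (counting the factor columns of non-negligible magnitude) can be illustrated with a minimal sketch. It does not reproduce the paper's reweighted least squares or projected Newton updates; the function name `estimate_rank`, the relative threshold `rel_tol`, and the toy matrix are assumptions made only for illustration.

```python
import math

def estimate_rank(factor, rel_tol=1e-3):
    """Count columns whose Euclidean norm is non-negligible.

    `factor` is a matrix given as a list of rows. `rel_tol` is a
    hypothetical threshold, relative to the largest column norm.
    """
    n_cols = len(factor[0])
    norms = [math.sqrt(sum(row[j] ** 2 for row in factor)) for j in range(n_cols)]
    largest = max(norms)
    if largest == 0.0:
        return 0  # all columns vanished
    return sum(1 for n in norms if n > rel_tol * largest)

# Toy nonnegative factor whose third column has been driven close to zero.
A = [[1.0, 0.0, 1e-9],
     [2.0, 0.5, 0.0],
     [0.0, 1.5, 1e-9]]
rank = estimate_rank(A)  # two columns survive the threshold
```

A relative threshold is used here so that the count does not depend on the overall scale of the factor; how the threshold is actually chosen in the paper is not stated in the abstract.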
Citations: 1
Deep Unfolding in Multicell MU-MIMO
Pub Date : 2022-08-29 DOI: 10.23919/eusipco55093.2022.9909892
Lukas Schynol, M. Pesavento
The weighted sum-rate maximization in coordinated multicell MIMO networks with intra- and intercell interference and local channel state at the base stations is considered. Based on the concept of unrolling applied to the classical weighted minimum mean squared error (WMMSE) algorithm and ideas from graph signal processing, we present the GCN-WMMSE deep network architecture for transceiver design in multicell MU-MIMO interference channels with local channel state information. Similar to the original WMMSE algorithm, it facilitates a distributed implementation in multicell networks. However, GCN-WMMSE significantly accelerates the convergence and consequently alleviates the communication overhead in a distributed deployment. Additionally, the architecture is agnostic to different wireless network topologies while exhibiting a low number of trainable parameters and high efficiency w.r.t. training data.
Citations: 1
Non-Negative Kernel Graphs for Time-Varying Signals Using Visibility Graphs
Pub Date : 2022-08-29 DOI: 10.23919/eusipco55093.2022.9909594
Ecem Bozkurt, Antonio Ortega
We present a novel framework to represent sets of time-varying signals as dynamic graphs using the non-negative kernel (NNK) graph construction. We extend the original NNK framework to allow explicit delays as part of the graph construction, so that, unlike in NNK, two nodes can be connected with an edge corresponding to a non-zero time delay if there is higher similarity between the signals after shifting one of them. We also propose to characterize the similarity between signals at different nodes using the node degree and clustering coefficients of their respective visibility graphs. With graph edges that can represent temporal delays, we provide a new perspective that enables us to see the effect of synchronization in graph construction for time-series signals. For both temperature and EEG datasets, we show that our proposed approach can achieve sparse and interpretable graph representations. Furthermore, the proposed method can be useful in characterizing different EEG experiments using sparsity.
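For readers unfamiliar with visibility graphs, the sketch below builds one from a short time series and computes the node degrees mentioned in the abstract above. It assumes the standard natural visibility criterion (two samples are linked when every sample between them lies strictly below the straight line joining them); the abstract does not say which variant the authors use, and the function names and toy series are illustrative only.

```python
def visibility_graph(series):
    """Return the edge set of the natural visibility graph of `series`.

    Samples a < b are connected when every intermediate sample c lies
    strictly below the line segment joining (a, series[a]) and (b, series[b]).
    """
    n = len(series)
    edges = set()
    for a in range(n):
        for b in range(a + 1, n):
            visible = all(
                series[c] < series[b] + (series[a] - series[b]) * (b - c) / (b - a)
                for c in range(a + 1, b)
            )
            if visible:
                edges.add((a, b))
    return edges

def node_degrees(n, edges):
    """Degree of each of the n nodes for an undirected edge set."""
    deg = [0] * n
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg

signal = [3.0, 1.0, 2.0, 1.0, 3.0]
vg_edges = visibility_graph(signal)
vg_degrees = node_degrees(len(signal), vg_edges)
```

Adjacent samples are always connected because there is nothing between them to block the view, so the degree sequence mainly reflects how local peaks see each other across valleys.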
Citations: 0
An Open-Access System for Long-Range Chainsaw Sound Detection
Pub Date : 2022-08-29 DOI: 10.23919/eusipco55093.2022.9909629
N. Stefanakis, Konstantinos Psaroulakis, Nikonas Simou, Christos Astaras
A pipeline for automatic detection of chainsaw events in audio recordings is presented as a means to detect illegal logging activity in a protected natural environment. We propose a two-step process that consists of an activity detector at the front end and a deep neural network (DNN) classifier at the back end. At the front end, we use the Summation of Residual Harmonics method in order to detect patterns with harmonic structure in the audio recording. Active audio segments are subsequently fed to the classifier, which decides upon the absence or presence of a chainsaw event. As the acoustic feature, we propose the widely used amplitude spectrogram, passed through the recently proposed Per-Channel Energy Normalization (PCEN) process. Results based on real-field recordings illustrate that the proposed end-to-end system may efficiently detect low-SNR chainsaw events at a very low false detection rate.
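The PCEN step named in the abstract above can be sketched as follows, using the commonly published form: a first-order smoother M(t, f) = (1 - s) M(t-1, f) + s E(t, f) followed by the compression (E / (eps + M)^alpha + delta)^r - delta^r. The parameter values, the initialisation of the smoother with the first frame, and the function name `pcen` are assumptions for illustration and are not taken from the paper.

```python
def pcen(spec, s=0.025, alpha=0.98, delta=2.0, r=0.5, eps=1e-6):
    """Per-Channel Energy Normalization of a magnitude spectrogram.

    `spec` is a list of frames, each a list of non-negative band values.
    All parameter defaults are illustrative, not the paper's settings.
    """
    out = []
    smooth = list(spec[0])  # assumption: start the smoother at the first frame
    for frame in spec:
        # First-order IIR smoothing along time, independently per band.
        smooth = [(1 - s) * m + s * e for m, e in zip(smooth, frame)]
        # Adaptive gain control followed by root compression.
        out.append([(e / (eps + m) ** alpha + delta) ** r - delta ** r
                    for e, m in zip(frame, smooth)])
    return out

# A steady background followed by a sudden louder frame in one band.
steady_then_onset = [[1.0]] * 50 + [[10.0]]
normalized = pcen(steady_then_onset)
```

Because each band is divided by its own smoothed history, a steady background is flattened while a sudden onset stands out, which is the property that makes PCEN attractive for distant, low-SNR events.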
Citations: 0
Binaural Wind-Noise Tracking with Steering Preset
Pub Date : 2022-08-29 DOI: 10.23919/eusipco55093.2022.9909804
Stefan Thaleiser, G. Enzner
Optimal performance of many speech enhancement methods is bound to an accurate noise power-spectral density (PSD) estimation. While several approaches have proven to perform sufficiently well for stationary noise, such as white Gaussian or car noise, non-stationary noise types such as wind noise are more challenging. In the binaural setting and in multichannel systems, the speech-blocking method is essential to recent developments in non-stationary noise estimation. It critically requires information about the acoustic channel transfer function from source to listener. In this paper, we propose such a noise-subspace approach for wind-noise PSD estimation, which relies on data-driven blind channel identification in speech presence and on a-priori acoustic channel information (i.e., the steering preset) in speech pauses, where the smooth transition between the two is controlled by the a-priori SNR. The algorithm is designed for fully online operation based on the current noisy frame input. It improves on straightforward recursive subspace analysis and on established single-channel estimation in the wind-noise scenario, while also dealing well with speech presence and babble noise.
Citations: 0
Note-level Automatic Guitar Transcription Using Attention Mechanism
Pub Date : 2022-08-29 DOI: 10.23919/eusipco55093.2022.9909659
Sehun Kim, Tomoki Hayashi, T. Toda
We propose a method that effectively generates a note-level transcription from a guitar sound signal. In recent years, there have been many successful guitar transcription systems. However, most of them generate a frame-level transcription rather than a note-level transcription. Furthermore, it is usually difficult to effectively model long-term characteristics. To address these problems, we propose a novel model architecture using an attention mechanism along with a convolutional neural network (CNN). Our model is capable of modeling both short-term and long-term characteristics of a guitar sound signal and a corresponding guitar transcription. A beat-informed quantization is implemented to generate a note-level transcription. Furthermore, multi-task learning with frame-level and note-level estimations is also implemented to achieve robust training. We conducted experimental evaluations on our method using a publicly available acoustic guitar dataset. We confirmed that 1) the proposed method significantly outperforms the conventional method based on a CNN in frame-level estimation performance and that 2) the proposed method can also generate note-level guitar transcription while preserving high estimation performance.
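The abstract above does not describe how the beat-informed quantization works, so the following is only a generic sketch of the idea: note onsets and offsets in seconds are snapped to the nearest position on a grid derived from beat times. The function name `quantize_notes`, the `subdivisions` parameter, the tuple layout, and the minimum duration of one grid step are all hypothetical choices.

```python
def quantize_notes(notes, beat_times, subdivisions=4):
    """Snap (onset_sec, offset_sec, pitch) notes to a beat-derived grid.

    Returns (grid_index, duration_in_grid_steps, pitch) tuples.
    The grid splits each inter-beat interval into `subdivisions` equal steps.
    """
    grid = []
    for start, end in zip(beat_times, beat_times[1:]):
        step = (end - start) / subdivisions
        grid.extend(start + k * step for k in range(subdivisions))
    grid.append(beat_times[-1])

    def snap(t):
        # Index of the grid position closest to time t.
        return min(range(len(grid)), key=lambda i: abs(grid[i] - t))

    quantized = []
    for onset, offset, pitch in notes:
        i = snap(onset)
        j = max(snap(offset), i + 1)  # assumption: every note lasts at least one step
        quantized.append((i, j - i, pitch))
    return quantized

beats = [0.0, 0.5, 1.0]                       # hypothetical beat times in seconds
played = [(0.02, 0.27, 64), (0.49, 0.52, 67)]  # slightly off-grid notes
score = quantize_notes(played, beats, subdivisions=2)
```

Working in grid indices rather than seconds is what turns a frame-level estimate into something closer to a score, since durations become counts of rhythmic units.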
Citations: 2
Weighted Edit Distance for Country Code Recognition in License Plates
Pub Date : 2022-08-29 DOI: 10.23919/eusipco55093.2022.9909869
K. Chumachenko, Alexandros Iosifidis, M. Gabbouj
This paper presents the problem of country code recognition from license plate images. We propose an approach based on character detection and subsequent clustering for country code localization. We further propose three weighted Edit Distance metrics for country of origin prediction from imperfect detections, namely based on character similarity, detection confidence, and relative operation importance. Experimental results show the benefit of the proposed approaches on real-world data. The proposed method is lightweight and independent of the underlying object detector, facilitating its application on edge devices.
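One of the three weightings mentioned above, character similarity, can be illustrated with a small sketch of a weighted Levenshtein distance in which visually confusable characters are cheap to substitute. The confusable pairs, the cost values, the candidate code list, and the function names are assumptions for illustration; the paper's actual weights are not given in the abstract.

```python
def weighted_edit_distance(detected, code, sub_cost, ins_cost=1.0, del_cost=1.0):
    """Levenshtein distance with a caller-supplied substitution cost."""
    m, n = len(detected), len(code)
    dp = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dp[i][0] = dp[i - 1][0] + del_cost
    for j in range(1, n + 1):
        dp[0][j] = dp[0][j - 1] + ins_cost
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            dp[i][j] = min(
                dp[i - 1][j] + del_cost,                                # drop a detected char
                dp[i][j - 1] + ins_cost,                                # add a missed char
                dp[i - 1][j - 1] + sub_cost(detected[i - 1], code[j - 1]),
            )
    return dp[m][n]

# Hypothetical pairs of characters that a detector tends to confuse.
SIMILAR = {frozenset("O0"), frozenset("I1"), frozenset("B8")}

def char_sub_cost(a, b):
    if a == b:
        return 0.0
    return 0.2 if frozenset((a, b)) in SIMILAR else 1.0

def predict_country(detected, candidates):
    """Pick the candidate code closest to the detected string."""
    return min(candidates, key=lambda c: weighted_edit_distance(detected, c, char_sub_cost))

best = predict_country("F1N", ["FIN", "FRA", "EST"])
```

Weighting by detection confidence or by operation type would follow the same pattern: only the three cost terms inside the `min` change, while the dynamic program stays the same.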
Citations: 1
Design of Spatial Fast- and Slow-Time Waveforms and Receive Filter for MIMO Radar Space-Time Adaptive Processing
Pub Date : 2022-08-29 DOI: 10.23919/eusipco55093.2022.9909676
Chunxuan Shi, Yongzhe Li, R. Tao
In this paper, we study the joint design of transmit waveforms and receive filter for multiple-input multiple-output (MIMO) radar with space-time adaptive processing (STAP), wherein a complex environment that involves both clutter and jamming signals is considered. We choose to simultaneously design both the fast-time waveform and the slow-time coding among transmitted pulses, together with the adaptive processing at the receiver, which leads to a three-dimensional STAP for MIMO radar. Specifically, we maximize the signal-to-jammer-plus-clutter-plus-noise ratio at the output while ensuring the constant-modulus and similarity constraints for the waveform transmission. Based on this, we formulate the joint design as a non-convex optimization problem, and then recast it into a form that allows the application of the alternating direction method of multipliers to find its solution. Moreover, we propose an algorithm with fast convergence speed for this design, whose effectiveness is verified by simulations.
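As a small illustration of the constant-modulus constraint mentioned above, the sketch below projects a complex waveform onto the set of samples with fixed magnitude by keeping each phase and resetting each amplitude. This is a generic building block often used in waveform design, not the authors' algorithm; the function name and the choice of zero phase for zero-valued samples are assumptions.

```python
import cmath

def project_constant_modulus(waveform, modulus=1.0):
    """Nearest constant-modulus waveform: keep each phase, fix each magnitude."""
    projected = []
    for z in waveform:
        # A zero sample has no defined phase; zero phase is an arbitrary choice.
        phase = cmath.phase(z) if z != 0 else 0.0
        projected.append(cmath.rect(modulus, phase))
    return projected

x = [3 + 4j, -2j, 0j]
y = project_constant_modulus(x)
```

The similarity constraint, which keeps the designed waveform close to a reference waveform, is not handled by this sketch.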
Citations: 0
Binaural source localization using deep learning and head rotation information
Pub Date : 2022-08-29 DOI: 10.23919/eusipco55093.2022.9909764
Guillermo García-Barrios, D. Krause, A. Politis, A. Mesaros, J. Gutiérrez-Arriola, R. Fraile
This work studies learning-based binaural sound source localization under the influence of head rotation in reverberant conditions. Emphasis is on whether knowledge of head rotation can improve localization performance over the non-rotating case for the same acoustic scene. Simulations of binaural head signals of a static and rotating head were conducted for 5 different rotation speeds and a wide range of reverberant conditions. Several convolutional recurrent neural network models were evaluated, including a static head scenario, a model without rotation information, and distinct models differentiated by the way of manipulating the quaternions. The results were analyzed based on the direction-of-arrival error, and they show the importance of using quaternions as additional features, with the best localization accuracy obtained when using an additional convolutional branch that merges the features through addition or concatenation. Nevertheless, raw quaternion features presented lower performance than the static baseline model. Additionally, the study shows the importance of the analysis time window length when using information about head rotation.
{"title":"Binaural source localization using deep learning and head rotation information","authors":"Guillermo García-Barrios, D. Krause, A. Politis, A. Mesaros, J. Gutiérrez-Arriola, R. Fraile","doi":"10.23919/eusipco55093.2022.9909764","DOIUrl":"https://doi.org/10.23919/eusipco55093.2022.9909764","url":null,"abstract":"This work studies learning-based binaural sound source localization, under the influence of head rotation in rever-berant conditions. Emphasis is on whether knowledge of head rotation can improve localization performance over the non-rotating case for the same acoustic scene. Simulations of binaural head signals of a static and rotating head were conducted, for 5 different rotation speeds and a wide range of reverberant conditions. Several convolutional recurrent neural network mod-els were evaluated including a static head scenario, a model without rotation information, and distinct models differentiated on the way of manipulating the quaternions. The results were analyzed based on the direction-of-arrival error, and they show the importance of using quaternions as additional features, with the best localization accuracy obtained when using an additional convolutional branch that merges the features through addition or concatenation. Nevertheless, raw quaternion features presented lower performance than the static baseline model. 
Additionally, the study shows the importance of the analysis time window length when using information about head rotation.","PeriodicalId":231263,"journal":{"name":"2022 30th European Signal Processing Conference (EUSIPCO)","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124611906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1