Circuits, Systems and Signal Processing: Latest Articles

Squeeze-and-Excitation Self-Attention Mechanism Enhanced Digital Audio Source Recognition Based on Transfer Learning

IF 2.3 | Zone 3 (Engineering & Technology) | Q3 ENGINEERING, ELECTRICAL & ELECTRONIC

Circuits, Systems and Signal Processing

Pub Date : 2024-09-13 DOI: 10.1007/s00034-024-02850-8

Chunyan Zeng, Yuhao Zhao, Zhifeng Wang, Kun Li, Xiangkui Wan, Min Liu

Recent advances in digital audio source recognition, particularly within judicial forensics and intellectual property rights domains, have been significantly propelled by deep learning technologies. As these methods evolve, they introduce novel models and enhance processing capabilities crucial for audio source recognition research. Despite these advancements, the limited availability of high-quality labeled samples and the labor-intensive nature of data labeling remain substantial challenges. This paper addresses these challenges by exploring the efficacy of self-attention mechanisms, specifically through a novel neural network that integrates the Squeeze-and-Excitation (SE) self-attention mechanism for identifying recording devices. Our study not only demonstrates a relative improvement of approximately 1.5% in all four evaluation metrics over traditional convolutional neural networks but also compares the performance across two public datasets. Furthermore, we delve into the self-attention mechanism’s adaptability across different network architectures by embedding the Squeeze-and-Excitation mechanism within both residual and conventional convolutional network frameworks. Through ablation studies and comparative analyses, we reveal that the impact of self-attention mechanisms varies significantly with the underlying network architecture. Additionally, employing a transfer learning strategy has allowed us to leverage data from a baseline network with extensive samples, applying it to a smaller dataset to successfully identify 141 devices. This approach resulted in performance enhancements ranging from 4% to 7% across various metrics, highlighting the transfer learning method’s role in advancing digital audio source identification research. These findings not only validate the Squeeze-and-Excitation self-attention mechanism’s effectiveness in audio source recognition but also illustrate the broader applicability and benefits of incorporating advanced learning strategies in overcoming data scarcity and enhancing model adaptability.
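The abstract does not spell out the network, so the following is only a minimal sketch of the generic Squeeze-and-Excitation gate (global average pooling, a two-layer bottleneck, a sigmoid, then channel-wise rescaling). The function name `se_block`, the toy feature maps, and the weights `w1` and `w2` are illustrative assumptions, not values from the paper.

```python
import math

def se_block(feature_maps, w1, w2):
    """Apply a squeeze-and-excitation gate to C feature maps (each a 2-D list)."""
    # Squeeze: global average pooling gives one scalar per channel.
    z = [sum(sum(row) for row in fm) / (len(fm) * len(fm[0])) for fm in feature_maps]
    # Excitation, step 1: bottleneck fully connected layer followed by ReLU.
    hidden = [max(0.0, sum(w * zc for w, zc in zip(row, z))) for row in w1]
    # Excitation, step 2: expand back to C channels and squash with a sigmoid.
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Scale: reweight every channel by its gate.
    scaled = [[[g * v for v in row] for row in fm]
              for fm, g in zip(feature_maps, gates)]
    return scaled, gates

# Toy input: 2 channels of 2x2 activations, bottleneck width 1 (hypothetical weights).
maps = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
w1 = [[0.5, 0.5]]      # shape (1, 2): C=2 channels -> 1 hidden unit
w2 = [[0.0], [0.0]]    # shape (2, 1): zero weights give sigmoid(0) = 0.5
scaled, gates = se_block(maps, w1, w2)
print(gates)           # each channel is halved in this toy setting
```

In a trained network the gates differ per channel, which is what lets the block emphasise informative channels; the zero weights here are chosen only so the result is easy to check by hand.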

Citations: 0

Discrete-Time Delta-Sigma Modulator with Successively Approximating Register ADC Assisted Analog Feedback Technique

IF 2.3 | Zone 3 (Engineering & Technology) | Q3 ENGINEERING, ELECTRICAL & ELECTRONIC

Circuits, Systems and Signal Processing

Pub Date : 2024-09-12 DOI: 10.1007/s00034-024-02832-w

Hsin-Liang Chen, Hsiao-Hsing Chou, Hong-Ming Chiu, Hung-Chi Chang, Jen-Shiun Chiang

This paper proposes a delta-sigma modulator (DSM) for audio-band applications that combines low area cost with high resolution. The circuit is implemented with discrete-time switched-capacitor circuits and employs an assisted 6-bit successive approximation register (SAR) analog-to-digital converter (ADC) as the quantizer. Most importantly, the DSM and the SAR ADC share the resistive digital-to-analog converter (DAC), which improves efficiency and reduces chip layout cost. The chip area is only 0.096 mm² in a 0.18 µm 1P6M CMOS process. The modulator achieves a 96 dB dynamic range (DR), an 83.1 dB signal-to-noise-and-distortion ratio (SNDR), and a 93.4 dB signal-to-noise ratio (SNR) with a 25 kHz signal bandwidth and an oversampling ratio (OSR) of 64.
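The reported figures can be related with two textbook formulas. The sketch below is a worked calculation, not something stated in the abstract: it assumes the usual definition OSR = fs / (2 × bandwidth) and the standard full-scale-sine conversion ENOB = (SNDR − 1.76) / 6.02. The function names are hypothetical.

```python
def nyquist_oversampled_rate(bandwidth_hz, osr):
    """Sampling rate implied by OSR = fs / (2 * bandwidth)."""
    return 2.0 * bandwidth_hz * osr

def enob_from_sndr(sndr_db):
    """Effective number of bits for a full-scale sine: (SNDR - 1.76) / 6.02."""
    return (sndr_db - 1.76) / 6.02

fs = nyquist_oversampled_rate(25e3, 64)   # 3.2 MHz under the assumed OSR definition
enob = enob_from_sndr(83.1)               # roughly 13.5 bits
print(f"fs = {fs / 1e6:.1f} MHz, ENOB = {enob:.2f} bits")
```

Under these assumptions the modulator would be clocked at 3.2 MHz; the abstract itself does not give the clock rate.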

Citations: 0

Recursive Windowed Variational Mode Decomposition

IF 2.3 | Zone 3 (Engineering & Technology) | Q3 ENGINEERING, ELECTRICAL & ELECTRONIC

Circuits, Systems and Signal Processing

Pub Date : 2024-09-12 DOI: 10.1007/s00034-024-02864-2

Zhaoheng Zhou, Bingo Wing-Kuen Ling, Nuo Xu

The variational mode decomposition (VMD) and its variants aim to decompose a given signal into a set of narrow-band modes. The analysis of these modes is usually based on Fourier analysis. That is, in the existing algorithms for performing the VMD, the center frequencies of these modes are found during the iteration without exploiting the local time-varying information of the signal. To address this issue, this paper proposes a recursive windowed VMD (RWVMD) approach for performing the signal decomposition. First, a window is slid across the signal. Then, variational mode extraction is performed on each frame to obtain the first mode. Next, the difference between the signal and the first mode is computed to obtain the residual signal. The above process is repeated on the residual signal until the algorithm converges. The effectiveness of the RWVMD algorithm is demonstrated through numerical simulations. It is found that the center frequency in the time-frequency plane matches the characteristics of the original signal more accurately.

Citations: 0

Event-Triggered $H_{\infty}$ Filtering for A Class of Nonlinear Systems Under DoS Attacks

IF 2.3 | Zone 3 (Engineering & Technology) | Q3 ENGINEERING, ELECTRICAL & ELECTRONIC

Circuits, Systems and Signal Processing

Pub Date : 2024-09-06 DOI: 10.1007/s00034-024-02775-2

Weiguo Ma, Yuanqiang Zhou, Xin Lai, Furong Gao

This paper investigates event-triggered $H_{\infty}$ filtering for a class of discrete-time nonlinear systems subject to denial-of-service (DoS) attacks. Since the communication network in networked systems is vulnerable to malicious cyber-attacks, this paper models DoS attacks as a Bernoulli random variable, which results in a stochastic filtering error system. In addition, we use adaptive event-triggered communication to ensure that the least amount of information is transmitted over the network. For the filtering error system under the effect of event-triggered communication and DoS attacks, we provide sufficient conditions that guarantee stability and a prescribed $H_{\infty}$ performance, where the $H_{\infty}$ filter and the event-triggered parameters are co-designed using the linear matrix inequality approach. Finally, two illustrative examples are provided to demonstrate the effectiveness of the proposed method.
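To make the communication model concrete, here is a small simulation sketch of a scalar measurement stream sent through an event trigger and a lossy channel. It is a simplification under stated assumptions: the trigger uses a fixed relative threshold `sigma` (the paper's trigger is adaptive), packet loss is an independent Bernoulli draw with probability `dos_prob`, and the receiver holds its last value when nothing arrives. The name `simulate_link` and all numbers are hypothetical.

```python
import random

def simulate_link(measurements, sigma, dos_prob, seed=0):
    """Return what the remote filter sees, plus the number of transmissions."""
    rng = random.Random(seed)
    last_sent = measurements[0]   # assumption: the first sample is always delivered
    received = [last_sent]
    sent = 1
    for y in measurements[1:]:
        # Event trigger: transmit only if the change since the last
        # transmission is large relative to the current measurement.
        if (y - last_sent) ** 2 > sigma * y ** 2:
            sent += 1
            last_sent = y
            # Bernoulli DoS: the packet is lost with probability dos_prob.
            if rng.random() >= dos_prob:
                received.append(y)
                continue
        # No transmission, or the packet was jammed: hold the last received value.
        received.append(received[-1])
    return received, sent

samples = [1.0, 1.05, 1.1, 2.0, 2.02]
received, sent = simulate_link(samples, sigma=0.05, dos_prob=0.0)
print(received, sent)   # only the jump to 2.0 triggers a second transmission
```

Raising `sigma` reduces traffic at the cost of a staler signal at the filter, while raising `dos_prob` makes the held value more common; the paper's design conditions address both effects jointly.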

Citations: 0

Individually Weighted Modified Logarithmic Hyperbolic Sine Curvelet Based Recursive FLN for Nonlinear System Identification

IF 2.3 | Zone 3 (Engineering & Technology) | Q3 ENGINEERING, ELECTRICAL & ELECTRONIC

Circuits, Systems and Signal Processing

Pub Date : 2024-09-06 DOI: 10.1007/s00034-024-02839-3

Neetu Chikyal, Vasundhara, Chayan Bhar, Asutosh Kar, Mads Graesboll Christensen

Recently, the adaptive exponential functional link network (AEFLN), which combines exponential terms with a trigonometric functional expansion, has been introduced as a linear-in-the-parameters nonlinear filter. However, its performance degrades in the presence of non-Gaussian or impulsive noise. Therefore, to enhance the nonlinear modelling capability, a modified logarithmic hyperbolic sine cost function is combined with the adaptive recursive exponential functional link network. In conjunction with this, a sparsity constraint motivated by a curvelet-dependent notion is employed in the suggested approach. The paper thus presents an individually weighted modified logarithmic hyperbolic sine curvelet-based recursive exponential FLN (IMLSC-REF) for robust sparse nonlinear system identification. An individually weighted adaptation gain is applied to several coefficients of the nonlinear adaptive model to accelerate convergence. The weight update rule and the upper bound on the convergence factor are also derived. Extensive simulation studies demonstrate the effectiveness of the algorithm under varied nonlinearities and in identifying and modelling the physical path of the acoustic feedback phenomenon of a behind-the-ear (BTE) hearing aid.

Citations: 0

Secure and Imperceptible Frequency-Based Watermarking for Medical Images

IF 2.3 | Zone 3 (Engineering & Technology) | Q3 ENGINEERING, ELECTRICAL & ELECTRONIC

Circuits, Systems and Signal Processing

Pub Date : 2024-09-06 DOI: 10.1007/s00034-024-02814-y

Saadaoui Naima, Akram Zine Eddine Boukhamla, Zermi Narima, Khaldi Amine, Kafi Med Redouane, Aditya Kumar Sahu

Medical image security is a critical concern in the healthcare domain, and various watermarking techniques have been explored to embed imperceptible and secure data within medical images. This paper introduces an innovative frequency-based watermarking technique for medical images, utilizing the Fractional Discrete Cosine Transform (FDCT) and Schur decomposition to ensure robust and secure watermark embedding. The watermark bits are integrated by modulating the obtained Schur coefficients, thereby ensuring robust and secure watermarking without significantly altering the visual quality of the medical images. The experiments conducted on the ocular database demonstrate the capacity, imperceptibility, and robustness of the proposed method. This approach achieved a favorable trade-off between imperceptibility and information embedding capacity for ensuring the authenticity and integrity of medical images during transmission.

Citations: 0

An Online Two-Stage Classification Based on Projections

IF 2.3 | Zone 3 (Engineering & Technology) | Q3 ENGINEERING, ELECTRICAL & ELECTRONIC

Circuits, Systems and Signal Processing

Pub Date : 2024-09-06 DOI: 10.1007/s00034-024-02843-7

Aimin Song, Yan Wang, Shengyang Luan

Kernel-based online classification algorithms, such as the Perceptron, NORMA, and passive-aggressive, are renowned for their computational efficiency but have been criticized for slow convergence. However, the parallel projection algorithm, within the adaptive projected subgradient method framework, exhibits accelerated convergence and enhanced noise resilience. Despite these advantages, a specific sparsification procedure for the parallel projection algorithm is currently absent. Additionally, existing online classification algorithms, including those mentioned earlier, rely heavily on the kernel width parameter, rendering them sensitive to its choice. In an effort to bolster the performance of these algorithms, we propose a two-stage classification algorithm within the Cartesian product space of reproducing kernel Hilbert spaces. In the initial stage, we introduce an online double-kernel classifier with parallel projection. This design aims not only to improve convergence but also to address the sensitivity to kernel width. In the subsequent stage, the component with a larger kernel width remains fixed, while the component with a smaller kernel width undergoes updates. To promote sparsity and mitigate model complexity, we incorporate the projection-along-subspace technique. Moreover, for enhanced computational efficiency, we integrate the set-membership technique into the updates, selectively exploiting informative vectors to improve the classifier. The monotone approximation of the proposed classifier, based on the designed $\epsilon$-insensitive function, is presented. Finally, we apply the proposed algorithm to equalize a nonlinear channel. Simulation results demonstrate that the proposed classifier achieves faster convergence and lower misclassification error with comparable model complexity.
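As background for the baselines named above, here is a minimal sketch of a kernel Perceptron with a Gaussian kernel. It is not the proposed two-stage algorithm; it only illustrates the two issues the abstract raises, namely the dependence on the kernel width and the growth of the stored dictionary when no sparsification is applied. The class name, the XOR-style toy data, and the width of 0.5 are illustrative assumptions.

```python
import math

def gaussian_kernel(x, y, width):
    """Gaussian kernel exp(-||x - y||^2 / (2 * width^2))."""
    dist2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-dist2 / (2.0 * width ** 2))

class KernelPerceptron:
    def __init__(self, width):
        self.width = width
        self.support = []   # stored inputs (the dictionary)
        self.coeffs = []    # their signed coefficients

    def decision(self, x):
        return sum(a * gaussian_kernel(s, x, self.width)
                   for a, s in zip(self.coeffs, self.support))

    def update(self, x, label):
        """Process one sample with label in {-1, +1}; return True on a mistake."""
        if label * self.decision(x) <= 0.0:
            # Mistake-driven update: every error adds a new dictionary element.
            self.support.append(x)
            self.coeffs.append(float(label))
            return True
        return False

# XOR-style toy data, which no linear classifier can separate.
data = [((0.0, 0.0), -1), ((1.0, 1.0), -1), ((0.0, 1.0), 1), ((1.0, 0.0), 1)]
model = KernelPerceptron(width=0.5)
mistakes_per_pass = []
for _ in range(3):
    mistakes_per_pass.append(sum(model.update(x, y) for x, y in data))
print(mistakes_per_pass, len(model.support))
```

On this toy set the classifier stops making mistakes by the third pass but has stored every training point, which is the kind of growth that sparsification and set-membership updates are meant to limit.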

Citations: 0

SWAM-Net+: Selective Wavelet Attentive M-Network+ for Single Image Dehazing

IF 2.3 | Zone 3 (Engineering & Technology) | Q3 ENGINEERING, ELECTRICAL & ELECTRONIC

Circuits, Systems and Signal Processing

Pub Date : 2024-09-06 DOI: 10.1007/s00034-024-02837-5

Raju Nuthi, Srinivas Kankanala

Image dehazing is an ill-posed problem in low-level computer vision and has therefore attracted considerable research attention. Although many existing network pipelines work well, the key mechanism for improving dehazing performance remains unclear. To improve the performance of image dehazing networks, a hierarchical model named “Selective Attentive Wavelet M-Net+” (SWAM-Net+) is proposed. To enrich the features from the wavelet domain, a “Selective Wavelet Attentive Module” is introduced into M-Net+. Several key components of the network extract multiscale features through parallel multi-resolution convolution channels. Contextual information is collected using a dual attention unit, and the attention is based on multiscale feature aggregation. Summation and concatenation operations are replaced by a Selective Kernel Feature Fusing module to achieve feature aggregation. The network achieves better results on the RESIDE dataset both qualitatively and quantitatively.

Citations: 0

A Novel Flat Broadband Passive Circuit with Negative Group Delay

IF 2.3 | Zone 3 (Engineering & Technology) | Q3 ENGINEERING, ELECTRICAL & ELECTRONIC

Circuits, Systems and Signal Processing

Pub Date : 2024-09-06 DOI: 10.1007/s00034-024-02844-6

Aixia Yuan, Niannan Chang, Junzheng Liu, Yuwei Meng

A novel flat negative group delay (NGD) circuit is proposed that achieves a flatter negative group delay with reduced insertion loss. The circuit consists of a resistor R1 connected in series with a parallel combination of capacitor C1 and inductor L1, followed by a parallel combination of capacitor C2 and inductor L2, and finally connected in series with capacitor C3 and inductor L3. The analytical design equation is provided, and the effects of different component values on circuit flatness and bandwidth are analyzed. Following this design method, a flat negative group delay circuit is designed and fabricated. The simulation and measurement results are basically consistent. The circuit has good flat negative group delay characteristics, with a group delay fluctuation of 3%, a group delay value of −1.02 ns, and an insertion loss of 7 dB. The feasibility of the design method is verified.
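The quantity being designed for is the group delay, τ(ω) = −dφ/dω, where φ is the phase of the transmission coefficient. The sketch below evaluates it numerically for a generic textbook NGD cell, a parallel RLC tank placed in series between two matched 50 Ω ports. This is not the topology described in the abstract, and the component values are hypothetical; it only shows how a negative delay appears together with insertion loss.

```python
import cmath
import math

def transmission(omega, r, l, c, z0=50.0):
    """S21 of a parallel RLC tank placed in series between matched ports."""
    # Admittance of the parallel R-L-C tank, then its impedance.
    y_tank = 1.0 / r + 1.0 / (1j * omega * l) + 1j * omega * c
    z_tank = 1.0 / y_tank
    return 2.0 * z0 / (2.0 * z0 + z_tank)

def group_delay(omega, r, l, c, z0=50.0, rel_step=1e-4):
    """Group delay tau = -d(phase)/d(omega), estimated by a central difference."""
    h = omega * rel_step
    phase_hi = cmath.phase(transmission(omega + h, r, l, c, z0))
    phase_lo = cmath.phase(transmission(omega - h, r, l, c, z0))
    return -(phase_hi - phase_lo) / (2.0 * h)

# Hypothetical component values that resonate near 1 GHz.
R, L, C = 100.0, 1e-9, 25.33e-12
omega0 = 1.0 / math.sqrt(L * C)
tau0 = group_delay(omega0, R, L, C)
loss_db = -20.0 * math.log10(abs(transmission(omega0, R, L, C)))
print(f"group delay = {tau0 * 1e9:.2f} ns, insertion loss = {loss_db:.2f} dB")
```

For this cell the delay at resonance is approximately −2R²C / (2Z0 + R), so a more negative delay requires a larger R and therefore more loss; that trade-off is the reason flatness and insertion loss are reported together.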

Citations: 0

Acoustic Scene Classification Using Various Features and DNN Model: A Monolithic and Hierarchical Approach

IF 2.3 | Zone 3 (Engineering & Technology) | Q3 ENGINEERING, ELECTRICAL & ELECTRONIC

Circuits, Systems and Signal Processing

Pub Date : 2024-09-06 DOI: 10.1007/s00034-024-02836-6

Chandrasekhar Paseddula, Suryakanth V. Gangashetty

An acoustic scene is a complicated phenomenon, so it is difficult to extract scene-specific information from the foreground and background sound sources. More study is still required to accurately discern sound scenes and pinpoint distinct sound events in realistic soundscapes. Investigating a good feature representation is helpful for acoustic scene classification (ASC). This study investigated a few common acoustic features for ASC, including the mel-frequency cepstral coefficients (MFCC), log-mel band energy (LOGMEL), linear prediction cepstral coefficients (LPCC), and all-pole group delay (APGD). To represent acoustic scenes, we proposed a variety of features based on speaker/music recognition, including inverted mel-frequency cepstral coefficients, spectral centroid magnitude coefficients, sub-band spectral flux coefficients, and single frequency filtering cepstral coefficients. Using DNN classification models, we investigated how these features affect the classification of acoustic scenes in the DCASE 2017 dataset. Our analysis shows that no single feature performs better than the others for all acoustic scenarios. In general, it may be challenging for a single classifier to identify all the classes when there are more acoustic scenes. Therefore, we have proposed a two-level hierarchical classification approach, which first determines the meta-category of the acoustic scene and then performs the fine-grained classification within that meta-category. Without DNN score fusion at level 2 as post-processing, the hierarchical approach (81.0%) performed better than the monolithic classification approach (79.9%). The performance of the ASC system can be further improved by exploring more sophisticated complementary features. A monolithic system based on the fusion of MFCC and LOGMEL features achieved an accuracy of 90.5%. The proposed hierarchical system achieves an accuracy of 82.6% with DNN score fusion at level 2 as post-processing.
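The two-level decision described above can be sketched in a few lines. Everything concrete here is an assumption for illustration: the meta-categories and scene labels are made up rather than taken from the paper's taxonomy, the scores stand in for DNN outputs, and score fusion is shown as a plain average because the abstract does not say how the scores are fused.

```python
def argmax(scores):
    """Return the label with the highest score."""
    return max(scores, key=scores.get)

def fuse_scores(score_sets):
    """Late fusion: average class scores from several feature-specific models."""
    classes = score_sets[0].keys()
    return {c: sum(s[c] for s in score_sets) / len(score_sets) for c in classes}

def classify_hierarchical(meta_scores, fine_scores_by_meta):
    """Level 1 picks the meta-category; level 2 picks a scene within it."""
    meta = argmax(meta_scores)
    scene = argmax(fine_scores_by_meta[meta])
    return meta, scene

# Hypothetical posteriors from a level-1 model and two level-2 models.
meta_scores = {"indoor": 0.7, "outdoor": 0.3}
mfcc_scores = {"office": 0.6, "home": 0.4}
logmel_scores = {"office": 0.3, "home": 0.7}
fine = {
    "indoor": fuse_scores([mfcc_scores, logmel_scores]),
    "outdoor": {"park": 0.5, "beach": 0.5},
}
result = classify_hierarchical(meta_scores, fine)
print(result)   # the fused level-2 scores favour "home" over "office"
```

Note that a level-1 error cannot be recovered at level 2 in this scheme, which is one reason a hierarchical system does not automatically beat a strong monolithic one.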

Citations: 0
