Pub Date : 2026-02-23; DOI: 10.1109/OJSP.2026.3667079
Waleed Hilal;Alex McCafferty-Leroux;John Yawney;S. Andrew Gadsden
This paper proposes two novel filtering strategies as sub-optimal robust solutions for state estimation in systems affected by non-Gaussian noise, outliers, or modeling uncertainties. The moments-based Kalman filter (MKF) and moments-based innovation filter (MIF) replace the mean squared error criterion with a correntropy-based cost function that incorporates higher-order statistical moments of the innovation sequence. Through Taylor series expansion of the Gaussian kernel, correntropy inherently captures all even-order moments—including variance, kurtosis, and higher-order statistics—providing natural robustness to heavy-tailed and asymmetric noise distributions. An adaptive kernel bandwidth mechanism uses real-time estimates of innovation skewness and kurtosis to automatically balance efficiency and robustness. The MIF augments this framework with variable structure control theory, incorporating a saturation-based gain that bounds corrective action during large disturbances. Both methods employ fixed-point iteration with correntropy-weighted covariance matrices in their predictor-corrector algorithms. Mathematical derivations and stability proofs are provided for both filters. The approaches extend to nonlinear systems through first-order Taylor series linearization, yielding the extended MKF (EMKF) and extended MIF (EMIF). To validate their robustness relative to the conventional Kalman filter, the proposed methods are applied to both linear and nonlinear representations of a simulated electrohydrostatic actuator (EHA) experiencing leakage faults. Computational experiments demonstrate that the MKF and MIF achieve superior estimation accuracy compared to the KF under non-Gaussian conditions, more faithfully representing faulty system behavior.
Title: Robust Kalman Filtering via Correntropy-Based Higher-Order Moment Adaptation and Variable Structure Gains (IEEE Open Journal of Signal Processing, vol. 7, pp. 343–355)
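The correntropy mechanism this abstract describes can be illustrated with a small numeric sketch. This is a generic illustration of the correntropy criterion itself, not the authors' MKF/MIF implementation; the bandwidth value and the innovation model are assumptions for the toy example. The point it shows: the Gaussian kernel's Taylor expansion involves all even-order moments of the error, so heavy-tailed outliers receive nearly zero weight, unlike under the MSE criterion, which weights every sample equally.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy innovation sequence: Gaussian noise plus sparse heavy-tailed outliers.
innov = rng.normal(0.0, 1.0, 500)
innov[::100] += 15.0  # inject outliers at every 100th sample

sigma = 2.0  # kernel bandwidth (assumed value for illustration)

def gaussian_kernel(e, sigma):
    # Taylor expansion exp(-e^2/(2*sigma^2)) = sum_n (-1)^n e^(2n) / ((2*sigma^2)^n n!)
    # shows that correntropy aggregates all even-order moments of e.
    return np.exp(-e**2 / (2.0 * sigma**2))

# Empirical correntropy of the innovation sequence.
correntropy = gaussian_kernel(innov, sigma).mean()

# Correntropy-induced per-sample weights: outliers are down-weighted toward
# zero, whereas the MSE criterion would let them dominate the cost.
weights = gaussian_kernel(innov, sigma)
```

Under the MSE criterion, the injected outliers would contribute roughly 225/1 ≈ 200 times the cost of a nominal sample; under the correntropy weighting they contribute almost nothing, which is the robustness property the abstract invokes.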
Pub Date : 2026-02-20; DOI: 10.1109/OJSP.2026.3666822
Noga Bar;Raja Giryes
Large annotated datasets are crucial for the success of deep learning, but labeling data can be prohibitively expensive in domains such as medical imaging. This work tackles the subset selection problem: selecting a small set of the most informative examples from a large unlabeled pool for annotation. We propose a simple and effective method that combines feature norms, randomization, and orthogonality (via the Gram–Schmidt process) to select diverse and informative samples. Feature norms serve as a proxy for informativeness, while randomization and orthogonalization reduce redundancy and encourage coverage of the feature space. Extensive experiments on image and text benchmarks, including CIFAR-10/100, Tiny ImageNet, ImageNet, OrganAMNIST, and Yelp, show that our method consistently improves subset selection performance, both as a standalone approach and when integrated with existing techniques.
Title: Diverse Subset Selection via Norm-Based Sampling and Orthogonality (IEEE Open Journal of Signal Processing, vol. 7, pp. 333–342)
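The three ingredients named in the abstract (norm as an informativeness proxy, randomization, Gram–Schmidt orthogonalization) can be combined into a toy greedy selector. This is a hedged sketch under assumed details, not the authors' exact algorithm: here the randomization is norm-proportional sampling, and redundancy is removed by projecting all features onto the orthogonal complement of each chosen one.

```python
import numpy as np

def select_subset(features, k, rng=None):
    """Greedy norm-based selection with Gram-Schmidt de-redundancy (sketch)."""
    rng = rng or np.random.default_rng(0)
    X = features.astype(float).copy()
    chosen = []
    for _ in range(k):
        norms = np.linalg.norm(X, axis=1)
        norms[chosen] = 0.0  # never pick the same sample twice
        probs = norms / norms.sum()
        i = rng.choice(len(X), p=probs)  # randomization: norm-proportional draw
        chosen.append(i)
        u = X[i] / np.linalg.norm(X[i])
        # Gram-Schmidt step: remove the chosen direction from every feature,
        # so near-duplicates of sample i lose almost all of their norm.
        X = X - np.outer(X @ u, u)
    return chosen

# Toy pool: two tight clusters. After picking from one cluster, projection
# collapses its members to near-zero norm, so the second pick covers the other.
pts = np.vstack([np.tile([10.0, 0.0], (50, 1)), np.tile([0.0, 8.0], (50, 1))])
pts += 0.01 * np.random.default_rng(1).normal(size=pts.shape)
idx = select_subset(pts, 2)
```

The toy example makes the diversity effect visible: a pure norm criterion would pick two near-identical points from the larger-norm cluster, while the orthogonalization step forces coverage of both clusters.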
Radio frequency (RF) signal-based localization using modern cellular networks has emerged as a promising solution to accurately locate objects in challenging environments. One of the most promising solutions for situations involving obstructed-line-of-sight (OLoS) and multipath propagation is multipath-based simultaneous localization and mapping (MP-SLAM) that employs map features (MFs), such as virtual anchors. This paper presents an extended MP-SLAM method that is augmented with a global map feature (GMF) repository. This repository stores consistent MFs of high quality that are collected during prior traversals. We integrate these GMFs back into the MP-SLAM framework via a probability hypothesis density (PHD) filter, which propagates GMF intensity functions over time. Extensive simulations, together with a challenging real-world experiment using LTE RF signals in a dense urban scenario with severe multipath propagation and inter-cell interference, demonstrate that our framework achieves robust and accurate localization, thereby showcasing its effectiveness in realistic modern cellular networks such as 5G or future 6G networks. It outperforms conventional proprioceptive sensor-based localization and conventional MP-SLAM methods, and achieves reliable localization even under adverse signal conditions.
Title: Robust Localization in Modern Cellular Networks Using Global Map Features
Authors: Junshi Chen;Xuhong Li;Russ Whiton;Erik Leitinger;Fredrik Tufvesson
Pub Date : 2026-02-16; DOI: 10.1109/OJSP.2026.3665385 (IEEE Open Journal of Signal Processing, vol. 7, pp. 356–372)
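The PHD-filter machinery the abstract relies on can be sketched with a minimal 1-D grid example. This is a generic textbook-style PHD predict/update over an intensity function, with all numbers (survival/detection probabilities, clutter level, grid) assumed for illustration; it is not the paper's GMF-repository integration. The integral of the intensity is the expected number of map features, which is the quantity the filter propagates.

```python
import numpy as np

x = np.linspace(0.0, 10.0, 201)            # 1-D state grid
dx = x[1] - x[0]
D = np.exp(-0.5 * ((x - 3.0) / 0.5) ** 2)  # prior intensity: one feature near x=3

p_s, p_d = 0.99, 0.9              # survival and detection probabilities (assumed)
clutter = 0.01                    # clutter intensity kappa(z) (assumed)
birth = 0.001 * np.ones_like(x)   # birth intensity (assumed)

# Predict: surviving mass plus births (static features, so no motion blur here).
D_pred = p_s * D + birth

def likelihood(z, xg, sigma=0.3):
    return np.exp(-0.5 * ((z - xg) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Update with one measurement z = 3.1.
z = 3.1
g = likelihood(z, x)
denom = clutter + np.sum(p_d * g * D_pred) * dx
D_upd = (1.0 - p_d) * D_pred + p_d * g * D_pred / denom

# Expected number of features = integral of the posterior intensity.
n_est = np.sum(D_upd) * dx
```

The update leaves the intensity peaked near the measurement and keeps the expected feature count close to one, which is how a PHD filter maintains belief over a variable number of map features without explicit data association.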
Pub Date : 2026-02-13; DOI: 10.1109/OJSP.2026.3664335
Keitaro Yamashita;Kazuki Naganuma;Shunsuke Ono
This paper proposes a method for vertex-wise aggregation sampling of a broad class of graph signals, designed to attain the best possible recovery under generalized sampling theory. This is achieved by designing the sampling operator through an optimization problem that is inherently non-convex, since best-possible recovery imposes a rank constraint. An existing method for vertex-wise aggregation sampling can control the number of active vertices but cannot incorporate prior knowledge of mandatory or avoided vertices. To address these challenges, we formulate the operator design as a problem that handles both a constraint on the number of active vertices and prior knowledge about specific vertices (mandatory inclusion or exclusion). We transform this constrained problem into a difference-of-convex (DC) optimization problem by using the nuclear norm and a DC penalty for vertex selection. To solve it, we develop a convergent solver based on the general double-proximal gradient DC algorithm. The effectiveness of our method is demonstrated through experiments on various graph signal models, including real-world data, showing superior recovery accuracy compared to existing methods.
Title: Sampling Method for Generalized Graph Signals With Pre-Selected Vertices via DC Optimization (IEEE Open Journal of Signal Processing, vol. 7, pp. 314–323)
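The DC-optimization machinery the abstract invokes can be illustrated on a scalar toy problem (purely illustrative, unrelated to the paper's operator design). Writing f(x) = g(x) - h(x) with both g and h convex, the standard DC algorithm linearizes the concave part -h at the current iterate and minimizes the convex surrogate, which here has a closed form.

```python
# Toy DC algorithm on f(x) = x^4 - x^2, split as g(x) = x^4 (convex) minus
# h(x) = x^2 (convex). Each iteration solves
#   x_{k+1} = argmin_x  x^4 - 2 * x_k * x,
# whose stationarity condition 4 x^3 = 2 x_k gives x = (x_k / 2)^(1/3).
x = 2.0
for _ in range(100):
    x = (x / 2.0) ** (1.0 / 3.0)

# The fixed point satisfies x^3 = x / 2, i.e. x^2 = 1/2, which is exactly the
# positive global minimizer of x^4 - x^2 (where 4x^3 - 2x = 0).
```

Each surrogate minimization is convex and the objective decreases monotonically, which is the convergence property the paper's double-proximal gradient DC solver generalizes to the constrained matrix setting.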
Pub Date : 2026-02-13; DOI: 10.1109/OJSP.2026.3664271
Lenaïg Guého;Henrique Lefundes da Silva;Cyril Plapous;Laurent Bougrain;Patrick Hénaff;Rozenn Nicol
In this paper, the use of non-sinusoidal amplitude-modulated stimuli is assessed for Brain-Computer Interfaces (BCIs) based on Steady-State Auditory Evoked Potentials (SSAEPs). Three different stimuli are compared to the frequently used 1-kHz pure tone: Brownian noise, cicada song and cat's purr. While these alternative sounds are intended to be more pleasant for listeners, they may impact the detectability of the modulation frequency in ElectroEncephaloGraphic (EEG) signals. Stimuli are equalized in loudness using a Head And Torso Simulator (HATS). The experiment is conducted at two loudness levels (50 and 56 phons), with 24 subjects participating in each condition. Hearing capacity is assessed prior to the experiment, using an audiometry test and questionnaires. For each stimulus, detection is performed using 10 different classifiers: linear discriminant analysis, deep learning networks, and Riemannian classifiers including tangent space-based algorithms. The latter consistently outperformed the alternative approaches. The pure tone provides the highest detection accuracy (above 83%), whereas the cicada song achieves only 60%. Classification using the proposed models fails for Brownian noise and the cat's purr, with accuracy at chance level. Additionally, increasing the loudness of the stimuli does not enhance the detectability of the modulation frequency for any stimulus. The amplitude modulation, frequency content and temporal characteristics of the stimuli are further analyzed to explain these differences.
Title: Alternatives to Sine Carrier in Auditory BCI: Exploring Machine Learning Strategies for Assessing Modulation Detectability in EEG (IEEE Open Journal of Signal Processing, vol. 7, pp. 324–332)
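The tangent-space step behind the best-performing classifiers above can be sketched in a few lines. This is the generic Riemannian-geometry recipe for EEG covariance features, not the paper's exact pipeline, and the toy data (2-channel trials differing in one channel's variance) is an assumption: each trial covariance C is mapped to log(C_ref^(-1/2) C C_ref^(-1/2)) and vectorized, after which any linear classifier (e.g. LDA) applies.

```python
import numpy as np

def spd_logm(C):
    """Matrix logarithm of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(C)
    return (V * np.log(w)) @ V.T

def spd_invsqrtm(C):
    """Inverse matrix square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(C)
    return (V * (1.0 / np.sqrt(w))) @ V.T

def tangent_features(covs, C_ref):
    """Project SPD covariances to the tangent space at C_ref and vectorize."""
    W = spd_invsqrtm(C_ref)
    feats = []
    for C in covs:
        S = spd_logm(W @ C @ W)          # log-map at the reference point
        feats.append(S[np.triu_indices_from(S)])
    return np.array(feats)

# Toy trials: 2-channel signals, two "classes" with different second-channel
# variance (0.5 vs 2.0), mimicking a detectable vs undetectable SSAEP response.
rng = np.random.default_rng(0)
covs = [np.cov(rng.normal(size=(2, 256)) * np.array([[1.0], [s]]))
        for s in (0.5, 0.5, 2.0, 2.0)]
C_ref = np.mean(covs, axis=0)   # arithmetic mean as a simple reference point
F = tangent_features(covs, C_ref)
```

In the tangent space the trials become ordinary Euclidean vectors, with same-class trials clustering together, which is why plain linear classifiers work well on these features.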
Low-light image enhancement (LLIE) aims to restore the visual quality of poorly illuminated images by recovering fine details and textures while suppressing noise and artifacts. Recently, diffusion models have shown superior generative capabilities for LLIE. However, existing diffusion-based methods condition the denoising process only on low-light images or on features derived from them (e.g., structural or illumination maps). Since the low-light images are severely degraded, this limits the denoising model's ability to restore fine structure and suppress artifacts. In this work, we show that event data captured simultaneously with the low-light images provides complementary high-dynamic-range, high-temporal-resolution structural information that can overcome this limitation. We therefore propose EcDiff-LLIE, a novel event-conditional diffusion framework for LLIE. At its core, we introduce a multimodality denoising network that conditions on both low-light images and concurrent event streams. To effectively fuse the two modalities, we design a cross-modality attention block that bridges their domain differences while also enabling long-range dependency modeling for improved structural preservation. Experiments on the synthetic SDSD and real-world SDE datasets show significant improvements in quantitative evaluation metrics. Evaluation on the high-resolution real-world HUE dataset further demonstrates the generalization ability of the proposed framework.
Title: EcDiff-LLIE: Event-Conditional Diffusion Model for Structure-Preserving Low-Light Image Enhancement
Authors: Ramna Maqsood;Paulo Nunes;Luís Ducla Soares;Caroline Conti
Pub Date : 2026-02-06; DOI: 10.1109/OJSP.2026.3662627 (IEEE Open Journal of Signal Processing, vol. 7, pp. 266–275)
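A cross-modality attention block of the kind described above can be sketched with standard attention primitives. This is a generic fusion layer with hypothetical dimensions, not the paper's exact block: image tokens act as queries and event tokens as keys/values, so event-derived structure guides the image branch, and full attention over all positions supplies the long-range dependency modeling.

```python
import torch
import torch.nn as nn

class CrossModalityAttention(nn.Module):
    """Minimal cross-attention fusion sketch (hypothetical sizes)."""

    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, img_tokens, event_tokens):
        # Query: image tokens; key/value: event tokens. Each image position can
        # attend to every event position, bridging the two modalities.
        fused, _ = self.attn(img_tokens, event_tokens, event_tokens)
        return self.norm(img_tokens + fused)  # residual keeps the image path

block = CrossModalityAttention()
img = torch.randn(2, 196, 64)   # batch x tokens x channels (image features)
ev = torch.randn(2, 196, 64)    # concurrent event-stream features
out = block(img, ev)
```

The residual connection means the block degrades gracefully when event features are uninformative: the image path passes through unchanged up to normalization.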
Pub Date : 2026-01-28; DOI: 10.1109/OJSP.2026.3657696
Jaeho Park;Yong-Yeon Jo;Jong-Hwan Jang;Jin Yu;Joon-myoung Kwon;Junho Song
Electrocardiograms (ECGs) remain widely archived as paper ECG charts. In the standard 12-lead paper ECG layout, each lead shows only a 2.5-second visible segment. Digitized charts are therefore incomplete: most of the 10-second recording is invisible, leaving them misaligned with the digital standard required by ECG-AI models. Previous work has attempted to recover these invisible segments but has shown markedly lower performance on them than on visible segments. We propose the Visible Context Propagation (VCP) architecture, an extension of ECGrecover, which leverages the quasi-periodic structure of ECGs and employs cross-attention to propagate contextual information from visible to invisible segments. Our model consistently outperformed ECGrecover, the strongest baseline, reducing RMSE by 32.4% overall, including 12.0% on invisible segments. Beyond recovery accuracy, evaluations on downstream ECG applications demonstrated that recovered ECGs achieved performance comparable to raw ECGs in both diagnostic classification and ECG feature measurement. These results highlight the effectiveness of explicitly modeling visible-to-invisible context propagation and establish VCP as a robust solution for recovering incomplete paper-based ECGs, enabling reliable surrogates for clinical and analytical use.
Title: VCP: Visible Context Propagation for Electrocardiogram Recovery (IEEE Open Journal of Signal Processing, vol. 7, pp. 185–194)
Pub Date : 2026-01-28; DOI: 10.1109/OJSP.2026.3659053
Manu Harju;Frederic Font;Annamaria Mesaros
Self-labeling is a method to simultaneously learn representations and classes using unlabeled data. The naive approach to self-labeling leads to a degenerate solution, and the model-generated labels require regularization to serve as useful training targets. In this work, we adapt a self-labeling method using optimal transport to the audio domain using the FSD50K dataset. We analyze the structure of the learned representations and compare the emergent classes with the reference annotations. We compare the learned representations with the ones produced using Bootstrap Your Own Latent for Audio (BYOL-A) across several downstream tasks. Our findings indicate that the method learns to group perceptually similar sounds without supervision. The results show that the method is a viable approach for audio representation learning, and that the learned embeddings are as effective for downstream tasks as the ones obtained with the benchmark method. As an additional outcome, the generated classifications give valuable insight into what the model learns, promoting explainability in feature learning.
Title: Self-Labeling Sounds Using Optimal Transport (IEEE Open Journal of Signal Processing, vol. 7, pp. 116–124)
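The optimal-transport regularization mentioned above can be sketched with a Sinkhorn-Knopp iteration. This is a generic sketch in the spirit of optimal-transport self-labeling methods (e.g. SeLa-style balancing), not this paper's exact procedure, and the temperature and iteration count are assumed values: the model's class scores are rescaled so that every sample carries unit label mass and every class receives an equal share, which rules out the degenerate all-one-class solution that naive self-labeling produces.

```python
import numpy as np

def sinkhorn_labels(scores, n_iter=50, eps=0.5):
    """Balanced soft pseudo-labels via Sinkhorn-Knopp (illustrative sketch)."""
    Q = np.exp(scores / eps)   # positive "transport" matrix from class scores
    Q /= Q.sum()
    n, k = Q.shape
    for _ in range(n_iter):
        Q /= Q.sum(axis=1, keepdims=True)   # each sample gets total mass 1/n
        Q /= n
        Q /= Q.sum(axis=0, keepdims=True)   # each class gets total mass 1/k
        Q /= k
    return Q * n   # rescale so each row is a soft label distribution

rng = np.random.default_rng(0)
scores = rng.normal(size=(100, 5))   # toy model outputs: 100 clips, 5 classes
Q = sinkhorn_labels(scores)
```

After the iteration, each row of `Q` is a usable soft training target and every class receives the same total mass (here 100/5 = 20 samples' worth), which is the equipartition constraint that keeps the learned classes from collapsing.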
This article revises informed independent vector extraction (iIVE) as a framework for connecting model-based blind source extraction (BSE) with deep learning. We introduce the contrast function for iIVE, which is derived by extending IVE with beamforming-based constraints, enabling an interpretable use of reference signals. We also show that structured mixing models implementing physical knowledge can be integrated, which is demonstrated by two far-field models. With the contrast functions, rapidly converging second-order algorithms are developed, whose performance is first verified through simulations. In the experimental part, we refine iIVE by training models containing unrolled iterations of the developed algorithm. The resulting structures achieve performance comparable to state-of-the-art networks while requiring two orders of magnitude fewer trainable parameters and exhibiting strong generalization to unseen conditions.
Title: From Informed Independent Vector Extraction to Hybrid Architectures for Target Source Extraction
Authors: Zbyněk Koldovský;Jiří Málek;Martin Vrátný;Tereza Vrbová;Jaroslav Čmejla;Stephen O'Regan
Pub Date : 2026-01-26; DOI: 10.1109/OJSP.2026.3657698 (IEEE Open Journal of Signal Processing, vol. 7, pp. 195–212)
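The beamforming-based constraints the abstract refers to can be illustrated with a minimal MVDR beamformer (a generic sketch of the underlying idea, not the paper's iIVE contrast function; the steering vector and noise level are assumed toy values): w = R^{-1} a / (a^T R^{-1} a) passes the reference direction a with unit gain while minimizing output power from everything else.

```python
import numpy as np

rng = np.random.default_rng(0)

a = np.array([1.0, 0.8, 0.6])      # hypothetical steering/reference vector
s = rng.normal(size=2000)          # target source signal
n = rng.normal(size=(3, 2000))     # sensor noise
X = np.outer(a, s) + 0.1 * n       # 3-sensor mixture observed at the array

R = X @ X.T / X.shape[1]           # sample mixture covariance
w = np.linalg.solve(R, a)          # R^{-1} a
w /= a @ w                         # distortionless constraint: a^T w = 1
y = w @ X                          # extracted source estimate
```

Because the constraint pins the target's gain to one, minimizing output power can only suppress the other components; this interpretable use of a reference signal is what the iIVE contrast function builds into the blind-extraction framework.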
Pub Date : 2026-01-23; DOI: 10.1109/OJSP.2026.3657142
Sajad Shirali-Shahreza;Gerald Penn
One of the main goals of Text-To-Speech systems is to generate natural speech. Therefore, a major evaluation criterion of TTS outputs is their naturalness, usually measured through a Mean Opinion Score (MOS). Naturalness is not a well-defined property, however. This paper decomposes naturalness into eight specific dimensions, based on how judges define the term. We then evaluate the outputs of the systems submitted to the Blizzard 2025 Challenge based on these new dimensions and compare the results with alternative evaluations of naturalness. This includes recent subjective human evaluations that were performed by the 2025 Blizzard Challenge organizers, as well as various automatic MOS methods. Based on this analysis, we propose to use five dimensions in place of a single, primitive notion of naturalness: Clarity, Fluency, Human-vs-Computer, Pronunciation, and Understandability. We propose that these would serve as the basis of a better evaluation framework for advanced TTS systems.
Title: Better Naturalness Evaluation of TTS Systems (IEEE Open Journal of Signal Processing, vol. 7, pp. 296–304)