
Latest Publications from IEEE Journal of Selected Topics in Signal Processing

Guest Editorial: IEEE JSTSP Special Issue on Deep Multimodal Speech Enhancement and Separation (DEMSES)
IF 8.7, CAS Tier 1 (Engineering & Technology), JCR Q1 (Engineering, Electrical & Electronic). Pub Date: 2025-06-27. DOI: 10.1109/JSTSP.2025.3570397
Amir Hussain;Yu Tsao;John H.L. Hansen;Naomi Harte;Shinji Watanabe;Isabel Trancoso;Shixiong Zhang
IEEE JSTSP, vol. 19, no. 4, pp. 596-599. Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11054321
Citations: 0
IEEE Signal Processing Society Information
IF 8.7, CAS Tier 1 (Engineering & Technology), JCR Q1 (Engineering, Electrical & Electronic). Pub Date: 2025-06-27. DOI: 10.1109/JSTSP.2025.3570405
IEEE JSTSP, vol. 19, no. 4, p. C3. Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11054323
Citations: 0
Variational Bayesian Channel Estimation and Data Detection for Cell-Free Massive MIMO With Low-Resolution Quantized Fronthaul Links
IF 13.7, CAS Tier 1 (Engineering & Technology), JCR Q1 (Engineering, Electrical & Electronic). Pub Date: 2025-06-19. DOI: 10.1109/JSTSP.2025.3579644
Sajjad Nassirpour;Toan-Van Nguyen;Hien Quoc Ngo;Le-Nam Tran;Tharmalingam Ratnarajah;Duy H. N. Nguyen
We study the joint channel estimation and data detection (JED) problem in cell-free massive MIMO (CF-mMIMO) networks, where access points (APs) forward signals to a central processing unit (CPU) over fronthaul links. Due to bandwidth limitations of these links, especially with a growing number of users, efficient processing becomes challenging. To address this, we propose a variational Bayesian (VB) inference-based method for JED that accommodates low-resolution quantized signals from APs. We consider two approaches: quantization-and-estimation (Q-E) and estimation-and-quantization (E-Q). In Q-E, each AP directly quantizes its received signals before forwarding them to the CPU. In E-Q, each AP first estimates channels locally during the pilot phase, then sends quantized versions of both the local channel estimates and received data to the CPU. The final JED process in both Q-E and E-Q is performed at the CPU. We evaluate our proposed approach under perfect fronthaul links (PFL) with unquantized received signals, Q-E, and E-Q, using symbol error rate (SER), channel normalized mean squared error (NMSE), computational complexity, and fronthaul signaling overhead as performance metrics. Our methods are benchmarked against both linear and nonlinear state-of-the-art JED techniques. Numerical results demonstrate that our VB-based approaches consistently outperform the linear baseline by leveraging the nonlinear VB framework. They also surpass existing nonlinear methods due to: i) a fully VB-driven formulation, which performs better than hybrid schemes such as VB combined with expectation maximization; and ii) the stability of our approach under correlated channels, where competing methods may fail to converge or experience performance degradation.
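To make the two fronthaul strategies concrete, the following minimal NumPy sketch contrasts Q-E and E-Q under a toy uniform low-resolution quantizer and a least-squares channel estimator; the array sizes, quantizer, and estimator are illustrative assumptions, not the paper's VB-based JED algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def uniform_quantize(x, bits=3, clip=3.0):
    """Toy low-resolution quantizer applied separately to real and imaginary parts."""
    levels = 2 ** bits
    step = 2 * clip / levels
    def q(v):
        return np.clip(np.round(v / step - 0.5), -levels // 2, levels // 2 - 1) * step + step / 2
    return q(x.real) + 1j * q(x.imag)

M, K, Tp = 4, 2, 8                                      # APs, users, pilot length (illustrative sizes)
H = (rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))) / np.sqrt(2)
P = np.exp(2j * np.pi * rng.random((K, Tp)))            # unit-modulus pilot sequences
N = 0.1 * (rng.standard_normal((M, Tp)) + 1j * rng.standard_normal((M, Tp)))
Y = H @ P + N                                           # pilot signals received at the APs

# Q-E: each AP quantizes its raw received pilots and forwards them; the CPU estimates.
H_hat_qe = uniform_quantize(Y) @ P.conj().T @ np.linalg.inv(P @ P.conj().T)

# E-Q: each AP first estimates its channels locally, then forwards quantized estimates.
H_local = Y @ P.conj().T @ np.linalg.inv(P @ P.conj().T)
H_hat_eq = uniform_quantize(H_local)

for name, H_hat in (("Q-E", H_hat_qe), ("E-Q", H_hat_eq)):
    nmse = np.linalg.norm(H_hat - H) ** 2 / np.linalg.norm(H) ** 2
    print(f"{name}: channel NMSE = {nmse:.3f}")
```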
IEEE JSTSP, vol. 19, no. 6, pp. 1187-1202.
Citations: 0
Automatic Detection of Articulatory-Based Disfluencies in Primary Progressive Aphasia
IF 13.7, CAS Tier 1 (Engineering & Technology), JCR Q1 (Engineering, Electrical & Electronic). Pub Date: 2025-06-16. DOI: 10.1109/JSTSP.2025.3579972
Jiachen Lian;Xuanru Zhou;Chenxu Guo;Zongli Ye;Zoe Ezzes;Jet M.J. Vonk;Brittany Morin;David Baquirin;Zachary Miller;Maria Luisa Gorno-Tempini;Gopala Krishna Anumanchipalli
Speech corpora are collections of textual data derived from human verbal output and speech signals that can be processed from a variety of perspectives, including formal or semantic content, to serve analyses of different levels of linguistic organisation (phonemic, morphosyntactic, lexico-semantic and content information, prosody and intonation) and to serve analyses of important phenomena such as speech fluency and errors (non-fluencies). We focus on transcribing speech along with non-fluencies or dysfluencies, the detection of which plays an important role in the diagnosis of primary progressive aphasia, where we specifically examine articulation-based dysfluencies in nfvPPA speech. In this work, we propose SSDM 2.0, which is built on top of the current state-of-the-art dysfluency detection system [1] and tackles its shortcomings via five main contributions: (1) We propose a novel Neural Articulatory Flow for deriving highly scalable, dysfluency-aware speech representations. (2) We develop a full-stack connectionist subsequence aligner to capture all major dysfluency types. (3) We introduce a mispronunciation prompt pipeline and consistency learning into LLMs to enable in-context dysfluency learning. (4) We curate and open-source Libri-Co-Dys (Lian et al., 2024), the largest co-dysfluency corpus to date. (5) We also present SSDM-L, a modular, non-end-to-end, lightweight model designed for clinical deployment. In clinical experiments on pathological speech transcription, we tested SSDM 2.0 using an nfvPPA corpus primarily characterized by articulatory dysfluencies. Overall, SSDM 2.0 outperforms SSDM and all other dysfluency transcription models by a large margin.
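The paper's connectionist subsequence aligner is a learned neural component; purely as an illustration of the underlying idea, the hedged sketch below uses a plain edit-distance dynamic program to align a hypothesized phone sequence against the intended reference and flags phones the reference cannot absorb as candidate insertions/repetitions. The function name and phone labels are hypothetical.

```python
# Plain edit-distance alignment used only to illustrate subsequence alignment for
# dysfluency labelling; SSDM 2.0's aligner is a learned (connectionist) module.
def label_dysfluencies(reference, hypothesis):
    n, m = len(reference), len(hypothesis)
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        cost[i][0] = i
    for j in range(m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (reference[i - 1] != hypothesis[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)
    # Backtrack: hypothesis phones consumed without advancing the reference are
    # flagged as insertions/repetitions, i.e. candidate dysfluencies.
    labels, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + (reference[i - 1] != hypothesis[j - 1]):
            tag = "ok" if reference[i - 1] == hypothesis[j - 1] else "substitution"
            labels.append((hypothesis[j - 1], tag))
            i, j = i - 1, j - 1
        elif j > 0 and cost[i][j] == cost[i][j - 1] + 1:
            labels.append((hypothesis[j - 1], "insertion/repetition"))
            j -= 1
        else:
            i -= 1                     # phone deleted in the hypothesis; nothing to label
    return list(reversed(labels))

reference = ["p", "l", "iy", "z"]                       # intended phones ("please")
hypothesis = ["p", "p", "l", "iy", "iy", "z"]           # stuttered /p/, prolonged /iy/
print(label_dysfluencies(reference, hypothesis))
```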
IEEE JSTSP, vol. 19, no. 5, pp. 810-826.
Citations: 0
Long-Range and Non-Stationary Encoding for Dysarthric Speech Data Augmentation
IF 13.7, CAS Tier 1 (Engineering & Technology), JCR Q1 (Engineering, Electrical & Electronic). Pub Date: 2025-04-18. DOI: 10.1109/JSTSP.2025.3562417
Daipeng Zhang;Hongcheng Zhang;Wenhuan Lu;Wei Li;Jinghong Wang;Jianguo Wei
Data augmentation methods have been employed to address the deficiencies in dysarthric speech datasets, achieving state-of-the-art (SOTA) results in the Dysarthric Speech Recognition (DSR) task. Current research on Dysarthric Speech Synthesis (DSS), however, fails to focus on the encoding of pathological features in dysarthric speech. Dysarthric speech is characterized by discontinuous pronunciation, uncontrolled volume, a slow speaking rate, and excessive nasal sounds. Moreover, compared with typical speech, dysarthric speech contains more non-stationary components generated by explosive pronunciation, hoarseness, and air-flow noise during articulation. We propose a DSS model named the Long-range and Non-stationary Variational Autoencoder (LNVAE). The LNVAE estimates the acoustic parameters of dysarthric speech by encoding the long-range dependency duration of phonemes in frame-level representations of dysarthric speech. Moreover, the LNVAE employs Gaussian noise perturbation within the latent variables to capture the non-stationary fluctuations in dysarthric speech. The experiments were conducted on speech synthesis and recognition tasks using the CDSD Chinese and UASpeech English corpora. The dysarthric speech synthesized by the LNVAE achieved the best performance across 29 and 28 objective metrics in the Chinese and English datasets, respectively. The synthesized speech also received the highest score from speech rehabilitation experts in the MOS experiments. The Whisper model fine-tuned on the synthesized data achieved the lowest CER on the Chinese CDSD dataset. Moreover, for the UASpeech dataset, we increased the data by 0.5 times to fine-tune the DSR model, yet surpassed the current SOTA method, which uses four times more augmentation data, by 4.52%.
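As a rough illustration of the Gaussian perturbation of the latent variables mentioned above (not the LNVAE itself), the sketch below applies the standard VAE reparameterisation and then adds an extra Gaussian term to the frame-level latents; the noise scale and tensor shapes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def perturbed_latent(mu, log_var, noise_scale=0.1):
    """VAE reparameterisation plus an extra Gaussian perturbation on the latents.

    mu, log_var: frame-level encoder outputs of shape (T, D).
    noise_scale: strength of the additional perturbation (illustrative value).
    """
    eps = rng.standard_normal(mu.shape)
    z = mu + np.exp(0.5 * log_var) * eps                 # standard reparameterisation trick
    z = z + noise_scale * rng.standard_normal(z.shape)   # extra perturbation on the latents
    return z

T, D = 200, 16                                           # frames, latent dimension (illustrative)
mu = rng.standard_normal((T, D))
log_var = -2.0 * np.ones((T, D))
z = perturbed_latent(mu, log_var)
print("empirical latent variance:", round(float(np.var(z - mu)), 3),
      "vs. unperturbed:", round(float(np.exp(-2.0)), 3))
```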
IEEE JSTSP, vol. 19, no. 5, pp. 767-782.
Citations: 0
$C^{2}$AV-TSE: Context and Confidence-Aware Audio Visual Target Speaker Extraction
IF 8.7, CAS Tier 1 (Engineering & Technology), JCR Q1 (Engineering, Electrical & Electronic). Pub Date: 2025-04-15. DOI: 10.1109/JSTSP.2025.3560513
Wenxuan Wu;Xueyuan Chen;Shuai Wang;Jiadong Wang;Lingwei Meng;Xixin Wu;Helen Meng;Haizhou Li
Audio-Visual Target Speaker Extraction (AV-TSE) aims to mimic the human ability to enhance auditory perception using visual cues. Although numerous models have been proposed recently, most of them estimate target signals by primarily relying on local dependencies within acoustic features, underutilizing the human-like capacity to infer unclear parts of speech through contextual information. This limitation results in not only suboptimal performance but also inconsistent extraction quality across the utterance, with some segments exhibiting poor quality or inadequate suppression of interfering speakers. To close this gap, we propose a model-agnostic strategy called the Mask-And-Recover (MAR). It integrates both inter- and intra-modality contextual correlations to enable global inference within extraction modules. Additionally, to better target challenging parts within each sample, we introduce a Fine-grained Confidence Score (FCS) model to assess extraction quality and guide extraction modules to emphasize improvement on low-quality segments. To validate the effectiveness of our proposed model-agnostic training paradigm, six popular AV-TSE backbones were adopted for evaluation on the VoxCeleb2 dataset, demonstrating consistent performance improvements across various metrics.
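The sketch below illustrates, under assumed shapes and placeholder models, the two ingredients described above: masking random time segments of an acoustic feature sequence so they must be recovered from context (Mask-And-Recover), and weighting a per-frame loss by one minus a confidence score so low-quality segments are emphasised. It is not the paper's training recipe.

```python
import numpy as np

rng = np.random.default_rng(2)

def mask_segments(features, num_segments=3, seg_len=10):
    """Zero out random time segments; the model must recover them from context."""
    masked = features.copy()
    mask = np.zeros(features.shape[0], dtype=bool)
    for _ in range(num_segments):
        start = rng.integers(0, features.shape[0] - seg_len)
        masked[start:start + seg_len] = 0.0
        mask[start:start + seg_len] = True
    return masked, mask

def confidence_weighted_loss(estimate, target, confidence):
    """Per-frame L1 loss, up-weighted where a confidence scorer rates quality low."""
    per_frame = np.abs(estimate - target).mean(axis=1)
    weights = 1.0 - confidence                 # low confidence -> larger weight
    return float((weights * per_frame).sum() / (weights.sum() + 1e-8))

T, F = 120, 80                                 # frames x feature bins (illustrative)
target = rng.standard_normal((T, F))
masked, mask = mask_segments(target)
estimate = masked                              # stand-in for the extractor's recovered output
confidence = rng.random(T)                     # stand-in for the fine-grained confidence score
print("masked frames:", int(mask.sum()),
      " weighted loss:", round(confidence_weighted_loss(estimate, target, confidence), 3))
```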
IEEE JSTSP, vol. 19, no. 4, pp. 646-657.
Citations: 0
HPCNet: Hybrid Pixel and Contour Network for Audio-Visual Speech Enhancement With Low-Quality Video
IF 8.7, CAS Tier 1 (Engineering & Technology), JCR Q1 (Engineering, Electrical & Electronic). Pub Date: 2025-04-10. DOI: 10.1109/JSTSP.2025.3559763
Hang Chen;Chen-Yue Zhang;Qing Wang;Jun Du;Sabato Marco Siniscalchi;Shi-Fu Xiong;Gen-Shun Wan
To advance audio-visual speech enhancement (AVSE) research in low-quality video settings, we introduce the multimodal information-based speech processing-low quality video (MISP-LQV) benchmark, which includes a 120-hour real-world Mandarin audio-visual dataset, two video degradation simulation methods, and benchmark results from several well-known AVSE models. We also propose a novel hybrid pixel and contour network (HPCNet), incorporating a lip reconstruction and distillation (LRD) module and a contour graph convolution (CGConv) layer. Specifically, the LRD module reconstructs high-quality lip frames from low-quality audio-visual data, utilizing knowledge distillation from a teacher model trained on high-quality data. The CGConv layer employs spatio-temporal and semantic-contextual graphs to capture complex relationships among lip landmark points. Extensive experiments on the MISP-LQV benchmark reveal the performance degradation caused by low-quality video across various AVSE models. Notably, including real/simulated low-quality videos in AVSE training enhances robustness to low-quality videos but degrades performance on high-quality videos. The proposed HPCNet demonstrates strong robustness against video quality degradation, which can be attributed to (1) the reconstructed lip frames closely aligning with high-quality frames and (2) the contour features exhibiting consistency across different video quality levels. The generalizability of HPCNet has also been validated through experiments on the 2nd COG-MHEAR AVSE Challenge dataset.
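As a hedged illustration of two components named above, the sketch below shows an L1 distillation loss that pulls low-quality (student) lip features toward a high-quality teacher, and a single graph-convolution step over lip landmarks using a simple chain adjacency as a stand-in for HPCNet's spatio-temporal and semantic-contextual graphs; all shapes and the graph itself are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

def distillation_loss(student_feat, teacher_feat):
    """L1 pull of low-quality (student) lip features toward high-quality teacher features."""
    return float(np.abs(student_feat - teacher_feat).mean())

def graph_conv(X, A, W):
    """One graph-convolution step: symmetric-normalised adjacency, then linear map + ReLU."""
    A_hat = A + np.eye(A.shape[0])                       # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ X @ W, 0.0)

N, C_in, C_out = 20, 2, 16                               # lip landmarks, (x, y) coords, feature dim
X = rng.standard_normal((N, C_in))                       # one frame of landmark coordinates
A = np.zeros((N, N))
for i in range(N - 1):                                   # chain contour adjacency (placeholder graph)
    A[i, i + 1] = A[i + 1, i] = 1.0
W = rng.standard_normal((C_in, C_out)) * 0.1
H = graph_conv(X, A, W)
teacher = rng.standard_normal(H.shape)                   # stand-in for the teacher's features
print("landmark features:", H.shape, " distillation loss:", round(distillation_loss(H, teacher), 3))
```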
IEEE JSTSP, vol. 19, no. 4, pp. 671-684.
Citations: 0
Input-Independent Subject-Adaptive Channel Selection for Brain-Assisted Speech Enhancement
IF 8.7, CAS Tier 1 (Engineering & Technology), JCR Q1 (Engineering, Electrical & Electronic). Pub Date: 2025-04-07. DOI: 10.1109/JSTSP.2025.3558653
Qingtian Xu;Jie Zhang;Zhenhua Ling
Brain-assisted speech enhancement (BASE), which utilizes electroencephalogram (EEG) signals as an assistive modality, has shown great potential for extracting the target speaker in multi-talker conditions. This is feasible because the EEG measurements contain the auditory attention of hearing-impaired listeners, which can be leveraged to classify the target identity. Considering that an EEG cap with sparse channels offers multiple benefits and that in practice many electrodes may contribute only marginally, EEG channel selection for BASE is desirable. This problem has been tackled in a subject-invariant manner in the literature, so the resulting BASE performance varies significantly across subjects. In this work, we therefore propose an input-independent subject-adaptive channel selection method for BASE, called subject-adaptive convolutional regularization selection (SA-ConvRS), which enables a personalized informative channel distribution. We observe an abnormal over-memory phenomenon, which often occurs in related fields due to data recording and validation conditions and allows the model to perform BASE without any brain signals. To remove this effect, we further design a task-based multi-process adversarial training (TMAT) approach that exploits pseudo-EEG inputs. Experimental results on a public dataset show that the proposed SA-ConvRS achieves subject-adaptive channel selection and keeps the BASE performance close to the full-channel upper bound, while TMAT avoids the over-memory problem without sacrificing the performance of SA-ConvRS.
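SA-ConvRS is not reproduced here; the sketch below only illustrates the generic channel-selection idea it builds on: scale each EEG channel by a learnable gate, add a sparsity (L1) penalty during training, and afterwards keep the channels whose gates survive a threshold. The gate values, threshold, and tensor shapes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)

def gated_eeg(eeg, gates):
    """Scale each EEG channel by its (learnable) gate before the enhancement network."""
    return eeg * gates[None, :, None]

def sparsity_penalty(gates, lam=1e-3):
    """L1 penalty that pushes uninformative channel gates toward zero during training."""
    return lam * np.abs(gates).sum()

B, C, T = 2, 64, 512                    # batch, EEG channels, samples (illustrative)
eeg = rng.standard_normal((B, C, T))
gates = np.abs(rng.standard_normal(C))  # in practice learned per subject with the BASE loss
x = gated_eeg(eeg, gates)
keep = gates > 0.5                      # channel selection by thresholding the trained gates
print("penalty:", round(float(sparsity_penalty(gates)), 4), " channels kept:", int(keep.sum()), "/", C)
```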
IEEE JSTSP, vol. 19, no. 4, pp. 658-670.
Citations: 0
Complex-Valued Autoencoder-Based Neural Data Compression for SAR Raw Data
IF 8.7, CAS Tier 1 (Engineering & Technology), JCR Q1 (Engineering, Electrical & Electronic). Pub Date: 2025-04-07. DOI: 10.1109/JSTSP.2025.3558651
Reza Mohammadi Asiyabi;Mihai Datcu;Andrei Anghel;Adrian Focsa;Michele Martone;Paola Rizzoli;Ernesto Imbembo
Recent advances in Synthetic Aperture Radar (SAR) sensors and innovative advanced imaging techniques have enabled SAR systems to acquire very high-resolution images with wide swaths, large bandwidth, and multiple polarization channels. The improved capabilities of SAR systems also imply a significant increase in SAR data acquisition rates, such that efficient and effective compression methods become necessary. The compression of SAR raw data plays a crucial role in addressing the challenges posed by downlink and memory limitations onboard SAR satellites and directly affects the quality of the generated SAR image. Neural data compression techniques using deep models have attracted much interest for natural image compression tasks and demonstrated promising results. In this study, neural data compression is extended into the complex domain to develop a Complex-Valued (CV) autoencoder-based data compression method for SAR raw data. To this end, the fundamentals of data compression and Rate-Distortion (RD) theory are reviewed; well-known data compression methods, namely Block Adaptive Quantization (BAQ) and JPEG2000, are implemented and tested for SAR raw data compression; and a neural data compression scheme based on CV autoencoders is developed for SAR raw data. Furthermore, since the available Sentinel-1 SAR raw products are already compressed with Flexible Dynamic BAQ (FDBAQ), an adaptation procedure is applied to the decoded SAR raw data to generate quasi-uniformly quantized raw data that resemble the statistics of the uncompressed SAR raw data onboard the satellites.
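Block Adaptive Quantization admits a compact illustration. The sketch below is a simplified BAQ for complex raw samples: split the signal into blocks, normalise each block by its standard deviation, quantise the real and imaginary parts with a low-bit uniform quantiser, and store the per-block scale. Operational BAQ uses optimised (Lloyd-Max-style) thresholds; the uniform quantiser, block size, and clipping range here are simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)

def baq_compress(raw, block_size=256, bits=3, clip=2.5):
    """Simplified Block Adaptive Quantization of complex SAR raw samples."""
    levels = 2 ** bits
    step = 2 * clip / levels
    blocks = raw.reshape(-1, block_size)
    scales = blocks.std(axis=1, keepdims=True) + 1e-12    # per-block magnitude statistic
    norm = blocks / scales                                # normalise each block to unit spread
    def q(v):
        return np.clip(np.floor(v / step), -levels // 2, levels // 2 - 1)
    codes = q(norm.real) + 1j * q(norm.imag)              # low-bit integer codes per component
    return codes, scales

def baq_decompress(codes, scales, bits=3, clip=2.5):
    step = 2 * clip / (2 ** bits)
    centers = (codes.real + 0.5) + 1j * (codes.imag + 0.5)
    return centers * step * scales

n, block = 4096, 256
envelope = np.repeat(rng.uniform(0.5, 4.0, size=n // block), block)   # block-varying signal power
raw = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) * envelope
codes, scales = baq_compress(raw, block_size=block)
rec = baq_decompress(codes, scales).reshape(-1)
snr_db = 10 * np.log10(np.mean(np.abs(raw) ** 2) / np.mean(np.abs(raw - rec) ** 2))
print(f"3-bit BAQ reconstruction SNR ~ {snr_db:.1f} dB")
```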
IEEE JSTSP, vol. 19, no. 3, pp. 572-582. Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10955162
Citations: 0
SAV-SE: Scene-Aware Audio-Visual Speech Enhancement With Selective State Space Model
IF 8.7, CAS Tier 1 (Engineering & Technology), JCR Q1 (Engineering, Electrical & Electronic). Pub Date: 2025-04-07. DOI: 10.1109/JSTSP.2025.3558654
Xinyuan Qian;Jiaran Gao;Yaodan Zhang;Qiquan Zhang;Hexin Liu;Leibny Paola Garcia Perera;Haizhou Li
Speech enhancement plays an essential role in various applications, and the integration of visual information has been demonstrated to bring substantial advantages. However, the majority of current research concentrates on the examination of facial and lip movements, which can be compromised or entirely inaccessible in scenarios where occlusions occur or when the camera view is distant. Meanwhile, contextual visual cues from the surrounding environment have been overlooked: for example, when we see a dog bark, our brain has the innate ability to discern and filter out the barking noise. To this end, in this paper, we introduce a novel task, i.e., Scene-aware Audio-Visual Speech Enhancement (SAV-SE). To our best knowledge, this is the first proposal to use rich contextual information from synchronized video as auxiliary cues to indicate the type of noise, which ultimately improves speech enhancement performance. Specifically, we propose the VC-S$^{2}$E method, which incorporates the Conformer and Mamba modules for their complementary strengths. Extensive experiments are conducted on the public MUSIC, AVSpeech, and AudioSet datasets, where the results demonstrate the superiority of VC-S$^{2}$E over other competitive methods.
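The scene-conditioning idea can be pictured with a generic FiLM-style modulation, shown in the hedged sketch below: a visual scene embedding produces a per-channel scale and shift that is applied to the audio features. This is not the VC-S$^{2}$E architecture; all shapes, weights, and names are placeholders.

```python
import numpy as np

rng = np.random.default_rng(6)

def film_condition(audio_feat, scene_embed, W_gamma, W_beta):
    """FiLM-style conditioning: the scene embedding yields a per-channel scale and shift."""
    gamma = scene_embed @ W_gamma            # (F,) scale derived from the visual scene
    beta = scene_embed @ W_beta              # (F,) shift derived from the visual scene
    return audio_feat * (1.0 + gamma)[None, :] + beta[None, :]

T, F, E = 100, 257, 128                      # frames, frequency bins, scene-embedding dim
audio_feat = rng.standard_normal((T, F))     # e.g. a noisy magnitude spectrogram
scene_embed = rng.standard_normal(E)         # e.g. pooled features of the surrounding scene
W_gamma = rng.standard_normal((E, F)) * 0.01
W_beta = rng.standard_normal((E, F)) * 0.01
conditioned = film_condition(audio_feat, scene_embed, W_gamma, W_beta)
print("conditioned features:", conditioned.shape)
```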
IEEE JSTSP, vol. 19, no. 4, pp. 623-634.
Citations: 0