
Latest Publications in Speech Communication

The dependence of accommodation processes on conversational experience
IF 3.2 | Computer Science, CAS Tier 3 | Q1 Arts and Humanities | Pub Date: 2023-09-01 | DOI: 10.1016/j.specom.2023.102963
L. Ann Burchfield, Mark Antoniou, Anne Cutler

Conversational partners accommodate to one another's speech, a process that greatly facilitates perception. This process occurs in both first (L1) and second languages (L2); however, recent research has revealed that adaptation can be language-specific, with listeners sometimes applying it in one language but not in another. Here, we investigate whether a supply of novel talkers impacts whether the adaptation is applied, testing Mandarin-English groups whose use of their two languages involves either an extensive or a restricted set of social situations. Perceptual learning in Mandarin and English is examined across two similarly-constituted groups in the same English-speaking environment: (a) heritage language users with Mandarin as family L1 and English as environmental language, and (b) international students with Mandarin as L1 and English as later-acquired L2. In English, exposure to an ambiguous sound in lexically disambiguating contexts prompted the expected retuning of phonemic boundaries in categorisation for the heritage users, but not for the students. In Mandarin, the opposite appeared: the heritage users showed no adaptation, but the students did adapt. In each case where learning did not appear, participants reported using the language in question with fewer interlocutors. The results support the view that successful retuning ability in any language requires regular conversational interaction with novel talkers.
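
Perceptual retuning of the kind measured here is typically quantified as a shift in the listener's phoneme category boundary along an acoustic continuum. Below is a minimal sketch of how such a shift can be estimated by fitting logistic functions to pre- and post-exposure categorization proportions; the 7-step continuum, the response data, and the parameterization are illustrative assumptions, not the study's materials.

```python
# A minimal sketch, assuming a 7-step phoneme continuum and made-up
# categorization proportions; the boundary is the logistic midpoint.
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, boundary, slope):
    # P(category A response) as a function of continuum step
    return 1.0 / (1.0 + np.exp(-slope * (x - boundary)))

steps = np.arange(1, 8, dtype=float)
pre_exposure  = np.array([0.98, 0.95, 0.85, 0.55, 0.20, 0.07, 0.02])
post_exposure = np.array([0.99, 0.97, 0.92, 0.75, 0.45, 0.15, 0.04])

(b_pre, _), _  = curve_fit(logistic, steps, pre_exposure,  p0=[4.0, -1.5])
(b_post, _), _ = curve_fit(logistic, steps, post_exposure, p0=[4.0, -1.5])

# A positive shift means the category now covers more of the continuum,
# the signature of lexically guided retuning.
print(f"boundary shift: {b_post - b_pre:+.2f} continuum steps")
```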

Citations: 1
Space-and-speaker-aware acoustic modeling with effective data augmentation for recognition of multi-array conversational speech
IF 3.2 | Computer Science, CAS Tier 3 | Q1 Arts and Humanities | Pub Date: 2023-09-01 | DOI: 10.1016/j.specom.2023.102958
Li Chai, Hang Chen, Jun Du, Qing-Feng Liu, Chin-Hui Lee

We propose a space-and-speaker-aware (SSA) approach to acoustic modeling (AM), denoted as SSA-AM, to improve the performance of automatic speech recognition (ASR) in distant multi-array conversational scenarios. In contrast to conventional AM, which uses only spectral features from a target speaker as inputs, the inputs to SSA-AM consist of speech features from both the target and interfering speakers. These features carry discriminative information from different speakers, including spatial information embedded in the interaural phase differences (IPDs) between individual interfering speakers and the target speaker. In the proposed SSA-AM framework, we explore four acoustic model architectures consisting of different combinations of four neural networks: a deep residual network, a factorized time-delay neural network, self-attention, and a residual bidirectional long short-term memory network. Various data augmentation techniques are adopted to expand the training data to include different options of beamformed speech obtained from multi-channel speech enhancement. Evaluated on the recent CHiME-6 Challenge Track 1, our proposed SSA-AM framework achieves consistent recognition performance improvements over the official baseline acoustic models. Furthermore, SSA-AM outperforms acoustic models that do not explicitly use space and speaker information. Finally, our data augmentation schemes prove especially effective for compact model designs. Code is released at https://github.com/coalboss/SSA_AM.
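
The spatial cue named in the abstract, the interaural (inter-channel) phase difference, is commonly derived from each channel's STFT phase relative to a reference microphone and encoded as cosine/sine pairs to avoid phase-wrapping issues. The following is a minimal sketch of that computation; the STFT settings, the cos/sin encoding, and the array shapes are generic assumptions rather than this paper's exact front-end.

```python
# A minimal IPD feature sketch, assuming a multi-channel signal array of
# shape (channels, samples) and channel 0 as the reference.
import numpy as np

def stft(x, n_fft=512, hop=128):
    """Naive framed STFT: (samples,) -> (frames, n_fft//2 + 1) complex."""
    win = np.hanning(n_fft)
    frames = [x[i:i + n_fft] * win for i in range(0, len(x) - n_fft + 1, hop)]
    return np.fft.rfft(np.stack(frames), axis=-1)

def ipd_features(channels, ref=0):
    specs = [stft(ch) for ch in channels]
    ref_phase = np.angle(specs[ref])
    feats = []
    for c, spec in enumerate(specs):
        if c == ref:
            continue
        diff = np.angle(spec) - ref_phase      # inter-channel phase difference
        feats += [np.cos(diff), np.sin(diff)]  # wrap-free encoding
    return np.concatenate(feats, axis=-1)      # (frames, 2*(C-1)*(n_fft//2+1))

mics = np.random.randn(4, 16000)               # toy 4-mic array, 1 s at 16 kHz
print(ipd_features(mics).shape)
```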

Citations: 0
Development of a hybrid word recognition system and dataset for the Azerbaijani Sign Language dactyl alphabet
IF 3.2 | Computer Science, CAS Tier 3 | Q1 Arts and Humanities | Pub Date: 2023-09-01 | DOI: 10.1016/j.specom.2023.102960
Jamaladdin Hasanov, Nigar Alishzade, Aykhan Nazimzade, Samir Dadashzade, Toghrul Tahirov

The paper introduces a real-time fingerspelling-to-text translation system for the Azerbaijani Sign Language (AzSL), targeted at clarifying words that have no available sign or only an ambiguous one. The system combines statistical and probabilistic models, used in the sign recognition and sequence generation phases. Linguistic, technical, and human–computer interaction challenges, which publicly available sign-based recognition application programming interfaces and tools usually do not consider, are addressed in this study. The specifics of AzSL are reviewed, feature selection strategies are evaluated, and a robust model for the translation of hand signs is suggested. The two-stage recognition model exhibits high accuracy during real-time inference. Given the lack of a publicly available benchmark dataset, a new, comprehensive AzSL dataset consisting of 13,444 samples collected by 221 volunteers is described and made publicly available to the sign language recognition community. To extend the dataset and make the model robust to changes, augmentation methods and their effect on performance are analyzed. A lexicon-based validation method used for probabilistic analysis and candidate word selection enhances the probability of the recognized phrases. Experiments delivered 94% accuracy on the test dataset, close to the real-time user experience. The dataset and implemented software are shared in a public repository for review and further research (CeDAR, 2021; Alishzade et al., 2022). The work was presented at TeknoFest 2022 and ranked first in the category of social-oriented technologies.
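
One standard way to realize the lexicon-based validation step described above is to snap a noisy recognized letter sequence onto the closest lexicon entry by edit distance. The sketch below shows that idea; the toy lexicon and the plain Levenshtein scoring are assumptions, not the paper's exact method.

```python
# Illustrative lexicon-based validation for fingerspelling output:
# map a noisy letter sequence onto the closest lexicon entry.
def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance via a rolling dynamic-programming row."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,          # deletion
                                     dp[j - 1] + 1,      # insertion
                                     prev + (ca != cb))  # substitution
    return dp[-1]

def best_candidate(recognized: str, lexicon: list[str]) -> str:
    """Pick the lexicon word with the smallest edit distance."""
    return min(lexicon, key=lambda w: edit_distance(recognized, w))

lexicon = ["salam", "kitab", "dost"]           # toy word list, not the paper's
print(best_candidate("salaam", lexicon))       # -> "salam"
```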

Citations: 1
A new time–frequency representation based on the tight framelet packet for telephone-band speech coding
IF 3.2 | Computer Science, CAS Tier 3 | Q1 Arts and Humanities | Pub Date: 2023-07-01 | DOI: 10.1016/j.specom.2023.102954
Souhir Bousselmi, Kaïs Ouni

To improve the quality and intelligibility of telephone-band speech coding, a new time–frequency representation based on a tight framelet packet transform is proposed in this paper. In the context of speech coding, the effectiveness of this representation stems from its resilience to quantization noise and its reconstruction stability. Moreover, it offers a sub-band decomposition and good time–frequency localization aligned with the critical bands of the human ear. The coded signal is obtained using dynamic bit allocation and optimal quantization of the normalized framelet coefficients. The performance of this method is compared against the critically sampled wavelet packet transform. Extensive simulation revealed that the proposed speech coding scheme, which incorporates the tight framelet packet transform, performs better than the one based on the critically sampled wavelet packet transform. Furthermore, it ensures a high bit-rate reduction with negligible degradation in speech quality. The proposed coder outperforms standard telephone-band speech coders in terms of objective measures and subjective evaluations, including a formal listening test. The subjective quality of our codec at 4 kbps is almost identical to that of the reference G.711 codec operating at 64 kbps.
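
Dynamic bit allocation across sub-bands is usually driven by sub-band energy: each additional bit buys roughly 6 dB of quantization SNR, so a greedy allocator repeatedly gives the next bit to the band with the most residual noise. The sketch below shows this textbook scheme; the sub-band energies and the 6.02 dB/bit rule are generic assumptions, not this coder's exact allocator.

```python
# Greedy energy-driven bit allocation: every allocated bit lowers that
# band's effective quantization noise by ~6.02 dB.
import numpy as np

def allocate_bits(subband_energy, total_bits):
    energy_db = 10 * np.log10(np.asarray(subband_energy, dtype=float) + 1e-12)
    bits = np.zeros(len(energy_db), dtype=int)
    for _ in range(total_bits):
        k = np.argmax(energy_db - 6.02 * bits)  # band with most residual noise
        bits[k] += 1
    return bits

# Toy usage: 8 framelet sub-bands, 32 bits to spend per frame.
energies = [4.0, 2.5, 1.2, 0.8, 0.3, 0.1, 0.05, 0.01]
print(allocate_bits(energies, 32))
```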

Citations: 0
Application of virtual human sign language translation based on speech recognition
IF 3.2 | Computer Science, CAS Tier 3 | Q1 Arts and Humanities | Pub Date: 2023-07-01 | DOI: 10.1016/j.specom.2023.06.001
Xin Li, Shuying Yang, Haiming Guo

For the application of speech recognition to sign language translation, we conducted a study in two parts: improving speech recognition's effectiveness and promoting the application of sign language translation. Mainstream frequency-domain features have achieved great success in speech recognition. However, they fail to capture instantaneous gaps in speech, a deficiency that time-domain features make up for. To combine the advantages of frequency- and time-domain features, an acoustic architecture with a joint time-domain encoder and frequency-domain encoder is proposed. A new time-domain feature based on the SSM (State-Space Model) is proposed in the time-domain encoder and encoded using a GRU model. A new model, ConFLASH, is proposed for the frequency-domain encoder; it is a lightweight model combining a CNN with FLASH (a variant of the Transformer model). It not only reduces the computational complexity of the Transformer model but also effectively integrates the global modeling strengths of the Transformer and the local modeling strengths of the CNN. A Transducer structure decodes the speech after the encoder outputs are joined. This acoustic model is named GRU-ConFLASH-Transducer. On a self-built dataset and the open-source speechocean dataset, it achieves optimal WERs (word error rates) of 2.6% and 4.7%, respectively. In addition, to better support the visual application of sign language translation, a 3D virtual human model is designed and developed.
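
The core architectural idea, running a time-domain encoder and a frequency-domain encoder in parallel and joining their outputs before decoding, can be sketched in PyTorch as below. The dimensions, the simple feed-forward stand-in for ConFLASH, and fusion by concatenation are assumptions; the paper's actual ConFLASH block and Transducer decoder are not reproduced here.

```python
# A minimal dual-encoder sketch: a GRU over time-domain features fused
# with a frequency-domain encoder output by concatenation.
import torch
import torch.nn as nn

class DualEncoder(nn.Module):
    def __init__(self, time_dim=40, freq_dim=80, hidden=128):
        super().__init__()
        self.time_enc = nn.GRU(time_dim, hidden, batch_first=True)
        self.freq_enc = nn.Sequential(           # stand-in for ConFLASH
            nn.Linear(freq_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden))
        self.proj = nn.Linear(2 * hidden, hidden)

    def forward(self, time_feats, freq_feats):
        t_out, _ = self.time_enc(time_feats)      # (B, T, hidden)
        f_out = self.freq_enc(freq_feats)         # (B, T, hidden)
        return self.proj(torch.cat([t_out, f_out], dim=-1))

enc = DualEncoder()
fused = enc(torch.randn(2, 100, 40), torch.randn(2, 100, 80))
print(fused.shape)                                # torch.Size([2, 100, 128])
```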

Citations: 0
Real-time intelligibility affects the realization of French word-final schwa
IF 3.2 | Computer Science, CAS Tier 3 | Q1 Arts and Humanities | Pub Date: 2023-07-01 | DOI: 10.1016/j.specom.2023.102962
Georgia Zellou, Ioana Chitoran, Ziqi Zhou

Speech variation has been hypothesized to reflect both speaker-internal influences of lexical access on production and adaptive modifications that make words more intelligible to the listener. The current study considers categorical and gradient variation in the production of word-final schwa in French, as explained by lexical access processes and by phonological and/or listener-oriented influences on speech production, while controlling for other factors. To that end, native French speakers completed two laboratory production tasks. In Experiment 1, speakers produced 32 monosyllabic words varying in lexical frequency in a word-list production task with no listener feedback. In Experiment 2, speakers produced the same words to an interlocutor while completing a map task in which listener comprehension success varied across trials: in half the trials, the words are correctly perceived by the interlocutor; in the other half, there is misunderstanding. Results reveal that speakers are more likely to produce word-final schwa when there is explicit pressure to be intelligible to the interlocutor. Also, when schwa is produced, it is longer preceding a consonant-initial word. Taken together, the findings suggest that both phonological and clarity-oriented influences shape word-final schwa realization in French.

Citations: 0
Addressing the semi-open set dialect recognition problem under resource-efficient considerations
IF 3.2 | Computer Science, CAS Tier 3 | Q1 Arts and Humanities | Pub Date: 2023-07-01 | DOI: 10.1016/j.specom.2023.102957
Spandan Dey, Goutam Saha

This work presents a resource-efficient solution for the spoken dialect recognition task under semi-open set evaluation scenarios, where a closed set model is exposed to unknown class inputs. Our experiments primarily explore task 2 of the OLR 2020 challenge, in which three Chinese dialects (Hokkien, Sichuanese, and Shanghainese) are to be recognized. For evaluation, along with the three target dialects, utterances from other unknown classes are also included. We find that neither the top-performing submissions nor the baseline system proposed solutions that explicitly address the semi-open set scenario. This work pays special attention to the semi-open set nature of the problem and analyzes how unknown utterances can degrade overall performance if not treated separately. We train our main dialect classifier with the ECAPA-TDNN architecture and 40-dimensional MFCCs from the training data of the three dialects. We propose a confidence-assessment algorithm and combine the TDNN performance from both end-to-end and embedding-extractor approaches. We then frame the semi-open set scenario as a constrained optimization problem. By solving it, we prove that the performance degradation caused by unknown utterances is minimized if the corresponding softmax prediction is equally confused among the target outputs. Based on this criterion, we develop different feedback modules in our system. These modules work on novelty detection principles and flag unknown-class utterances as anomalies. The prediction score of the corresponding utterance is then penalized by flattening. The proposed system achieves a Cavg (×100) score of 8.50 and an EER of 9.77%. Averaging the two metrics, our system outperforms the winning submission. Owing to the proposed semi-open set adaptations, our system achieves this performance using much less training data and far fewer computation resources than the top-performing submissions. Additionally, to verify the broader applicability of the proposed semi-open set solution, we experiment with two other dialect recognition tasks covering English and Arabic and larger database sizes.
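
The feedback idea, flag an utterance as unknown and then flatten its prediction so it is equally confused among the target outputs, can be illustrated with a simple entropy test over the softmax scores. The normalized-entropy threshold and the flattening rule below are assumptions for illustration, not the paper's confidence-assessment algorithm.

```python
# If the softmax over target dialects is too uniform (high normalized
# entropy), treat the utterance as unknown and flatten its scores.
import numpy as np

def flatten_if_unknown(softmax_scores, entropy_threshold=0.9):
    p = np.asarray(softmax_scores, dtype=float)
    entropy = -np.sum(p * np.log(p + 1e-12)) / np.log(len(p))  # in [0, 1]
    if entropy > entropy_threshold:             # flagged as unknown class
        return np.full_like(p, 1.0 / len(p))    # equally confused output
    return p

print(flatten_if_unknown([0.85, 0.10, 0.05]))   # confident -> unchanged
print(flatten_if_unknown([0.40, 0.33, 0.27]))   # uncertain -> flattened
```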

Citations: 1
Using iterative adaptation and dynamic mask for child speech extraction under real-world multilingual conditions
IF 3.2 | Computer Science, CAS Tier 3 | Q1 Arts and Humanities | Pub Date: 2023-07-01 | DOI: 10.1016/j.specom.2023.102956
Shi Cheng, Jun Du, Shutong Niu, Alejandrina Cristia, Xin Wang, Qing Wang, Chin-Hui Lee

We develop two improvements over our previously-proposed joint enhancement and separation (JES) framework for child speech extraction in real-world multilingual scenarios. First, we introduce an iterative adaptation based separation (IAS) technique to iteratively fine-tune our pre-trained separation model in JES using data from real scenes to adapt the model. Second, to purify the training data, we propose a dynamic mask separation (DMS) technique with variable lengths in movable windows to locate meaningful speech segments using a scale-invariant signal-to-noise ratio (SI-SNR) objective. With DMS on top of IAS, called DMS+IAS, the combined technique can remove a large number of noise backgrounds and correctly locate speech regions in utterances recorded under real-world scenarios. Evaluated on the BabyTrain corpus, our proposed IAS system achieves consistent extraction performance improvements when compared to our previously-proposed JES framework. Moreover, experimental results also show that the proposed DMS+IAS technique can further improve the quality of separated child speech in real-world scenarios and obtain a relatively good extraction performance in difficult situations where adult speech is mixed with child speech.
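
The SI-SNR objective used to locate meaningful speech segments has a standard scale-invariant definition: project the estimate onto the target, then compare the energy of that projection to the residual. A minimal sketch of that textbook computation (the toy signals are made up):

```python
# Scale-invariant signal-to-noise ratio (SI-SNR) in dB.
import numpy as np

def si_snr(estimate, target, eps=1e-8):
    estimate = estimate - estimate.mean()      # zero-mean both signals
    target = target - target.mean()
    # Projection of the estimate onto the target direction.
    s_target = np.dot(estimate, target) / (np.dot(target, target) + eps) * target
    e_noise = estimate - s_target              # residual orthogonal to target
    return 10 * np.log10(np.dot(s_target, s_target) /
                         (np.dot(e_noise, e_noise) + eps))

t = np.random.randn(16000)                     # 1 s reference at 16 kHz
print(si_snr(0.9 * t + 0.01 * np.random.randn(16000), t))  # high SI-SNR
```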

Citations: 0
Fusion-based speech emotion classification using two-stage feature selection
IF 3.2 | Computer Science, CAS Tier 3 | Q1 Arts and Humanities | Pub Date: 2023-07-01 | DOI: 10.1016/j.specom.2023.102955
Jie Xie, Mingying Zhu, Kai Hu

Speech emotion recognition, which uses speech signals to determine a speaker's emotional state, plays an important role in human–computer interaction. Previous studies have proposed various features and feature selection methods; however, few have investigated two-stage feature selection for speech emotion classification. In this study, we propose a novel speech emotion classification algorithm based on two-stage feature selection and two fusion strategies. Specifically, three types of features are extracted from speech signals: a constant-Q spectrogram-based histogram of oriented gradients, openSMILE features, and wavelet packet decomposition-based features. Then, two-stage feature selection using random forest and grey wolf optimization is applied to reduce feature dimensionality and model training time and to improve classification performance. In addition, both early and late fusion strategies are explored to further improve performance. Experimental results indicate that early fusion with two-stage feature selection achieves the best performance. The highest classification accuracies for RAVDESS, SAVEE, EMOVO, and EmoDB are 86.97%, 88.79%, 89.24%, and 95.29%, respectively.
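
A common way to realize the first stage of such a two-stage selector is a filter step that ranks features by random-forest importance and keeps only a top fraction, leaving a much smaller pool for the wrapper search (here, grey wolf optimization) in stage two. The sketch below shows stage one only; the toy data and the 50% keep-rate are assumptions, not the paper's settings.

```python
# Stage one of a two-stage feature selector: keep the features that a
# random forest ranks as most important, then hand the reduced pool to
# a wrapper search such as grey wolf optimization (not shown).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 60))             # toy feature matrix
y = (X[:, 3] + X[:, 17] > 0).astype(int)   # labels depend on two features

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
keep = np.argsort(rf.feature_importances_)[::-1][:30]   # top 50%
X_stage1 = X[:, keep]                      # reduced pool handed to stage two
print(sorted(keep[:5]))                    # informative features rank high
```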

Citations: 1
Multiple voice disorders in the same individual: Investigating handcrafted features, multi-label classification algorithms, and base-learners
IF 3.2 | Computer Science, CAS Tier 3 | Q1 Arts and Humanities | Pub Date: 2023-07-01 | DOI: 10.1016/j.specom.2023.102952
Sylvio Barbon Junior, Rodrigo Capobianco Guido, Gabriel Jonas Aguiar, Everton José Santana, Mario Lemes Proença Junior, Hemant A. Patil

Non-invasive acoustic analyses of voice disorders have been at the forefront of current biomedical research. Usual strategies, essentially based on machine learning (ML) algorithms, commonly classify a subject as either healthy or pathologically affected. Nevertheless, the latter state is not always the result of a single laryngeal issue; multiple disorders might coexist, demanding multi-label classification procedures for effective diagnosis. Consequently, the objective of this paper is to investigate five multi-label classification methods based on problem transformation, namely Label Powerset, Binary Relevance, Nested Stacking, Classifier Chains, and Dependent Binary Relevance, each with Random Forest (RF) and Support Vector Machine (SVM) base-learners, in addition to a Deep Neural Network (DNN) from an algorithm adaptation method, to detect multiple voice disorders: Dysphonia, Laryngitis, Reinke's Edema, Vox Senilis, and Central Laryngeal Motion Disorder. Receiving as input three handcrafted features, i.e., signal energy (SE), zero-crossing rates (ZCRs), and signal entropy (SH), which allow for interpretable descriptors in terms of speech analysis, production, and perception, we observed that the DNN-based approach powered by SE-based feature vectors presented the best F1-score among the tested methods, i.e., 0.943, averaged over all balancing scenarios, on the Saarbrücken Voice Database (SVD) with a 20% balancing rate using the Synthetic Minority Over-sampling Technique (SMOTE). Finally, our finding that laryngitis accounted for most false negatives may explain why its detection is a serious issue in speech technology. The results we report provide an original contribution, allowing for the consistent detection of multiple speech pathologies and advancing the state-of-the-art in handcrafted acoustic-based non-invasive diagnosis of voice disorders.
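
The three handcrafted descriptors named above, signal energy (SE), zero-crossing rate (ZCR), and signal entropy (SH), can be computed per frame as sketched below; the frame length, hop, and histogram-based entropy estimate are generic assumptions, not the paper's exact extraction.

```python
# Per-frame SE, ZCR, and entropy features from a raw waveform.
import numpy as np

def frame_features(x, frame_len=400, hop=160):
    feats = []
    for i in range(0, len(x) - frame_len + 1, hop):
        f = x[i:i + frame_len]
        se = np.sum(f ** 2)                             # signal energy
        zcr = np.mean(np.abs(np.diff(np.sign(f))) > 0)  # zero-crossing rate
        hist, _ = np.histogram(f, bins=32)              # amplitude histogram
        p = hist / hist.sum()
        sh = -np.sum(p * np.log2(p + 1e-12))            # signal entropy
        feats.append([se, zcr, sh])
    return np.array(feats)                              # (frames, 3)

print(frame_features(np.random.randn(16000)).shape)     # 1 s of toy audio
```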

Citations: 0