
Latest publications in Computer Speech and Language

Model discrepancy policy optimization for task-oriented dialogue
IF 4.3, CAS Tier 3 (Computer Science), Q1 Mathematics, Pub Date: 2024-03-06, DOI: 10.1016/j.csl.2024.101636
Zhenyou Zhou, Zhibin Liu, Zhaoan Dong, Yuhan Liu

Task-oriented dialogue systems use deep reinforcement learning (DRL) to learn policies, and agent interaction with user models can help the agent enhance its generalization capacity. However, user models frequently lack the language complexity of human interlocutors and contain generative errors, and their design biases can impair the agent's ability to function well in certain situations. In this paper, we incorporate an evaluator based on inverse reinforcement learning into the model to assess the dialogue quality of user models, so that high-quality user models can be recruited for training. By constructing a sampling distribution over environments to pick high-quality user models for policy learning, we can regulate the quality of training trajectories while maintaining their diversity. Evaluation on the MultiWOZ dataset demonstrates that the approach successfully improves the dialogue agent's performance.
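
As a brief illustration of the sampling idea, the sketch below draws user models from a softmax distribution over evaluator scores, so that higher-quality models are picked more often while lower-quality ones retain some probability mass; the `quality_scores` list and the string stand-ins for user models are hypothetical placeholders, not the authors' actual implementation.

```python
import numpy as np

def sampling_distribution(quality_scores, temperature=1.0):
    """Softmax over IRL-evaluator scores: higher-quality user models are
    sampled more often, while low scores keep non-zero probability,
    preserving diversity in the training trajectories."""
    z = np.asarray(quality_scores, dtype=float) / temperature
    z -= z.max()                      # numerical stability
    p = np.exp(z)
    return p / p.sum()

def pick_user_model(user_models, quality_scores, rng=None):
    # quality_scores[i] is the evaluator's score for user_models[i]
    rng = rng or np.random.default_rng()
    p = sampling_distribution(quality_scores)
    return user_models[rng.choice(len(user_models), p=p)]

chosen = pick_user_model(["um_a", "um_b", "um_c"], [0.9, 0.2, 0.5])
```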

Citations: 0
Next word prediction for Urdu language using deep learning models
IF 4.3, CAS Tier 3 (Computer Science), Q1 Mathematics, Pub Date: 2024-03-02, DOI: 10.1016/j.csl.2024.101635
Ramish Shahid, Aamir Wali, Maryam Bashir

Deep learning models are widely used for natural language processing. Despite their success, these models have been employed for only a few languages. Pretrained models also exist, but they are mostly available for English. Low-resource languages like Urdu cannot benefit from these pre-trained deep learning models, and their effectiveness for Urdu language processing remains an open question. This paper investigates the usefulness of deep learning models for next-word prediction and suggestion in Urdu. For this purpose, this study considers and proposes two word-prediction models for Urdu. First, we propose to use an LSTM for neural language modeling of Urdu. LSTMs are a popular approach for language modeling due to their ability to process sequential data. Second, we employ BERT, which was specifically designed for natural language modeling. We train BERT from scratch using an Urdu corpus consisting of 1.1 million sentences, thus paving the way for further studies of the Urdu language. We achieved an accuracy of 52.4% with LSTM and 73.7% with BERT. Our proposed BERT model outperformed two other pre-trained BERT models developed for Urdu. Since this is a multi-class problem whose number of classes equals the vocabulary size, this accuracy is still promising. Based on the present performance, BERT appears effective for Urdu, and this paper lays the groundwork for future studies.
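
A minimal PyTorch rendering of the first model (an LSTM language model predicting the next token) might look like the sketch below; the vocabulary size, embedding dimension, and hidden dimension are placeholders, as the abstract does not specify the architecture.

```python
import torch
import torch.nn as nn

class NextWordLSTM(nn.Module):
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, vocab_size)

    def forward(self, token_ids):            # (batch, seq_len)
        x = self.embed(token_ids)
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :])      # logits over the next word

model = NextWordLSTM(vocab_size=50_000)
logits = model(torch.randint(0, 50_000, (8, 12)))  # 8 contexts of 12 tokens
next_word = logits.argmax(dim=-1)
```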

Citations: 0
MECOS: A bilingual Manipuri–English spontaneous code-switching speech corpus for automatic speech recognition
IF 4.3, CAS Tier 3 (Computer Science), Q1 Mathematics, Pub Date: 2024-02-20, DOI: 10.1016/j.csl.2024.101627
Naorem Karline Singh, Yambem Jina Chanu, Hoomexsun Pangsatabam

In this study, we introduce a new code-switched speech database with 57 h of annotated spontaneous Manipuri–English speech. Manipuri is an official language of India and is primarily spoken in the north-eastern Indian state of Manipur. Most native speakers of Manipuri today are bilingual and frequently code-switch in everyday conversation. Recordings were gathered from YouTube by carefully assessing the amount of code-switched speech in each video. The database contains 21,339 utterances and 291,731 instances of code switching. Given the code-switching nature of the data, a proper annotation procedure is used, and the data are manually annotated using the Meitei Mayek Unicode script for Manipuri and the Roman alphabet for English. The transcription includes speaker information, non-speech information, and the corresponding annotation. The aim of this research is to construct an automatic speech recognition (ASR) system as well as to offer a thorough analysis and details of the speech corpus. We believe that our research is the first to build an ASR system for Manipuri–English code-switched speech. To evaluate performance, ASR systems based on a hybrid deep neural network and hidden Markov model (DNN–HMM), a time delay neural network (TDNN), a hybrid time delay neural network and long short-term memory (TDNN–LSTM), and three end-to-end (E2E) models, i.e., a hybrid connectionist temporal classification and attention model (CTC-Attention), Conformer, and wav2vec XLSR, are developed for the Manipuri–English language pair. In comparison to the other models, the pure TDNN produces clearly superior results.
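
To give a concrete sense of how switch points can be counted in such transcripts, the sketch below classifies tokens by Unicode script (the standard Meetei Mayek blocks for Manipuri, ASCII letters for English) and counts transitions; the tokenization and counting convention here are illustrative assumptions, not the corpus's documented procedure.

```python
def script_of(token):
    for ch in token:
        cp = ord(ch)
        # Meetei Mayek (U+ABC0-U+ABFF) and its Extensions (U+AAE0-U+AAFF)
        if 0xABC0 <= cp <= 0xABFF or 0xAAE0 <= cp <= 0xAAFF:
            return "manipuri"
        if ch.isascii() and ch.isalpha():
            return "english"
    return "other"

def count_switch_points(utterance):
    """Count adjacent word pairs whose scripts differ."""
    scripts = [s for s in map(script_of, utterance.split()) if s != "other"]
    return sum(a != b for a, b in zip(scripts, scripts[1:]))
```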

Citations: 0
Translating scientific abstracts in the bio-medical domain with structure-aware models
IF 4.3, CAS Tier 3 (Computer Science), Q1 Mathematics, Pub Date: 2024-02-09, DOI: 10.1016/j.csl.2024.101623
Sadaf Abdul Rauf, François Yvon

Machine Translation (MT) technologies have improved in many ways and generate usable outputs for a growing number of domains and language pairs. Yet, most sentence-based MT systems struggle with contextual dependencies, processing small chunks of text, typically sentences, in isolation from their textual context. This is likely to cause systematic errors or inconsistencies when processing long documents. While various attempts have been made to handle extended context in translation, the relevance of these contextual cues, especially those related to structural organization, and the extent to which they affect translation quality remain an underexplored area. In this work, we explore ways to take these structural aspects into account by integrating document structure as an extra conditioning context. Our experiments on biomedical abstracts, which are usually structured in a rigid way, suggest that this type of structural information can be useful for MT and document structure prediction. We also present in detail the impact of structural information on MT output and assess the degree to which structural information can be learned from the data.
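
One simple way to realize "document structure as an extra conditioning context" is to prefix each source sentence with a tag naming its section in the structured abstract, as sketched below; the tag set and markup format are assumptions, and the paper may integrate structure differently.

```python
# Illustrative section tags for a rigidly structured biomedical abstract.
SECTION_TAGS = ["BACKGROUND", "METHODS", "RESULTS", "CONCLUSIONS"]

def add_structure_context(sentences_with_sections):
    """sentences_with_sections: list of (section, sentence) pairs.
    Returns source lines carrying their structural role as a prefix tag."""
    return [f"<{section}> {sentence}"
            for section, sentence in sentences_with_sections
            if section in SECTION_TAGS]

tagged = add_structure_context([
    ("BACKGROUND", "Influenza remains a major cause of mortality."),
    ("METHODS", "We enrolled 120 patients in a double-blind trial."),
])
```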

Citations: 0
A novel Chinese–Tibetan mixed-language rumor detector with multi-extractor representations
IF 4.3, CAS Tier 3 (Computer Science), Q1 Mathematics, Pub Date: 2024-02-07, DOI: 10.1016/j.csl.2024.101625
Lisu Yu, Fei Li, Lixin Yu, Wei Li, Zhicheng Dong, Donghong Cai, Zhen Wang

Rumors can easily propagate through social media, posing potential threats to both individual and public health. Most existing approaches focus on single-language rumor detection, which leads to unsatisfactory performance when they are applied to mixed-language rumor detection. Meanwhile, the type of language mixing (word-level or sentence-level) poses a great challenge for mixed-language rumor detection. In this paper, focusing on a mixed Chinese–Tibetan setting, we first provide a Chinese–Tibetan mixed-language rumor detection dataset (Weibo_Ch_Ti) comprising 1,617 non-rumor tweets and 1,456 rumor tweets across the two mixing types. We then propose an effective model with multiple extractors, named "MER-CTRD" for short. This model mainly consists of three extractors. The Multi-task Extractor helps the model extract feature representations of the different mixing types adaptively. The Rich-semantic Extractor enriches the semantic feature representations of Tibetan in the Chinese–Tibetan mixed language. The Fusion-feature Extractor fuses the mean and disparity semantic features of Chinese and Tibetan to complement the feature representations of the mixed language. Finally, we conduct experiments on Weibo_Ch_Ti. The results show that the proposed model improves accuracy by about 3%–12% over the baseline models, indicating its effectiveness in the Chinese–Tibetan mixed-language rumor detection scenario.
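
The mean/disparity fusion can be illustrated as below: given feature vectors for the Chinese and Tibetan portions of a post, concatenate their element-wise mean with their absolute difference; the concatenation layout and feature dimensionality are assumptions based only on the description above.

```python
import torch

def fuse_features(h_chinese: torch.Tensor, h_tibetan: torch.Tensor):
    mean = (h_chinese + h_tibetan) / 2          # shared semantics
    disparity = (h_chinese - h_tibetan).abs()   # cross-language divergence
    return torch.cat([mean, disparity], dim=-1)

fused = fuse_features(torch.randn(4, 768), torch.randn(4, 768))  # (4, 1536)
```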

Citations: 0
Single-channel speech enhancement using colored spectrograms
IF 4.3, CAS Tier 3 (Computer Science), Q1 Mathematics, Pub Date: 2024-02-07, DOI: 10.1016/j.csl.2024.101626
Sania Gul, Muhammad Salman Khan, Muhammad Fazeel

Speech enhancement concerns the processes required to remove unwanted background sounds from the target speech to improve its quality and intelligibility. In this paper, a novel approach to single-channel speech enhancement is presented using colored spectrograms. We propose the use of a deep neural network (DNN) architecture adapted from the pix2pix generative adversarial network (GAN) and train it on colored spectrograms of speech to denoise them. After denoising, the colors of the spectrograms are translated to magnitudes of the short-time Fourier transform (STFT) using a shallow regression neural network. These estimated STFT magnitudes are then combined with the noisy phases to obtain the enhanced speech. The results show an improvement of almost 0.84 points in the perceptual evaluation of speech quality (PESQ) and 1 % in short-term objective intelligibility (STOI) over the unprocessed noisy data. The gain in quality and intelligibility over the unprocessed signal is almost equal to that achieved by the baseline methods used for comparison with the proposed model, but at a much reduced computational cost. The proposed solution offers a comparable PESQ score at almost 10 times lower computational cost than a similar baseline model, trained on grayscale spectrograms, that generated the highest PESQ score, while it incurs only a 1 % deficit in STOI at 28 times lower computational cost compared to another baseline system, based on a convolutional neural network GAN (CNN-GAN), that produces the most intelligible speech.
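
The two fixed ends of this pipeline can be sketched as follows: mapping an STFT magnitude to a colored image (the pix2pix input) and reconstructing audio from estimated magnitudes plus the noisy phase. The denoising GAN and the color-to-magnitude regression network are omitted, and the choice of the viridis colormap is an assumption, not necessarily the paper's.

```python
import numpy as np
import librosa
import matplotlib.cm as cm

def to_colored_spectrogram(y, n_fft=512):
    """Map a waveform's STFT magnitude (in dB) through a colormap,
    yielding an RGB image suitable as pix2pix input."""
    mag = np.abs(librosa.stft(y, n_fft=n_fft))
    db = librosa.amplitude_to_db(mag, ref=np.max)
    norm = (db - db.min()) / (db.max() - db.min())
    return cm.viridis(norm)[..., :3]            # (freq, time, RGB)

def reconstruct(est_magnitude, noisy, n_fft=512):
    """Combine estimated STFT magnitudes with the noisy phase."""
    phase = np.angle(librosa.stft(noisy, n_fft=n_fft))
    return librosa.istft(est_magnitude * np.exp(1j * phase))

noisy = np.random.randn(16_000).astype(np.float32)  # placeholder 1 s signal
img = to_colored_spectrogram(noisy)                 # pix2pix input image
```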

Citations: 0
A method of phonemic annotation for Chinese dialects based on a deep learning model with adaptive temporal attention and a feature disentangling structure
IF 4.3, CAS Tier 3 (Computer Science), Q1 Mathematics, Pub Date: 2024-02-05, DOI: 10.1016/j.csl.2024.101624
Bowen Jiang, Qianhui Dong, Guojin Liu

Phonemic annotation aims to annotate a speech fragment with phonemic symbols. As the phonetic features of a speech fragment vary greatly among different languages, including their dialects, annotating with phonemic symbols is a significant way to describe and write down the phonetic system of a language. It is therefore meaningful to develop an automatic and effective method for this task. In this paper, we first establish a Chinese dataset in which each datum consists of an original speech signal and the corresponding manually annotated phonemic characters. Furthermore, we propose a deep learning model that realizes automatic phonemic annotation for speech fragments spoken in diverse Chinese dialects. The overall structure of the model is a many-to-many deep bi-directional gated recurrent unit (GRU) network, and an adaptive temporal attention mechanism connects the encoder and decoder modules to adaptively prevent any loss of features. Meanwhile, a feature disentangling structure based on a generative adversarial network (GAN) is adopted to attenuate the interference with the phonemic annotation task caused by unrelated tonal features in the original speech signal and to further improve the annotation performance. Extensive experimental results verify the superiority of our model and proposed strategies on the constructed dataset.
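
A minimal backbone for such a many-to-many tagger is sketched below as a bidirectional GRU producing per-frame phoneme logits; the adaptive temporal attention and the GAN-based tone-disentangling branch are omitted, and all layer sizes are placeholders.

```python
import torch
import torch.nn as nn

class PhonemeAnnotator(nn.Module):
    def __init__(self, feat_dim=40, hidden=256, n_phonemes=60):
        super().__init__()
        self.gru = nn.GRU(feat_dim, hidden, num_layers=2,
                          bidirectional=True, batch_first=True)
        self.out = nn.Linear(2 * hidden, n_phonemes)

    def forward(self, frames):                 # (batch, time, feat_dim)
        h, _ = self.gru(frames)
        return self.out(h)                     # per-frame phoneme logits

logits = PhonemeAnnotator()(torch.randn(2, 100, 40))   # (2, 100, 60)
```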

Citations: 0
LeBenchmark 2.0: A standardized, replicable and enhanced framework for self-supervised representations of French speech
IF 4.3, CAS Tier 3 (Computer Science), Q1 Mathematics, Pub Date: 2024-02-03, DOI: 10.1016/j.csl.2024.101622
Titouan Parcollet, Ha Nguyen, Solène Evain, Marcely Zanon Boito, Adrien Pupier, Salima Mdhaffar, Hang Le, Sina Alisamir, Natalia Tomashenko, Marco Dinarelli, Shucong Zhang, Alexandre Allauzen, Maximin Coavoux, Yannick Estève, Mickael Rouvier, Jerôme Goulian, Benjamin Lecouteux, François Portet, Solange Rossato, Fabien Ringeval, Laurent Besacier

Self-supervised learning (SSL) is at the origin of unprecedented improvements in many different domains including computer vision and natural language processing. Speech processing has benefited drastically from SSL, as most current domain-related tasks are now approached with pre-trained models. This work introduces LeBenchmark 2.0, an open-source framework for assessing and building SSL-equipped French speech technologies. It includes documented, large-scale and heterogeneous corpora with up to 14,000 h of speech, ten pre-trained SSL wav2vec 2.0 models containing from 26 million to one billion learnable parameters shared with the community, and an evaluation protocol made of six downstream tasks to complement existing benchmarks. LeBenchmark 2.0 also presents unique perspectives on pre-trained SSL models for speech, with an investigation of frozen versus fine-tuned downstream models and task-agnostic versus task-specific pre-trained models, as well as a discussion of the carbon footprint of large-scale model training. Overall, the newly introduced models trained on 14,000 h of French speech outperform multilingual and previous LeBenchmark SSL models across the benchmark, but also require up to four times more energy for pre-training.
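
Using one of the shared checkpoints as a frozen feature extractor is straightforward with the transformers library, as sketched below; the model identifier shown is one of the LeBenchmark checkpoints published on the Hugging Face Hub, but the exact name should be verified against the project page for the model size you need.

```python
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

name = "LeBenchmark/wav2vec2-FR-7K-large"  # assumed Hub identifier
extractor = Wav2Vec2FeatureExtractor.from_pretrained(name)
model = Wav2Vec2Model.from_pretrained(name).eval()  # frozen usage

waveform = torch.randn(16_000)  # placeholder: 1 s of 16 kHz audio
inputs = extractor(waveform.numpy(), sampling_rate=16_000,
                   return_tensors="pt")
with torch.no_grad():
    features = model(**inputs).last_hidden_state   # (1, frames, hidden)
```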

Citations: 0
Spectral–temporal saliency masks and modulation tensorgrams for generalizable COVID-19 detection
IF 4.3, CAS Tier 3 (Computer Science), Q1 Mathematics, Pub Date: 2024-02-01, DOI: 10.1016/j.csl.2024.101620
Yi Zhu, Tiago H. Falk

Speech-based COVID-19 detection systems have gained popularity as they represent an easy-to-use and low-cost solution that is well suited for at-home long-term monitoring of patients with persistent symptoms. Recently, however, the limited generalization capability of existing deep neural network based systems to unseen datasets has been raised as a serious concern, as has their limited interpretability. In this study, we aim to develop an interpretable and generalizable speech-based COVID-19 detection system. First, we propose the use of a 3-dimensional modulation frequency tensor (called the modulation tensorgram representation, MTR) as input to a convolutional recurrent neural network for COVID-19 detection. The MTR representation is known to capture long-term dynamics of speech correlated with articulation and respiration, hence being a potential candidate for characterizing COVID-19 speech. The customized network explores both the spectral and temporal patterns of the MTR to learn the underlying COVID-19 speech pattern. Next, we design a spectro-temporal saliency mask to aggregate regions of the MTR related to COVID-19, helping further improve the generalizability and interpretability of the model. Experiments are conducted on three public datasets, and the results show the proposed solution consistently outperforming two benchmark systems in within-, across-, and unseen-dataset tests. The learned salient regions have been shown to correlate with whispered speech and vocal hoarseness, which explains the increased generalizability. Furthermore, our model relies on a small number of parameters, thus offering a promising solution for on-device remote monitoring of COVID-19 infection.
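
A simplified two-stage transform in the spirit of a modulation-frequency representation is sketched below: a first STFT yields per-band magnitude envelopes, and a second STFT along time within each acoustic-frequency band yields modulation frequencies. The window lengths and the paper's exact tensorgram construction are assumptions.

```python
import numpy as np
from scipy.signal import stft

def modulation_tensor(x, fs=16000, nperseg=400, mod_nperseg=64):
    _, _, S = stft(x, fs=fs, nperseg=nperseg)        # (freq, time)
    env = np.abs(S)                                  # magnitude envelopes
    frame_rate = fs / (nperseg // 2)                 # envelope sample rate
    # Second STFT across time within each acoustic-frequency band
    _, _, M = stft(env, fs=frame_rate, nperseg=mod_nperseg, axis=-1)
    return np.abs(M)                                 # (freq, mod_freq, time)

T = modulation_tensor(np.random.randn(32_000))       # placeholder 2 s signal
```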

Citations: 0
Effective infant cry signal analysis and reasoning using IARO based leaky Bi-LSTM model
IF 4.3, CAS Tier 3 (Computer Science), Q1 Mathematics, Pub Date: 2024-01-24, DOI: 10.1016/j.csl.2024.101621
B.M. Mala, Smita Sandeep Darandale

Recognizing particular emotions or needs from an infant's cry is a difficult problem in the field of pattern recognition, as the cry carries no verbal information. In this article, an automated model is introduced for the effective recognition of infant cries. At first, infant cry signals are collected from the Baby Chillanto (BC) dataset and the Donate a Cry Corpus (DCC) dataset. These acquired signals are converted into feature vectors by employing nine techniques, namely Zero Crossing Rate (ZCR), acoustic features, audio features, amplitude, energy, Root Mean Square (RMS), statistical moments, autocorrelation, and Mel-Frequency Cepstral Coefficients (MFCCs). The obtained feature vectors are multi-dimensional; therefore, a Simulated Annealing Algorithm (SAA) is employed to select informative feature vectors. The selected informative feature vectors are passed to the leaky Bi-directional Long Short Term Memory (Bi-LSTM) model for classifying the types of infant cries. Specifically, in the leaky Bi-LSTM model, the conventional activation functions (hyperbolic tangent (tanh) and sigmoid) are replaced with the leaky Rectified Linear Unit (leaky ReLU) activation function. This significantly mitigates the vanishing gradient problem and improves convergence during training, which is vital for signal classification tasks. Furthermore, an Improved Artificial Rabbit's Optimization (IARO) algorithm is proposed to choose optimal hyper-parameters of the leaky Bi-LSTM model, a mechanism that reduces the complexity and training time of the classification model. In the IARO algorithm, selective opposition and Lévy flight strategies are integrated with the conventional ARO algorithm to enhance the dynamics and diversity of the population, along with the model's tracking efficiency. The empirical investigation shows that the proposed IARO-based leaky Bi-LSTM model achieves 99.66 % and 95.92 % classification accuracy on the BC and DCC datasets, respectively, the best results among the compared conventional recognition models.
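
A minimal classification stage in this spirit is sketched below; note that standard LSTM layers do not expose their internal gate activations, so the leaky ReLU here is applied after the recurrent layer rather than inside it, and the IARO-tuned hyper-parameters and SAA feature selection are omitted.

```python
import torch
import torch.nn as nn

class CryClassifier(nn.Module):
    def __init__(self, feat_dim=40, hidden=128, n_classes=5):
        super().__init__()
        self.bilstm = nn.LSTM(feat_dim, hidden, bidirectional=True,
                              batch_first=True)
        self.act = nn.LeakyReLU(0.01)   # leaky ReLU on the recurrent output
        self.fc = nn.Linear(2 * hidden, n_classes)

    def forward(self, feats):                   # (batch, time, feat_dim)
        h, _ = self.bilstm(feats)
        return self.fc(self.act(h[:, -1, :]))   # logits per cry class

logits = CryClassifier()(torch.randn(4, 200, 40))   # (4, 5)
```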

Citations: 0