
Latest articles from VNU Journal of Science: Computer Science and Communication Engineering

Stabilizing Techniques for Secure On-chip Key Generation Based on RO-PUF
Pub Date : 2022-12-16 DOI: 10.25073/2588-1086/vnucsce.306
Van‐Phuc Hoang, Van-Toan Tran, Quang-Kien Trinh
Because they rely on intrinsic physical characteristics of devices, Physically Unclonable Functions (PUFs) provide high reliability while maintaining sufficient uniqueness. In practical PUF-based implementations, however, the extracted bit-string normally exhibits small, unavoidable fluctuations. Hence, PUFs can be used for chip identification but are not suitable for applications that strictly require an exactly reproducible value. In this work, we propose several techniques to stabilize the value generated by an existing Ring Oscillator (RO)-PUF circuit so that the stable, unique number can be used directly in high-profile hardware security applications. Specifically, we design a specialized on-chip key generation circuit that repeatedly samples the RO frequency values for statistical analysis and dynamically phases out the unstable bits, resulting in a unique and stable output bit-string. The experiments use actual data measured on Xilinx Artix-7 FPGA devices. The generated key is shown to be relatively stable and can readily be used in emerging security applications.
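The idea of sampling the RO frequencies repeatedly and phasing out unstable bits can be sketched in software as follows. This is only a minimal illustration, not the authors' circuit: the pairing of neighbouring oscillators, the `min_agreement` threshold, and the function names are all assumptions made for the example.

```python
from typing import List, Tuple


def stable_key_bits(
    samples: List[List[float]], min_agreement: float = 1.0
) -> Tuple[List[int], List[bool]]:
    """Derive one bit per oscillator pair and flag the bits that are stable.

    samples[r][i] is the measured frequency of oscillator i in repetition r.
    Bit j compares oscillators 2j and 2j+1 (an assumed pairing scheme).
    """
    n_pairs = len(samples[0]) // 2
    bits, stable = [], []
    for j in range(n_pairs):
        # Count how often oscillator 2j is faster than oscillator 2j+1.
        ones = sum(1 for s in samples if s[2 * j] > s[2 * j + 1])
        frac_one = ones / len(samples)
        bits.append(1 if frac_one >= 0.5 else 0)  # majority vote
        # A bit is stable if enough repetitions agree with the majority.
        stable.append(max(frac_one, 1.0 - frac_one) >= min_agreement)
    return bits, stable


def extract_key(samples: List[List[float]], min_agreement: float = 1.0) -> List[int]:
    """Keep only the bits flagged as stable."""
    bits, stable = stable_key_bits(samples, min_agreement)
    return [b for b, ok in zip(bits, stable) if ok]


# Three repetitions of six hypothetical oscillator frequencies (arbitrary units).
measurements = [
    [100, 90, 80, 81, 70, 95],
    [101, 91, 82, 80, 71, 96],
    [99, 92, 79, 80, 69, 94],
]
print(extract_key(measurements))  # the middle pair flips between runs and is dropped
```

In a real device the mask of stable positions would have to be stored as helper data so that the same positions are used at every key regeneration; that step is omitted here.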
Citations: 0
ViMRC - VLSP 2021: An empirical study of Vietnamese Machine Reading Comprehension with Unsupervised Context Selector and Adversarial Learning
Pub Date : 2022-12-16 DOI: 10.25073/2588-1086/vnucsce.344
Minh Le Nguyen
Machine Reading Comprehension (MRC) is an NLP task in which a machine must read and scan documents and extract meaning from the text, just like a human reader. One challenge for an MRC system is that it must not only understand the context to extract the answer but also determine whether the given question can be answered at all. Although pre-trained language models (PTMs) have performed well on many downstream NLP tasks, they are still limited by a fixed-length input. We propose an unsupervised context selector that shortens the given context while still retaining the answers within the related contexts. On the VLSP2021-MRC shared task dataset, we also empirically evaluate several training strategies, consisting of unanswerable-question sample selection and different adversarial training approaches, which slightly boost performance by 2.5% in EM score and 1% in F1 score.
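One way to picture an unsupervised context selector is a sentence filter that keeps the sentences most related to the question until a length budget is reached. The abstract does not say how the selector is built, so the sketch below is an assumption throughout: word overlap as the relatedness score, whitespace-level tokens as the budget unit, and the name `select_context` are all hypothetical.

```python
import re


def select_context(question: str, context: str, max_tokens: int) -> str:
    """Keep the sentences that overlap most with the question, within a budget."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", context) if s.strip()]
    q_tokens = set(re.findall(r"\w+", question.lower()))

    scored = []
    for idx, sent in enumerate(sentences):
        tokens = re.findall(r"\w+", sent.lower())
        overlap = len(q_tokens & set(tokens))
        scored.append((overlap, idx, len(tokens)))

    # Highest overlap first; ties are broken by original position.
    scored.sort(key=lambda t: (-t[0], t[1]))

    chosen, used = [], 0
    for _, idx, n_tok in scored:
        if used + n_tok <= max_tokens:
            chosen.append(idx)
            used += n_tok

    # Restore document order so the shortened context still reads naturally.
    return " ".join(sentences[i] for i in sorted(chosen))


context = (
    "Hanoi is the capital of Vietnam. "
    "The weather was nice yesterday. "
    "Vietnam has many rivers."
)
print(select_context("What is the capital of Vietnam?", context, max_tokens=10))
```

A selector of this kind needs no labels, which is what makes it unsupervised; a stronger variant might score sentences with embeddings instead of raw word overlap.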
Citations: 0
VLSP 2021 - VieCap4H Challenge: Automatic Image Caption Generation for Healthcare Domain in Vietnamese
Pub Date : 2022-12-16 DOI: 10.25073/2588-1086/vnucsce.364
P. Phan
Machine reading comprehension (MRC) is a challenging Natural Language Processing (NLP) research field with wide real-world applications. The great progress of this field in recent years is mainly due to the emergence of a few large datasets for machine reading comprehension tasks and of deep learning. For the Vietnamese language, several datasets exist, such as UIT-ViQuAD [1] and UIT-ViNewsQA [2] and, most recently, UIT-ViQuAD 2.0 [3], the dataset of the competitive VLSP 2021-MRC Shared Task 1. MRC systems must not only answer questions when necessary but also tactfully abstain from answering when no answer is available according to the given passage. In this paper, we propose two types of joint models for answerability prediction and pure-MRC prediction, with and without a dependency mechanism that learns the correlation between the start position and the end position in pure-MRC output prediction. In addition, we use ensemble models and a verification strategy that votes for the best answer among the top K answers of different models. Evaluation on the benchmark VLSP 2021-MRC Shared Task challenge dataset, UIT-ViQuAD 2.0 [3], shows that our approach is significantly better than the baseline.
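Voting over the top K answers of several models can be illustrated with a small score-summing scheme. The abstract does not specify the voting rule, so the summation of confidence scores, the use of an empty string to mean "abstain", and the name `vote_best_answer` are assumptions made for this sketch.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def vote_best_answer(top_k_per_model: List[List[Tuple[str, float]]]) -> str:
    """Return the answer with the highest summed score across all models.

    Each model contributes a list of (answer, score) pairs. An empty answer
    string stands for "no answer", so abstaining can win the vote too.
    """
    totals: Dict[str, float] = defaultdict(float)
    for candidates in top_k_per_model:
        for answer, score in candidates:
            totals[answer.strip()] += score
    if not totals:
        return ""
    # Highest total first; ties are broken alphabetically for determinism.
    return min(totals, key=lambda a: (-totals[a], a))


predictions = [
    [("Hà Nội", 0.6), ("", 0.3)],     # hypothetical model A
    [("Hà Nội", 0.5), ("Huế", 0.4)],  # hypothetical model B
    [("", 0.7), ("Hà Nội", 0.2)],     # hypothetical model C
]
print(vote_best_answer(predictions))
```

Treating abstention as an ordinary candidate keeps answerability and span prediction in one vote; a system could equally well decide answerability first and vote only over spans.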
Citations: 0
N/A Modern Approaches in Natural Language Processing
Pub Date : 2022-12-16 DOI: 10.25073/2588-1086/vnucsce.302
T. Quan
Natural Language Processing (NLP) is one of the major branches of the emerging field of Artificial Intelligence (AI). Classical approaches in this area were mostly based on parsing and information extraction techniques, which struggled with the very large textual datasets found in practical applications. This issue can potentially be addressed by recent advances in Deep Learning (DL) techniques, which naturally assume very large training datasets. In fact, NLP research achieved a remarkable milestone with the introduction of word embedding techniques, which allow a document to be represented meaningfully as a matrix on which major DL models such as CNNs or RNNs can be deployed effectively to accomplish common NLP tasks. NLP scholars have since kept developing models specific to their areas, notably attention-enhanced BiLSTM, Transformer, and BERT. The birth of those models has introduced a new wave of modern approaches that frequently report breakthrough results and open many novel research directions. The aim of this paper is to give readers a roadmap of those modern approaches in NLP, including their ideas, theories, and applications. We hope this offers a solid background for further research in this area.
Citations: 0
gPartition: An Efficient Alignment Partitioning Program for Genome Datasets
Pub Date : 2022-12-16 DOI: 10.25073/2588-1086/vnucsce.353
Le Kim Thu, Do Duc Dong, Bui Ngoc Thang, Hoang Thi Diep, Nguyen Phuong Thao, L. Vinh
Phylogenomics, or evolutionary inference based on genome alignment, is becoming prominent thanks to next-generation sequencing technologies. In model-based phylogenomics, the partition scheme has a significant impact on inference performance, in terms of both log-likelihood and computation time. Therefore, finding an optimal partition scheme, or partitioning, is critical in a phylogenomic inference pipeline. To accomplish this, one needs to divide the alignment sites into disjoint partitions so that sites with similar evolutionary models fall in the same partition. Computational partitioning is a recent approach of increasing interest because it can model the site-rate heterogeneity within a single gene. State-of-the-art computational partitioning methods, such as mPartition or RatePartition, are, however, ineffective on long alignments of millions of sites. In this paper, we introduce gPartition, a new computational partitioning method that leverages both the site rate and the best-fit substitution model. We conducted experiments on recently published alignments to compare gPartition with mPartition and RatePartition. gPartition was orders of magnitude faster than the other methods. The AIC scores demonstrated that gPartition produced partition schemes that were better than or comparable to those of mPartition. gPartition outperformed RatePartition on all examined alignments. We implemented our proposed method in the gPartition program to help researchers partition genome alignments with millions of sites more efficiently.
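The two ingredients named above, grouping sites into disjoint partitions and comparing schemes by AIC, can be sketched as follows. This is not the gPartition algorithm: the grouping key (a rate bin combined with a per-site model label), the bin edges, and all function names are assumptions for illustration. The AIC formula itself is the standard one, AIC = 2k − 2 ln L, where k is the number of free parameters and ln L is the log-likelihood.

```python
from bisect import bisect_right
from collections import defaultdict
from typing import Dict, List, Tuple


def partition_sites(
    site_rates: List[float], site_models: List[str], rate_edges: List[float]
) -> Dict[Tuple[int, str], List[int]]:
    """Group alignment sites by (rate bin, best-fit model label).

    Every site lands in exactly one group, so the groups are disjoint.
    """
    groups: Dict[Tuple[int, str], List[int]] = defaultdict(list)
    for site, (rate, model) in enumerate(zip(site_rates, site_models)):
        rate_bin = bisect_right(rate_edges, rate)  # index of the rate interval
        groups[(rate_bin, model)].append(site)
    return dict(groups)


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion; lower values indicate a better trade-off."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical per-site rates and model labels for a five-site alignment.
scheme = partition_sites(
    site_rates=[0.1, 0.2, 1.5, 1.7, 0.15],
    site_models=["JC", "JC", "GTR", "GTR", "GTR"],
    rate_edges=[1.0],
)
print(scheme)

# Hypothetical log-likelihoods and parameter counts for two candidate schemes.
print(aic(-1000.0, 10), aic(-990.0, 25))
```

In the second pair of numbers the richer scheme fits slightly better but pays for 15 extra parameters, so the simpler scheme has the lower AIC.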
Citations: 0
NER - VLSP 2021: Two Stage Model for Nested Named Entity Recognition
Pub Date : 2022-06-30 DOI: 10.25073/2588-1086/vnucsce.368
Quan Chu Quoc, Viola Van
Named entity recognition (NER) is a widely studied task in natural language processing. Recently, a growing number of studies have focused on nested NER. Span-based methods treat named entity recognition as a span classification task and can therefore deal with nested entities naturally. However, they suffer from a class imbalance problem because non-entity spans account for the majority of all spans. To address this issue, we propose a two-stage model for nested NER. We use an entity proposal module to filter out easy non-entity spans for efficient training. In addition, we combine all variants of the model to improve the overall accuracy of our system. Our method achieved 1st place in the Vietnamese NER shared task at the 8th International Workshop on Vietnamese Language and Speech Processing (VLSP), with an F1-score of 62.71 on the private test dataset. For research purposes, our source code is available at https://github.com/quancq/VLSP2021_NER
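The class imbalance in span-based NER, and the effect of a proposal stage, can be seen in a toy example. The sketch below is not the code from the linked repository: the span scores stand in for a trained proposal module, and `enumerate_spans`, `propose_spans`, and the threshold are hypothetical names and values.

```python
from typing import Callable, List, Tuple

Span = Tuple[int, int]  # (start, end) token offsets, end exclusive


def enumerate_spans(n_tokens: int, max_len: int) -> List[Span]:
    """List every contiguous span of at most max_len tokens."""
    return [
        (i, j)
        for i in range(n_tokens)
        for j in range(i + 1, min(i + max_len, n_tokens) + 1)
    ]


def propose_spans(
    spans: List[Span], score_fn: Callable[[Span], float], threshold: float
) -> List[Span]:
    """Stage one: drop spans the proposal scorer considers easy non-entities."""
    return [s for s in spans if score_fn(s) >= threshold]


tokens = ["Đại", "học", "Quốc", "gia", "Hà", "Nội"]
spans = enumerate_spans(len(tokens), max_len=6)

# Two gold entities, one nested inside the other.
gold = {(0, 6): "ORGANIZATION", (4, 6): "LOCATION"}
print(len(spans), len(gold))  # far more candidate spans than entities

# Toy scores standing in for a trained proposal module.
toy_scores = {(0, 6): 0.9, (4, 6): 0.8, (0, 2): 0.6}
kept = propose_spans(spans, lambda s: toy_scores.get(s, 0.1), threshold=0.5)
print(kept)  # only these spans would reach the stage-two classifier
```

Even in this six-token phrase only 2 of 21 candidate spans are entities, and the ratio worsens quickly with sentence length, which is why filtering before classification helps training.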
Citations: 1
SV - VLSP 2021: Combine Attentive Statistical Pooling-based Xvector and Pretrained ECAPA-TDNN for Vietnamese Text-Independent Speaker Verification
Pub Date : 2022-06-30 DOI: 10.25073/2588-1086/vnucsce.320
T. Thang, Huynh Thi Thanh Binh
Recently, Xvectors and ECAPA-TDNN have been considered state-of-the-art models for designing speaker verification systems. This paper proposes a novel approach that combines an attentive statistics pooling-based Xvector with a pre-trained ECAPA-TDNN for Vietnamese speaker verification. Experiments are conducted on various recent Vietnamese speech datasets. The results show that our proposed combination outperformed all constituent models, with a 4% to 37% relative EER improvement, and ranked second in Task 2 of the 2021 VLSP Speaker Verification competition.
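To make the reported metric concrete, the sketch below computes an equal error rate (EER) by a simple threshold sweep, fuses two systems by weighted score averaging, and computes a relative EER improvement. How the paper actually combines the two models is not stated in the abstract, so the averaging in `fuse` is an assumption, as are the function names and the toy scores.

```python
from typing import List


def equal_error_rate(target_scores: List[float], impostor_scores: List[float]) -> float:
    """Approximate the EER by sweeping thresholds over the observed scores."""
    best_gap, eer = float("inf"), 1.0
    for t in sorted(set(target_scores) | set(impostor_scores)):
        # False acceptance: impostor trials scored at or above the threshold.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        # False rejection: target trials scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer


def fuse(scores_a: List[float], scores_b: List[float], weight: float = 0.5) -> List[float]:
    """Weighted average of two systems' scores for the same trials."""
    return [weight * a + (1 - weight) * b for a, b in zip(scores_a, scores_b)]


def relative_improvement(baseline_eer: float, new_eer: float) -> float:
    """Fraction by which the EER drops relative to the baseline."""
    return (baseline_eer - new_eer) / baseline_eer


# Toy trial scores for one hypothetical system.
targets = [0.9, 0.8, 0.7, 0.3]
impostors = [0.6, 0.4, 0.2, 0.1]
print(equal_error_rate(targets, impostors))
print(relative_improvement(0.25, 0.20))
```

A relative improvement of 0.2 here means the EER fell by one fifth of its baseline value, which is the sense in which figures such as "4% to 37% relative EER improvement" are usually read.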
Citations: 1
TTS - VLSP 2021: The Thunder Text-To-Speech System
Pub Date : 2022-06-30 DOI: 10.25073/2588-1086/vnucsce.342
N. Ngoc Anh, Nguyen Tien Thanh, Le Dang Linh
This paper describes the speech synthesis system we entered in the Vietnamese Text-To-Speech track of the 2021 VLSP evaluation campaign. The goal of this challenge is to build a synthetic voice from a provided corpus of spontaneous Vietnamese speech. We present our implementation of the FastSpeech2 model on spontaneous speech, together with a special strategy for handling spontaneous datasets in the TTS system. We generate mel-spectrograms from the given texts and then synthesize speech from them using a separately trained vocoder. In the evaluation, our team achieved a mean opinion score (MOS) of 3.943 on the in-domain test, 3.3 on the out-of-domain test, and 85.00% SUS, which indicates the effectiveness of the proposed system.
Citations: 1
VLSP 2021 - NER Challenge: Named Entity Recognition for Vietnamese
Pub Date : 2022-06-30 DOI: 10.25073/2588-1086/vnucsce.362
Ha My Linh, Do Duy Dao, Nguyen Thi Minh Huyen, Ngo The Quyen, Doan Xuan Dung
Named entities (NE) are phrases that contain the names of persons, organizations, locations, times, quantities, email addresses, phone numbers, etc., in a document. Named Entity Recognition (NER) is a fundamental task that is useful in many applications, especially in information extraction and question answering. Shared tasks on NER provide several reference datasets in many languages. In the 2016 and 2018 editions of the VLSP workshop series, reference NER datasets were published with only three main entity categories: person, organization, and location. At the VLSP 2021 workshop, another NER challenge was organized to deal with an extended set of 14 main entity types and 26 sub-entity types. This paper describes the published datasets and the systems evaluated in the framework of the VLSP 2021 evaluation campaign.
Citations: 1
TTS - VLSP 2021: The NAVI’s Text-To-Speech System for Vietnamese
Pub Date : 2022-06-30 DOI: 10.25073/2588-1086/vnucsce.347
Nguyen Le Minh, An Quoc Do, Viet Q. Vu, Huyen Thuc Khanh Vo
The Association for Vietnamese Language and Speech Processing (VLSP) has organized a series of workshops intended to bring together researchers and professionals working in NLP and to attempt a synthesis of research on the Vietnamese language. One of the shared tasks held at the eighth workshop is TTS [14], using a dataset that consists only of spontaneous audio. This poses a challenge for current TTS models, since they perform well only when constructing reading-style speech (e.g., audiobooks). Moreover, the quality of the audio provided in the dataset has a huge impact on the performance of the model. Specifically, samples with noisy backgrounds or with multiple voices speaking at the same time degrade the performance of our model. In this paper, we describe our approach to this problem: we first preprocess the training data, then use it to train a FastSpeech2 [10] acoustic model with some replacements in the external aligner model, and finally use a HiFiGAN [4] vocoder to construct the waveform. According to the official evaluation of the VLSP 2021 competition in the TTS task, our approach achieves 3.729 in-domain MOS, 3.557 out-of-domain MOS, and a 79.70% SUS score. Audio samples are available at https://navi-tts.github.io/.
Citations: 0