2021 IEEE 2nd International Conference on Pattern Recognition and Machine Learning (PRML): Latest Papers

SFTRLS-Based Speech Enhancement Method Using CNN to Determine the Noise Type and the Optimal Forgetting Factor
Pub Date : 2021-07-16 DOI: 10.1109/PRML52754.2021.9520741
De-You Tang, Guoqiang Chen
This paper presents SFTRLS-CNN, a speech enhancement method that combines convolutional neural networks (CNNs) with the SFTRLS algorithm. Two tiers of CNNs customize the parameters of SFTRLS: the first CNN identifies the noise type, and the second selects the best forgetting factor. Experimental results show that the noise recognition rate of SFTRLS-CNN reaches up to 99.97%, outperforming the k-nearest neighbor (KNN) and support vector machine (SVM) classifiers. The accuracy of matching the best forgetting factor for SFTRLS reaches up to 99.40%. On average, the perceptual evaluation of speech quality (PESQ) improves by 23% and the log-spectral distortion (LSD) decreases by 4%. SFTRLS-CNN also significantly improves the SNR of all speech samples.
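To make the role of the forgetting factor concrete, the sketch below implements a conventional exponentially weighted RLS filter in plain Python. It is not the stabilized fast transversal variant (SFTRLS) used in the paper, and the function name `rls_filter` and the defaults `order`, `lam`, and `delta` are illustrative assumptions. It only shows where the forgetting factor enters the recursion that a second-stage classifier would tune.

```python
def rls_filter(inputs, desired, order=2, lam=0.99, delta=100.0):
    """Conventional RLS with forgetting factor `lam` (assumed sketch, not SFTRLS).

    Returns the final tap weights and the a priori error at each step.
    """
    w = [0.0] * order                               # tap weights
    # Inverse correlation matrix estimate, initialised to delta * I.
    P = [[delta if i == j else 0.0 for j in range(order)] for i in range(order)]
    x = [0.0] * order                               # tapped delay line
    errors = []
    for u, d in zip(inputs, desired):
        x = [u] + x[:-1]                            # shift in the newest sample
        Px = [sum(P[i][j] * x[j] for j in range(order)) for i in range(order)]
        denom = lam + sum(x[i] * Px[i] for i in range(order))
        k = [v / denom for v in Px]                 # gain vector
        e = d - sum(w[i] * x[i] for i in range(order))   # a priori error
        w = [w[i] + k[i] * e for i in range(order)]
        # P <- (P - k x^T P) / lam; x^T P equals (P x)^T because P is symmetric.
        P = [[(P[i][j] - k[i] * Px[j]) / lam for j in range(order)]
             for i in range(order)]
        errors.append(e)
    return w, errors
```

A value of `lam` close to 1 gives the filter a long memory and smooth estimates, while a smaller value tracks changing noise faster at the cost of noisier weights, which is why matching it to the noise type can matter.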
Citations: 0
Probability Preserving Discriminative Nonnegative Matrix Factorization
Pub Date : 2021-07-16 DOI: 10.1109/PRML52754.2021.9520691
Liuyin Lin, Xin Shu, Jing Song, C. Yu
Non-negative matrix factorization (NMF) has received increasing attention as a practical decomposition approach in computer vision and pattern recognition. NMF allows only additive combinations, which leads to a parts-based representation. However, NMF and its variants often ignore the underlying local structure information. In this paper, we propose a novel objective that provides probabilistic semantics of the intrinsic local topology via a probability preserving regularizer, together with a joint multiplicative update routine. Additionally, by coupling a class indicator matrix with the loss function, generative and discriminative components with the property of local probability preservation can be acquired simultaneously, which benefits classification. Experimental results on both clustering and classification tasks demonstrate that the proposed approach is clearly competitive with several other state-of-the-art algorithms.
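For readers unfamiliar with multiplicative update routines, here is a minimal sketch of the standard Lee-Seung updates for plain NMF under the Frobenius loss. It omits the probability preserving regularizer and the class indicator term proposed in the paper; the helper names (`matmul`, `transpose`, `frobenius_error`, `nmf`) and the defaults are assumptions made for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(r) for r in zip(*m)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iters=200, seed=0, eps=1e-9):
    """Plain NMF, V ~ W H, with Lee-Seung multiplicative updates."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(rank)]
             for i in range(n)]
    return w, h
```

Because every update multiplies a non-negative entry by a non-negative ratio, the factors stay non-negative without an explicit projection step, which is the property behind the additive, parts-based representation mentioned in the abstract.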
Citations: 0
Error Correction Based on Transformer LM in Uyghur Speech Recognition
Pub Date : 2021-07-16 DOI: 10.1109/PRML52754.2021.9520740
Yan Zhang, Mijit Ablimit, A. Hamdulla
For Uyghur, Kazakh, and other minority languages or dialects, it is difficult to collect a large-scale labeled corpus. In low-resource settings, reducing the recognition granularity by using phonemes or characters as the recognition unit yields a good character recognition rate, but the information between words is not fully utilized, so the high word error rate (WER) seen in practice remains unsolved. To correct wrongly recognized words, this paper proposes using Levenshtein distance and a Transformer language model with words as modeling units as secondary scoring criteria to correct the end-to-end recognition results. In Uyghur end-to-end recognition with a Conformer-CTC acoustic model, the WER decreases by 5.7%; with a BLSTM-CTC acoustic model, it decreases by 9.1%.
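The two scoring ingredients named in the abstract can be sketched as follows. The edit distance is the standard Levenshtein dynamic program; `correct_word`, its `max_distance` threshold, and the `lm_score` callable are hypothetical stand-ins (the paper scores candidates with a word-level Transformer language model, which is not reproduced here). The same distance applied to word lists gives the word error rate.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (strings or lists of words)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution or match
        prev = cur
    return prev[-1]


def correct_word(word, vocabulary, lm_score, max_distance=2):
    """Pick the in-vocabulary candidate near `word` that the LM prefers.

    `lm_score` is a hypothetical callable returning a higher value for
    more plausible words; ties are broken by smaller edit distance.
    """
    candidates = [w for w in vocabulary if levenshtein(word, w) <= max_distance]
    if not candidates:
        return word
    return max(candidates, key=lambda w: (lm_score(w), -levenshtein(word, w)))


def word_error_rate(reference, hypothesis):
    """WER: word-level edit distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    return levenshtein(ref, hyp) / len(ref)
```

For example, with a toy score table `{"receive": 0.9, "recipe": 0.3, "relieve": 0.2}`, calling `correct_word("recieve", list(table), table.get)` selects `"receive"` even though `"relieve"` is closer in edit distance, which illustrates why the language model score is used alongside the distance.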
Citations: 1
Moroccan Dialect “Darija” Automatic Speech Recognition: A Survey
Pub Date : 2021-07-16 DOI: 10.1109/PRML52754.2021.9520690
Maria Labied, A. Belangour
Human-machine interaction is growing swiftly, and Automatic Speech Recognition is attracting immense interest because it makes daily routines much easier. This is illustrated by the various applications of speech recognition in daily life, such as voice dictation, interactive voice response systems, device control, and telephone applications. Alongside Automatic Speech Recognition, natural language processing has improved significantly in both its technologies and its approaches. Great results have been achieved in those fields, especially for international languages such as English, Spanish, French, and Arabic, whereas few results have been reached for dialects such as the Moroccan dialect “Darija”. The growing use of Moroccan Darija on social media, in videos, and in chat opens new research directions for Moroccan Darija speech recognition. The main goal of this paper is to give a literature review of Moroccan Darija Automatic Speech Recognition by presenting the dialect-specific constraints, the different works conducted in the field, and the progress made in recent years.
Citations: 1
Lifestyle Analysis via a Corpus of Disease-Fighting Weblogs
Pub Date : 2021-07-16 DOI: 10.1109/PRML52754.2021.9520697
Kazuyuki Matsumoto, Mopuaa Ryu, Minoru Yoshida, K. Kita
In recent years, the population of pre-diabetics in Japan has been increasing year by year. Type 2 diabetes is a lifestyle-related disease that can be prevented to a certain extent by correcting lifestyle habits. However, by the time we realize that there is a problem with our lifestyle, it may be too late. Therefore, early detection of risk factors for lifestyle-related diseases is important. In this study, we collected the blogs of people fighting lifestyle-related diseases, set up multiple keyword categories considered to be related to risk factors, and constructed a corpus of disease-fighting blogs with labels for each category. Evaluation experiments show that our proposed method can be applied to a wide range of topics and that it categorizes keywords and sentences with higher accuracy than the simple method.
Citations: 1
Concept Word Extraction for Bilingual Ontology Construction in Unstructured Text Environment*
Pub Date : 2021-07-16 DOI: 10.1109/PRML52754.2021.9520708
Turdi Tohti, Le Chang, A. Hamdulla, Hankiz Yilahun
To address the unsatisfactory efficiency of concept word extraction from unstructured text for domain ontology construction, this work first uses a combined statistic to judge the correctness of the concept word boundaries determined by word segmentation and corrects wrong segmentation positions, thereby strengthening the structural integrity of the segmented candidate concept words. On this basis, improved methods and various resource libraries are used to adjust the weights of concept words, mainly to strengthen the correlation between a concept word's weight and its domain attributes. Experiments and comparisons on an English-Chinese bilingual corpus show that both the method of strengthening the structural integrity of concept words and the method of dynamically adjusting their weights bring some improvement in the efficiency of concept word extraction.
Citations: 1
Research on Tibetan-Chinese Machine Translation Based on Multi-Strategy Processing
Pub Date : 2021-07-16 DOI: 10.1109/PRML52754.2021.9520733
Saihu Liu, Jie Zhu, Zhensong Li, Zhixiang Luo
This article takes the low-resource nature of Tibetan-Chinese machine translation as its research object, acquires training data through a variety of strategies, and explores the problems of domain adaptability in Tibetan-Chinese materials and of multi-granularity segmentation. We study a Tibetan-Chinese machine translation method based on the Transformer attention mechanism, apply different segmentation granularities at the encoder and decoder ends, and evaluate multi-granularity segmentation and the fusion of corpora from different fields and of different types. Corpus fusion gives the best experimental result, with the highest BLEU score of 44.9 points.
Citations: 0
Deeply Fine-Tune a Convolutional Neural Network in Remote Sensing Image Classification: Easter Africa Countries (EAC)
Pub Date : 2021-07-16 DOI: 10.1109/PRML52754.2021.9520703
M. J. Bosco, Wang Guoyin
Remote sensing data are accessible and easy to obtain for different areas without time-consuming collection. Traditional image recognition approaches were limited in achieving better classification. Convolutional neural networks (CNNs) were introduced to improve remote sensing image classification accuracy by eliminating intra-class and inter-class similarity. Training a CNN from scratch requires a large annotated dataset, which is rare in the remote sensing area. Transfer learning of CNN weights from another large non-remote-sensing dataset can help overcome this limitation in typical remote sensing (RS) image applications. Transfer learning consists of fine-tuning CNN layers to fit the new dataset. In this paper, all experiments were done on a nine-category dataset collected in East African Community (EAC) countries, using three state-of-the-art architectures to study the effect of fine-tuning and of pre-trained CNN weights. Results indicate that fine-tuning the entire network is not always the most effective approach; by comparison, using VGG16-DensNet pre-trained weights with RF as the machine learning classifier can improve results up to 97.60. Alternatively, fine-tuning the top blocks can save computational power and produce a more robust classifier.
Citations: 0
Facial Beauty Study Based on 3D Geometric Features
Pub Date : 2021-07-16 DOI: 10.1109/PRML52754.2021.9520726
Wenming Han, Fangmei Chen, Fuming Sun
Facial beauty is related to different kinds of features, such as geometry, texture, and expression. Geometric features are the most investigated ones, because 1) they have clear and interpretable definitions; 2) they do not change with face make-up, illumination, or resolution; and 3) they can be used to guide aesthetic plastic surgery. Due to the high cost of 3D scanning, most existing works focus on 2D geometric features extracted from frontal face images. However, this neglects profile information, which also plays an important role in facial beauty judgment. In this paper, we reconstruct 3D faces from 2D images using a recent monocular 3D face reconstruction method. Then 22 anatomical landmarks are defined on the 3D face, from which a total of 51 geometric features are extracted. Finally, we design experiments to evaluate the effectiveness of these features. The results show that ratio features are the most influential ones and that the lips also affect facial beauty. A comparison between Asian and Caucasian faces shows significant differences between ethnic groups. For Asian faces, an angle feature related to face width and nose height has the highest ranking. For the Caucasian group, the top-ranked features are length and ratio features, and the lip region plays an important role.
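The three feature families named in the abstract (length, ratio, and angle) reduce to simple vector arithmetic on 3D landmark coordinates. The sketch below shows one way to compute them; the function names and the landmark coordinates in the usage note are hypothetical and are not taken from the paper's 22 landmarks or 51 features.

```python
import math


def distance(p, q):
    """Euclidean length between two 3D landmarks."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def ratio(p, q, r, s):
    """Ratio of the length p-q to the length r-s (scale invariant)."""
    return distance(p, q) / distance(r, s)


def angle_deg(a, vertex, b):
    """Angle in degrees at `vertex` between the rays towards a and b."""
    u = [x - v for x, v in zip(a, vertex)]
    w = [x - v for x, v in zip(b, vertex)]
    dot = sum(x * y for x, y in zip(u, w))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in w))
    cos_theta = max(-1.0, min(1.0, dot / norm))   # guard against rounding
    return math.degrees(math.acos(cos_theta))
```

As a usage illustration with made-up coordinates, `angle_deg(left_cheek, nose_tip, right_cheek)` would give an angle that depends on both face width and nose height, the kind of profile-aware quantity that a frontal 2D image cannot capture.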
Citations: 0
Rumor Detection Based on Improved Transformer
Pub Date : 2021-07-16 DOI: 10.1109/PRML52754.2021.9520704
Honghao Zheng, Hongtao Yu, Yinuo Hao, Yiteng Wu, Shaomei Li
In the field of rumor detection, existing Transformer-based methods ignore positional information and fail to effectively use the latent information in the text. Therefore, we propose a social media rumor detection method based on an improved Transformer, which extends the standard Transformer with two novel techniques. First, learnable relative positional encoding is used to give the Transformer direction- and distance-awareness. Second, absolute positional encoding is used, through which each word at a different absolute position is mapped to its corresponding representation space. Experimental results show that, compared with the current best benchmark method, the accuracy of this method on the Twitter15, Twitter16, and Weibo datasets increases by 0.9%, 0.6%, and 1.4%, respectively. The improved Transformer is effective and can significantly improve social media rumor detection.
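The abstract does not spell out how the relative positional encoding is built, so the following is only a sketch of one common scheme: signed offsets between positions are clipped to a maximum distance and used to index a table of learnable values. The sign carries direction and the magnitude carries distance. The function names and the toy `table` are assumptions, and the table would be trained parameters in a real model.

```python
def relative_position_index(seq_len, max_distance):
    """Index matrix for clipped signed offsets j - i.

    Entry [i][j] lies in 0 .. 2 * max_distance; values below
    max_distance mean j is to the left of i, values above mean right.
    """
    return [[max(-max_distance, min(max_distance, j - i)) + max_distance
             for j in range(seq_len)]
            for i in range(seq_len)]


def relative_bias(seq_len, table, max_distance):
    """Look up one value per (i, j) pair from a table of size 2 * max_distance + 1.

    In a Transformer these values (or vectors) would be added to the
    attention logits; here they are plain floats for illustration.
    """
    index = relative_position_index(seq_len, max_distance)
    return [[table[k] for k in row] for row in index]
```

Because the index depends on `j - i` rather than on `abs(j - i)`, a word two positions to the left and a word two positions to the right receive different table entries, which is what makes such an encoding direction-aware.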
在谣言检测领域,现有的基于transformer的方法忽略了位置信息,不能有效地利用文本的潜在信息。因此,我们提出了一种基于改进Transformer的社交媒体谣言检测方法,该方法通过两种新颖的技术改进了标准Transformer。首先,采用可学习的相对位置编码,赋予变形器方向感知和距离感知能力。其次,采用绝对位置编码,将具有不同绝对位置的单词映射到对应的表示空间。实验结果表明,与目前最好的基准方法相比,该方法在Twitter15、Twitter16和Weibo三个数据集上的准确率分别提高了0.9%、0.6%和1.4%。改进后的Transformer是有效的,可以显著提高社交媒体谣言检测的效果。
Citations: 0