Pub Date: 2022-12-10 DOI: 10.1109/ICSAI57119.2022.10005402
Yin Liwei, Z. Heng, Wang Zhonghua
An auto-testing system for S-band multi-channel T/R modules and its implementation are introduced. The system is composed of a test cabinet, a 32-channel waveform generation fixture, a 32-channel digital receiver fixture, two 32-channel DAM-T component test fixtures, two 32-channel DAM-R component test fixtures, and a set of customized auto-testing software. The auto-testing function is realized by a one-click operation on the computer. Finally, the physical system is shown.
Title: An Auto-testing System for S-Band Multi-channel T/R Module. Venue: 2022 8th International Conference on Systems and Informatics (ICSAI).
Pub Date: 2022-12-10 DOI: 10.1109/ICSAI57119.2022.10005423
Zhiyong Xiao, Kun Liu, Xinxin Wang
Fuzzy differential equations (FDEs) with two-point boundary values (TPBV) in the quotient space of fuzzy numbers (QSFN) are studied. The equivalence between the FDE and a fuzzy integral equation (FIE) is established using Green’s function and the integration-by-parts formula. The existence of a unique solution of the two-point boundary value problem (TPBVP) is obtained by the contraction mapping principle.
Title: Two-Point Boundary Value Problems in the Quotient Space of Fuzzy Number. Venue: 2022 8th International Conference on Systems and Informatics (ICSAI).
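The Green's-function and contraction-mapping argument in the abstract above can be illustrated on a crisp (real-valued) analogue. The sketch below is not the fuzzy setting of the paper: it assumes the classical problem -u''(t) = f(t, u(t)) with u(0) = u(1) = 0, rewrites it as the integral equation u(t) = ∫ G(t, s) f(s, u(s)) ds, and iterates that map until it reaches a fixed point. The function names, the grid size, and the choice of boundary value problem are all assumptions made for illustration.

```python
import math


def green(t, s):
    """Green's function of -u'' = g on [0, 1] with u(0) = u(1) = 0."""
    return s * (1.0 - t) if s <= t else t * (1.0 - s)


def picard_bvp(f, n=100, tol=1e-12, max_iter=200):
    """Fixed-point iteration u <- T(u), where
    (T u)(t) = integral over [0, 1] of G(t, s) * f(s, u(s)) ds.

    The integral is approximated with the trapezoidal rule on a uniform grid.
    The iteration is a contraction when f is Lipschitz in u with constant
    below 8, because the integral of G(t, s) over s is at most 1/8.
    """
    h = 1.0 / n
    grid = [i * h for i in range(n + 1)]
    weights = [h] * (n + 1)
    weights[0] = weights[-1] = h / 2.0
    u = [0.0] * (n + 1)  # initial guess: the zero function
    for _ in range(max_iter):
        fu = [f(s, us) for s, us in zip(grid, u)]
        new_u = [
            sum(w * green(t, s) * fs for w, s, fs in zip(weights, grid, fu))
            for t in grid
        ]
        change = max(abs(a - b) for a, b in zip(new_u, u))
        u = new_u
        if change < tol:
            break
    return grid, u


# Linear check: -u'' = 1 has the exact solution u(t) = t (1 - t) / 2.
grid_lin, u_lin = picard_bvp(lambda t, u: 1.0)

# Nonlinear example with Lipschitz constant 0.5 in u (a contraction).
grid_nl, u_nl = picard_bvp(lambda t, u: 1.0 + 0.5 * math.sin(u))
```

Uniqueness in this crisp case follows from the Banach fixed-point theorem: the map T shrinks distances between functions by a factor of at most L/8, where L is the Lipschitz constant of f in u.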
Pub Date: 2022-12-10 DOI: 10.1109/ICSAI57119.2022.10005536
Jing Du, Zhanquan Wang, Mengfei Ye
Discovering spatio-temporal co-occurrence patterns is a significant problem in many fields. Previous algorithms looked only for positive patterns when mining spatial co-occurrence patterns, so patterns with strong negative associations were ignored. This paper proposes a novel algorithm for mining both positive and negative co-occurrence patterns. We introduce the notions of positive and negative co-occurrence patterns and mine both kinds using an effective pruning strategy. We analyze the completeness and correctness of the algorithm and conduct experiments on both real and synthetic data sets to validate the effectiveness and efficiency of the proposed method.
Title: Research on Algorithms of Positive and Negative Co-occurrence in Spatio-temporal Datasets. Venue: 2022 8th International Conference on Systems and Informatics (ICSAI).
Pub Date: 2022-12-10 DOI: 10.1109/ICSAI57119.2022.10005350
Xiaohu Wu, Xinyong Wang, W. Lin, R. Wang, X. Fan
The boom in renewable distributed generation challenges power System Operators (SOs) with operational and market problems such as flexibility shortages, electricity price volatility, market liquidity risk, and even operation failures. Many SOs have released public market data, which lets market players investigate procurement opportunities for advanced energy deployment, policies, and regulatory configurations. As a further step that an SO can take to promote market efficiency, we propose a Perturbation Analysis-based market player behavior modeling and simulation approach to study the impact of strategic behaviors on electricity market efficiency. A supervisory-controller-style electricity market simulation architecture is designed to observe, predict, and control the behaviors of electricity market players, which makes the simulation process closed-loop. A case study shows the effectiveness of the proposed approach in investigating market players’ behaviors.
Title: Perturbation Analysis Based Simulation Approach for Electricity Market Research and Investigation. Venue: 2022 8th International Conference on Systems and Informatics (ICSAI).
Facial expression emotion recognition has been a popular research topic and plays an important role in supporting natural human-machine conversation. The conventional method for estimating emotion from facial expressions is to learn a CNN-based image classification model from scratch. However, learning such a model requires a large number of labeled facial expression images, which remain a limited resource. To solve this problem, we propose a data augmentation method based on StyleGAN2 that generates artificial expression images for seven emotions, which are used as additional training data. We further train an expression emotion recognition model based on a VGG16 network through transfer learning. Our experiment on the CFEE dataset suggests that an emotion recognition accuracy of 75.10% can be obtained through transfer learning alone, and that the accuracy can be further improved to 82.04% with the augmented expression images.
Title: Facial Expression Emotion Recognition Based on Transfer Learning and Generative Model. Authors: Tomoki Kusunose, Xin Kang, Keita Kiuchi, Ryota Nishimura, M. Sasayama, Kazuyuki Matsumoto. Pub Date: 2022-12-10 DOI: 10.1109/ICSAI57119.2022.10005478. Venue: 2022 8th International Conference on Systems and Informatics (ICSAI).
Pub Date: 2022-12-10 DOI: 10.1109/ICSAI57119.2022.10005405
Ruili Xiao, Xiangrong Tong, Yinggang Li
It is essential to predict the level of trust among users before they interact, in order to reduce the risk of interaction. Because trust relationships are sparse, relying only on explicit trust relationships to predict trust among users is ineffective, and the trust path may even be lost. On the other hand, there are implicit trust relationships among users, for example through joint items that several users have all rated. Once these implicit trust relationships are extracted, the number of trusted users expands greatly. To this end, a trust prediction method incorporating rating information is proposed. First, it constructs a heterogeneous information network consisting of social and rating information. Second, during trust prediction, if a user has no trusted users to choose from, a joint item is used as a bridge to find implicitly trusted users among those who have rated the same item. Finally, the Dueling DQN algorithm is used to calculate the strength of each trust path, and the predicted trust value is derived by combining multiple trust paths with an aggregation function. Experimental results on two datasets indicate that the presented approach outperforms most existing trust prediction methods.
Title: A Trust Prediction Method Based on Heterogeneous Information Networks. Venue: 2022 8th International Conference on Systems and Informatics (ICSAI).
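Two pieces of the trust prediction pipeline described above can be sketched in a few lines. The first function shows the standard dueling aggregation that gives Dueling DQN its name, Q(s, a) = V(s) + A(s, a) - mean(A). The second shows one possible aggregation function, a strength-weighted mean over trust paths. The abstract does not specify the aggregation function, so `aggregate_paths` and its input format are hypothetical and stand in only as an example.

```python
def dueling_q(state_value, advantages):
    """Combine a state value V(s) and per-action advantages A(s, a)
    into Q-values: Q(s, a) = V(s) + A(s, a) - mean over actions of A(s, .).
    """
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


def aggregate_paths(paths):
    """Hypothetical aggregation: strength-weighted mean of path trust values.

    `paths` is a list of (strength, trust) pairs, one per trust path.
    Returns 0.0 when no path carries any strength.
    """
    total_strength = sum(strength for strength, _ in paths)
    if total_strength == 0:
        return 0.0
    return sum(strength * trust for strength, trust in paths) / total_strength


q_values = dueling_q(1.0, [0.5, -0.5])
predicted_trust = aggregate_paths([(1.0, 0.8), (3.0, 0.4)])
```

Subtracting the mean advantage keeps the value and advantage streams identifiable, which is the usual motivation for the dueling form.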
Chicken counting is an essential task in large-scale farming management. Due to dense distribution, uneven illumination, and partial occlusion, accurate chicken counting remains challenging. In this paper, an automated chicken counting algorithm based on the You Only Look Once (YOLO) v5x model is implemented. The intersection over union (IoU) threshold is set by analyzing the width and height of the ground truth (GT) boxes of the training images. Three task-oriented data augmentations, namely Mosaic, horizontal flipping combined with lightness changes, and test time augmentation (TTA), are applied to diversify the data. To validate the proposed method, extensive experiments are conducted on a well-annotated dataset collected from a real farm, with 1,100 images and 170,906 chickens in total. Our implementation achieves an average accuracy of 95.87% and an inference speed of 23 ms per image, even when chickens are partially occluded under extremely uneven illumination.
Title: Automated Chicken Counting Using YOLO-v5x Algorithm. Authors: Xiangyuan Zhu, Chuhui Wu, Yefeng Yang, Yuelin Yao, Yanshan Wu. Pub Date: 2022-12-10 DOI: 10.1109/ICSAI57119.2022.10005522. Venue: 2022 8th International Conference on Systems and Informatics (ICSAI).
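To make the role of the IoU threshold in the chicken counting abstract concrete, here is a minimal sketch of IoU between two axis-aligned boxes and of counting objects after greedy non-maximum suppression. It is not the authors' implementation: the box format `(x1, y1, x2, y2)`, the function names, and the example threshold of 0.5 are assumptions, and the paper's procedure for choosing the threshold from GT box sizes is not reproduced here.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_after_nms(detections, iou_threshold=0.5):
    """Count detections kept by greedy non-maximum suppression.

    `detections` is a list of (score, box) pairs. Boxes are visited from
    highest to lowest score; a box is kept only if its IoU with every
    already-kept box is below the threshold.
    """
    kept = []
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        if all(iou(box, other) < iou_threshold for other in kept):
            kept.append(box)
    return len(kept)


detections = [
    (0.9, (0, 0, 10, 10)),
    (0.8, (1, 1, 11, 11)),    # heavy overlap with the first box
    (0.7, (20, 20, 30, 30)),  # a separate object
]
count = count_after_nms(detections, iou_threshold=0.5)
```

In dense scenes the threshold matters in both directions: a low value merges neighbouring animals into one detection, while a high value lets duplicate boxes through.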
Pub Date: 2022-12-10 DOI: 10.1109/ICSAI57119.2022.10005515
Yanan Li, Yunyan Wang, Yanfang Wang
According to data from the China Statistical Yearbook, China’s population has been declining in recent years. To better grasp the trend of population development, this paper comprehensively considers the factors affecting the size of China’s population, uses Lars and Glmnet to screen variables based on the Lasso model, and, by comparing the results and consulting the relevant literature, adopts the main influencing factors screened by the Lars Lasso model. The paper then introduces a multivariate fractional-order grey model to predict China’s population: the forecast errors under different orders on the 2005-2017 data are used to determine the differential order, and the 2018-2020 data are used for model verification to improve model accuracy. The model is then used to predict the population over the next ten years. The results indicate that by 2030 the total population of China will fall to 1,348.3740 million, which leaves a gap relative to the figure expected under the national population planning policy. To reach the size expected by the national population planning policy, future population development should focus on effectively raising the fertility rate and improving birth policy, so as to increase China’s population.
Title: Population Prediction in China Based on Lasso-FGM Model. Venue: 2022 8th International Conference on Systems and Informatics (ICSAI).
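The "fractional order" in a fractional grey model usually refers to the order r of the accumulated generating operation applied to the raw series before fitting. The sketch below implements the commonly used r-order accumulation, in which the k-th accumulated value is a weighted sum of the raw values up to k with generalized binomial weights. Whether the paper uses exactly this operator is an assumption, and the function name and sample series are illustrative only, not data from the paper.

```python
def fractional_accumulate(series, r):
    """r-order accumulated generating operation used in fractional grey models.

    x_r[k] = sum over i <= k of c[k - i] * x[i], with weights
    c[0] = 1 and c[j] = c[j - 1] * (r + j - 1) / j.
    For r = 1 this is the ordinary cumulative sum of GM(1,1);
    for r = 0 it returns the series unchanged.
    """
    n = len(series)
    coeffs = [1.0]
    for j in range(1, n):
        coeffs.append(coeffs[-1] * (r + j - 1) / j)
    return [
        sum(coeffs[k - i] * series[i] for i in range(k + 1))
        for k in range(n)
    ]


raw = [1.0, 2.0, 3.0]                          # illustrative values only
first_order = fractional_accumulate(raw, 1)    # ordinary cumulative sum
half_order = fractional_accumulate(raw, 0.5)   # weaker smoothing than r = 1
```

Choosing r by comparing forecast errors on a fitting window, as the abstract describes for 2005-2017, amounts to calling this operator for several candidate orders and keeping the one with the smallest error.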
Pub Date: 2022-12-10 DOI: 10.1109/ICSAI57119.2022.10005485
Yiming Wei, Xiaobo Lu
Tiny bolts are widely used on high-speed trains and play an important role in fixing train components. However, because of the complex running environment of trains, bolts go missing from time to time, which may cause traffic accidents, resulting in property damage and, in serious cases, endangering the lives of the occupants. It is therefore essential to detect missing bolts on high-speed trains. The bolts discussed in this paper are generally located on the underside of high-speed trains, and their small size makes detection more difficult. In this paper, we first expand the dataset, then add an attention module and a Transformer to YOLOv5, replace the FPN of YOLOv5 with BiFPN so that features from different layers are fused with different weights, and crop the high-resolution original image during training and testing, finally mapping the results back to the original image. Our method achieves 95.3% AP, effectively improving detection accuracy.
Title: Missing Small Bolt Detection on High-speed Train Using Improved Yolov5. Venue: 2022 8th International Conference on Systems and Informatics (ICSAI).
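The crop-and-return step in the bolt detection abstract can be sketched as follows: the high-resolution image is cut into overlapping tiles, the detector runs on each tile, and each detected box is shifted by its tile's origin to land back in original-image coordinates. The tile size, overlap, and function names below are assumptions for illustration and are not taken from the paper.

```python
def tile_origins(width, height, tile, overlap):
    """Top-left corners of overlapping square tiles that cover the image."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be non-negative and smaller than tile")
    step = tile - overlap

    def starts(size):
        # A single tile suffices when the image is no larger than the tile.
        if size <= tile:
            return [0]
        positions = list(range(0, size - tile, step))
        positions.append(size - tile)  # last tile is flush with the border
        return positions

    return [(x, y) for y in starts(height) for x in starts(width)]


def to_image_coords(box, origin):
    """Shift a tile-local box (x1, y1, x2, y2) back to full-image coordinates."""
    x1, y1, x2, y2 = box
    ox, oy = origin
    return (x1 + ox, y1 + oy, x2 + ox, y2 + oy)


origins = tile_origins(width=1000, height=400, tile=400, overlap=100)
restored = to_image_coords((10, 20, 30, 40), origins[1])
```

The overlap is there so that a bolt lying on a tile border appears whole in at least one tile; duplicate detections in the overlap region would then need to be merged, for example with non-maximum suppression.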
Pub Date: 2022-12-10 DOI: 10.1109/ICSAI57119.2022.10005394
Arata Ochi, Xin Kang
Speech emotion recognition (SER) classifies speech into emotion categories such as “happy”, “sad”, and “angry”. SER has attracted increasing attention in recent years as a challenging pattern recognition task, but its performance is limited by the amount of training data. In this paper, we propose a parallel network consisting of a CNN and a Transformer that receives two types of inputs. The Convolutional Neural Network (CNN) recognizes emotions from the speech data using a mel-spectrogram feature. The Transformer applies Multi-Attention to Mel-Frequency Cepstral Coefficients (MFCCs) to extract emotional semantic information from the sequence. Experiments are carried out on the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) dataset. The results demonstrate the effectiveness of the proposed method and show a significant improvement over previous results, with less data and less training time and without data augmentation.
Title: Learning a Parallel Network for Emotion Recognition Based on Small Training Data. Venue: 2022 8th International Conference on Systems and Informatics (ICSAI).