Pub Date : 2023-03-01 DOI: 10.1109/ICNLP58431.2023.00069
Shuanggen Liu, Hui Xu, Rui Zang
With the continuous development of wearable devices and the Internet of Things, the Internet of Medical Things (IoMT), built on wireless body area networks, is booming. An IoMT system can continuously and remotely monitor a patient's medical parameters, which opens a new direction for future medical diagnosis and can markedly improve the efficiency of medical services. However, the sensitive medical data collected during communication is vulnerable to various attacks when transmitted over wireless channels, which can easily lead to the leakage of patients' private data and other security problems. In addition, owing to the limited computing resources of wearable devices, designing a secure authentication protocol remains a challenge that requires concerted effort. This paper proposes an improved lightweight anonymous authentication protocol that guarantees the required security properties in an IoMT system, as verified by the formal and informal security proofs presented in the paper. Overall performance is also improved, in terms of both security and communication cost.
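The elliptic-curve machinery behind such lightweight protocols can be illustrated with a toy Diffie-Hellman key agreement. This is a sketch only: the tiny curve y² = x³ + 2x + 2 over F₁₇ and the secrets are deliberately small illustrative values, not the paper's protocol or production parameters (real IoMT deployments use 256-bit standardized curves).

```python
import hashlib

P = 17        # tiny prime field, demonstration only
A = 2         # curve y^2 = x^3 + 2x + 2 over F_17; generator (5, 1) has order 19

def inv(x):
    return pow(x, P - 2, P)          # modular inverse via Fermat's little theorem

def add(p, q):
    if p is None: return q
    if q is None: return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2 and (y1 + y2) % P == 0:
        return None                   # point at infinity
    if p == q:
        lam = (3 * x1 * x1 + A) * inv(2 * y1) % P
    else:
        lam = (y2 - y1) * inv(x2 - x1) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)

def mul(k, pt):                       # double-and-add scalar multiplication
    acc = None
    while k:
        if k & 1:
            acc = add(acc, pt)
        pt = add(pt, pt)
        k >>= 1
    return acc

G = (5, 1)
a_priv, b_priv = 3, 7                 # hypothetical device/server secrets
A_pub, B_pub = mul(a_priv, G), mul(b_priv, G)
S1, S2 = mul(a_priv, B_pub), mul(b_priv, A_pub)   # both sides derive a*b*G
session_key = hashlib.sha256(str(S1).encode()).hexdigest()
```

Because scalar multiplication commutes, both parties compute the same point and can hash it into a shared session key without ever transmitting their secrets.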
{"title":"An Improved Anonymous Authentication Scheme for Internet of Medical Things Based on Elliptic Curve Cryptography","authors":"Shuanggen Liu, Hui Xu, Rui Zang","doi":"10.1109/ICNLP58431.2023.00069","DOIUrl":"https://doi.org/10.1109/ICNLP58431.2023.00069","url":null,"abstract":"With the continuous development of wearable devices and the Internet of Things, Internet of Medical Things (IoMT) consisting of wireless body area networks is booming. The IoMT system can play the role of continuous and remote monitoring of various medical parameters of patients, which provides a new direction for future medical diagnosis and can well increase the efficiency of medical services. However, the medical sensitive data collected during the communication process is vulnerable to various attacks when transmitted through wireless channels, which can easily induce to the leakage of patients’ privacy data and other security problems. In addition, owing to the limited computing resources of wearable devices, designing a secure authentication protocol is still a challenge that requires concerted efforts. In this paper, an improved lightweight anonymous authentication protocol is proposed, and the proposed scheme can guarantee the required security in IOMT system. It can be verified by the formal and informal security proofs presented in the paper. And the overall performance has been improved, including the improvement of security performance and the reduction of communication cost.","PeriodicalId":53637,"journal":{"name":"Icon","volume":"3 1","pages":"345-349"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81335003","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-03-01 DOI: 10.1109/ICNLP58431.2023.00070
Chenglin Xu, Guohui Zhu, Qianwen Yang
In the complex environment of network slicing in the 5G core network, applications such as virtual reality, autonomous driving, and remote control place severe demands on latency (ultra-Reliable and Low-Latency Communication, URLLC). For such applications, a soft-migration actor-critic cache allocation algorithm for 5G core network slicing is proposed. With the goal of reducing request latency and the cost of deploying network slices, a cache deployment model that minimizes latency and cost is established. The soft-migration actor-critic (STAC) cache allocation algorithm continuously interacts with the environment to adjust the deployment, reducing the response latency of requested content, improving the user's web service experience, and responding to user requests faster, while also reducing the cost of deploying network slices. The simulation results show that the proposed cache allocation algorithm effectively reduces both cost and user request latency.
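The "soft" element of such actor-critic methods usually refers to slowly blending online network weights into a target network rather than copying them abruptly. A minimal sketch of that Polyak update, with made-up weights and an assumed tau, illustrating only the update rule and not the STAC algorithm itself:

```python
def soft_update(target, online, tau=0.05):
    """Polyak averaging: target <- tau * online + (1 - tau) * target."""
    return [tau * w + (1 - tau) * t for w, t in zip(online, target)]

# toy "networks" as flat weight lists; tau = 0.05 is an assumed value
target_w = [0.0, 0.0]
online_w = [1.0, 2.0]
for _ in range(100):
    target_w = soft_update(target_w, online_w)
# target_w drifts smoothly toward online_w instead of jumping to it
```

The gradual drift stabilizes the learning targets, which is why soft updates are preferred over hard copies in continuously interacting agents.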
{"title":"Cache allocation Algorithm of 5G Core Network Slicing Based on Soft Migration Actor Critic*","authors":"Chenglin Xu, Guohui Zhu, Qianwen Yang","doi":"10.1109/ICNLP58431.2023.00070","DOIUrl":"https://doi.org/10.1109/ICNLP58431.2023.00070","url":null,"abstract":"In the complex environment of network slicing in 5G core network, such applications with severe demand on latency like ultra Reliable and Low Latency Communication for virtual reality, autonomous driving and remote control. A soft migration actor-critic based network slicing cache allocation algorithm under 5G core network is proposed for such applications. With the goal of reducing request latency and the cost of deploying network slices, a cache deployment model that minimizes latency and cost is established. The soft migration actor-critic (STAC) cache allocation algorithm is used to adjust the deployment by interacting with the environment non-stop to reduce the response latency of user requested contents more effectively, improve the user’s web service experience, and respond to user requests faster, while reducing the cost of deploying network slices to achieve the solution of the model. The simulation results show that the proposed cache allocation algorithm can effectively reduce the cost and user request latency.","PeriodicalId":53637,"journal":{"name":"Icon","volume":"74 1","pages":"350-356"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88355681","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-03-01 DOI: 10.1109/ICNLP58431.2023.00090
Shaojie Hou, Wujun Yang, Yuanzheng Cheng, Liyuan Feng
With the continuous evolution of industrial network technology, industrial production places ever higher requirements on reliability and real-time performance. Time-Sensitive Networking (TSN) has become one of the evolution directions of next-generation network bearer technology. In TSN, redundancy is usually adopted to meet the reliability requirements of Time-Triggered (TT) flows. However, the large bandwidth consumption caused by redundancy is unavoidable and directly affects the schedulability of TT flows. This paper therefore proposes a Genetic-based Reliable Routing algorithm (GRR) built on the Frame Replication and Elimination for Reliability (FRER) standard, which provides reliable, non-overlapping routes for the copies of each TT flow. It guarantees the reliability of TT flows while also increasing their schedulability, meeting the needs of industrial applications. Simulation experiments show that, compared with existing methods, the proposed algorithm improves reliability by 2.08% and schedulability by 6.3% on average.
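The reliability benefit of FRER-style replication over disjoint routes follows from the fact that both copies must be lost for the flow to fail. A small sketch with assumed per-link success probabilities (illustrative numbers only, not the paper's experimental setup):

```python
from functools import reduce

def path_reliability(link_reliabilities):
    # a copy arrives only if every link on its path succeeds
    return reduce(lambda a, b: a * b, link_reliabilities, 1.0)

def frer_reliability(path_a, path_b):
    # with frame replication over disjoint paths, the flow is lost
    # only if BOTH copies are lost
    ra, rb = path_reliability(path_a), path_reliability(path_b)
    return 1.0 - (1.0 - ra) * (1.0 - rb)

# assumed 99%-reliable links: a 3-hop single path vs. disjoint 3-hop + 4-hop pair
single = path_reliability([0.99] * 3)
redundant = frer_reliability([0.99] * 3, [0.99] * 4)
```

The multiplication of failure probabilities is exactly why the routes must be non-overlapping: any shared link reintroduces a single point of failure.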
{"title":"Reliable Routing of Time-Triggered Traffic in Time Sensitive Networks","authors":"Shaojie Hou, Wujun Yang, Yuanzheng Cheng, Liyuan Feng","doi":"10.1109/ICNLP58431.2023.00090","DOIUrl":"https://doi.org/10.1109/ICNLP58431.2023.00090","url":null,"abstract":"With the continuous evolution of industrial network technology, industrial production has higher and higher requirements for reliability and real-time performance. At present Time Sensitive Networks (TSN) has become one of the evolution directions of the next generation network bearing technology. In TSN, redundancy is usually adopted to meet the reliability requirements of Time-Triggered (TT) flow. However, a large amount of bandwidth consumption caused by redundancy is also inevitable, which will directly affect the schedulability of TT flows. Therefore, this paper proposes a Genetically based Reliable Routing algorithm (GRR) based on Frame Replication and Elimination for Reliability (FRER) standard, which provides reliable and non-overlapping routes for different copies of each TT flow. It not only guarantees the reliability of the TT flow, but also increases its schedulability, so as to meet the needs of industrial applications. Through simulation experiments, compared with the existing methods, the reliability of the proposed reliable routing algorithm is improved by 2.08% on average and the schedulability is improved by 6.3% on average.","PeriodicalId":53637,"journal":{"name":"Icon","volume":"2 1","pages":"469-474"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85702429","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-03-01 DOI: 10.1109/icnlp58431.2023.00072
Shi Yali, Yang Zhi, Chunyan Xiao
Aiming at the scenario in which vehicles leave the communication range of the edge server, a distributed computation offloading scheme is proposed. The scheme divides a vehicle's compute-intensive tasks into multiple subtasks, makes full use of the computing resources of surrounding vehicles, and accounts for the allocation of communication resources. The problem is modeled as minimizing the maximum processing delay of all subtasks, and a DQN-based resource allocation scheme (RADQN) is proposed. The simulation results show that the proposed algorithm outperforms schemes that ignore communication resource allocation, and it remains superior to the other schemes even when the service vehicles move at high speed.
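The value update at the heart of DQN can be shown in its tabular ancestor form. This sketch replaces the neural network with a Q-table and uses invented delay costs for two actions, so it illustrates only the update rule, not the RADQN scheme:

```python
def q_update(Q, s, a, r, s_next, lr=0.1, gamma=0.9):
    """Tabular form of the DQN target: r + gamma * max_a' Q(s', a')."""
    best_next = max(Q[s_next])
    Q[s][a] += lr * (r + gamma * best_next - Q[s][a])

# state 0: a task is pending; action 0 = offload to a neighbour vehicle,
# action 1 = compute locally (delay costs are purely illustrative)
Q = [[0.0, 0.0], [0.0, 0.0]]
for _ in range(500):
    q_update(Q, 0, 0, -1.0, 1)   # offloading costs 1 delay unit
    q_update(Q, 0, 1, -3.0, 1)   # local computation costs 3 delay units
    q_update(Q, 1, 0, 0.0, 0)
```

After repeated updates the lower-delay action dominates, which is the behaviour a delay-minimizing offloading policy relies on.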
{"title":"Distributed Resource Allocation and Offloading Strategy Based on Deep Reinforcement Learning in V2V","authors":"Shi Yali, Yang Zhi, Chunyan Xiao","doi":"10.1109/icnlp58431.2023.00072","DOIUrl":"https://doi.org/10.1109/icnlp58431.2023.00072","url":null,"abstract":"Aiming at the communication range of vehicles leaving the edge server, a distributed computing offload scheme is proposed. This scheme divides the vehicle computing intensive tasks into multiple subtasks, makes full use of the computing resources of surrounding vehicles and considers the allocation of communication resources. The problem is modeled as minimizing the maximum processing delay of all subtasks, a resource allocation scheme based on DQN (RADQN) is proposed. The simulation results show that the proposed algorithm has certain advantages compared with the scheme without considering communication resource allocation, and it is still superior to other schemes when the service vehicle speed is fast.","PeriodicalId":53637,"journal":{"name":"Icon","volume":"16 1","pages":"362-366"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82650582","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-03-01 DOI: 10.1109/icnlp58431.2023.00029
Jin Wu, Gege Chong, Wenting Pang, Lei Wang
Aiming at the low accuracy of speech endpoint detection in environments with a low signal-to-noise ratio, a speech endpoint detection algorithm based on Empirical Mode Decomposition (EMD) and improved spectral subtraction is proposed, which applies noise reduction before endpoint detection. After EMD decomposition and reconstruction, the algorithm applies an improved spectral subtraction based on multi-window spectral estimation to reduce noise and raise the signal-to-noise ratio of the speech signal, and then detects endpoints using the Teager energy and the Zero-Crossing Rate (ZCR). The effectiveness and feasibility of the method are verified by simulation; the speech signals used in the experiment were recorded in a quiet environment. Compared with a speech endpoint detection algorithm based on empirical mode decomposition and an improved dual-threshold method, the proposed algorithm significantly improves the accuracy of endpoint detection.
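The two detection features are simple to state directly. A sketch of the Teager energy operator and the zero-crossing rate on a synthetic tone (the EMD and spectral-subtraction stages that precede them in the paper are omitted here):

```python
import math

def teager_energy(x):
    """Teager energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

def zero_crossing_rate(x):
    crossings = sum(1 for a, b in zip(x, x[1:]) if a * b < 0)
    return crossings / (len(x) - 1)

# 200 Hz tone sampled at 8 kHz; for a pure tone A*sin(w*n) the TEO is the
# constant A^2 * sin^2(w), so voiced frames stand out from low-energy noise
fs, f = 8000, 200
x = [math.sin(2 * math.pi * f * n / fs) for n in range(400)]
psi = teager_energy(x)
zcr = zero_crossing_rate(x)
```

Voiced speech tends to have high Teager energy and low ZCR, while unvoiced sounds and noise show the opposite pattern, which is why the two features are combined for endpoint decisions.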
{"title":"Speech Endpoint Detection Based on EMD and Improved Spectral Subtraction","authors":"Jin Wu, Gege Chong, Wenting Pang, Lei Wang","doi":"10.1109/icnlp58431.2023.00029","DOIUrl":"https://doi.org/10.1109/icnlp58431.2023.00029","url":null,"abstract":"Aiming at the problem that the correct rate of speech endpoint detection is low in the environment with low signal-to-noise ratio, a speech endpoint detection algorithm based on Empirical Mode Decomposition (EMD) and improved spectral subtraction is proposed, considering some noise reduction before endpoint detection. After EMD decomposition and reconstruction, the algorithm uses the improved spectral subtraction of multi-window spectral estimation to reduce noise, which improves the signal-to-noise ratio of speech signal, and then detects the endpoint by using the Teager energy and Zero-Crossing Rate(ZCR). The effectiveness and feasibility of the method presented in this paper are verified by the simulation experiment. The speech signals selected in the experiment were recorded in a quiet environment. Compared with the speech endpoint detection algorithm based on empirical modal decomposition and improved two-threshold method, the proposed algorithm has significantly improved the accuracy and accuracy of endpoint detection.","PeriodicalId":53637,"journal":{"name":"Icon","volume":"5 1","pages":"126-130"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75914675","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-03-01 DOI: 10.1109/ICNLP58431.2023.00036
Yewei Song, Saad Ezzini, Jacques Klein, Tegawendé F. Bissyandé, C. Lefebvre, A. Goujon
Natural language processing of Low-Resource Languages (LRLs) is often challenged by the lack of data. Therefore, achieving accurate machine translation (MT) in a low-resource environment is a real problem that requires practical solutions. Research on multilingual models has shown that some LRLs can be handled with such models. However, their large size and computational needs make their use in constrained environments (e.g., mobile/IoT devices or limited/old servers) impractical. In this paper, we address this problem by leveraging the power of large multilingual MT models using knowledge distillation. Knowledge distillation can transfer knowledge from a large, complex teacher model to a simpler, smaller student model without losing much performance. We also make use of high-resource languages that are related to, or share a linguistic root with, the target LRL. For our evaluation, we consider Luxembourgish as the LRL, which shares some roots and properties with German. We build multiple resource-efficient models based on German, knowledge distillation from the multilingual No Language Left Behind (NLLB) model, and pseudo-translation. We find that our efficient models are more than 30% faster and perform only 4% lower than the large state-of-the-art NLLB model.
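Knowledge distillation of the kind described typically minimizes the KL divergence between temperature-softened teacher and student distributions. A minimal sketch with toy logits, showing the classic formulation rather than the actual NLLB-based training:

```python
import math

def softmax(logits, T=1.0):
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    scaled by T^2 as in the classic distillation formulation."""
    p = softmax(teacher_logits, T)   # soft targets from the large teacher
    q = softmax(student_logits, T)   # student predictions
    return T * T * sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
```

The temperature spreads probability mass over near-miss tokens, so the student learns the teacher's similarity structure instead of only its hard predictions.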
{"title":"Letz Translate: Low-Resource Machine Translation for Luxembourgish","authors":"Yewei Song, Saad Ezzini, Jacques Klein, Tegawendé F. Bissyandé, C. Lefebvre, A. Goujon","doi":"10.1109/ICNLP58431.2023.00036","DOIUrl":"https://doi.org/10.1109/ICNLP58431.2023.00036","url":null,"abstract":"Natural language processing of Low-Resource Languages (LRL) is often challenged by the lack of data. Therefore, achieving accurate machine translation (MT) in a low-resource environment is a real problem that requires practical solutions. Research in multilingual models have shown that some LRLs can be handled with such models. However, their large size and computational needs make their use in constrained environments (e.g., mobile/IoT devices or limited/old servers) impractical. In this paper, we address this problem by leveraging the power of large multilingual MT models using knowledge distillation. Knowledge distillation can transfer knowledge from a large and complex teacher model to a simpler and smaller student model without losing much in performance. We also make use of high-resource languages that are related or share the same linguistic root as the target LRL. For our evaluation, we consider Luxembourgish as the LRL that shares some roots and properties with German. We build multiple resource-efficient models based on German, knowledge distillation from the multilingual No Language Left Behind (NLLB) model, and pseudo-translation. We find that our efficient models are more than 30% faster and perform only 4% lower compared to the large state-of-the-art NLLB model.","PeriodicalId":53637,"journal":{"name":"Icon","volume":"55 1","pages":"165-170"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79041683","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-03-01 DOI: 10.1109/ICNLP58431.2023.00037
Shiren Ye, Ding Li, Ali Md Rinku
The unbalanced distribution of category labels and the correlation between labels tend to cause over-learning in deep learning models. In fine-grained sentiment analysis datasets, both the correlation between category labels and the heterogeneity of the label distribution are prominent. We use an adjusted circle loss that introduces a margin and gradient attenuation into the loss function to handle the challenges posed by the unbalanced label distribution and the non-independence between labels. The method combines well with pre-trained models and adapts to various learning models and algorithms. Compared with state-of-the-art models, our loss function mechanism achieves significant improvement on SemEval18 and GoEmotions as measured by the Jaccard coefficient, micro-F1, and macro-F1, implying that our solution works efficiently for sentiment and emotion analysis tasks.
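One common circle-loss-style objective for multi-label problems anchors positive-class scores above zero and negative-class scores below it with a log-sum-exp, which copes with label imbalance without per-class weighting. This sketch is a generic variant with assumed margin and scale parameters, not necessarily the paper's exact adjusted loss:

```python
import math

def multilabel_circle_loss(scores, labels, margin=0.0, scale=1.0):
    """Zero-anchored log-sum-exp multi-label loss: push positive-class
    scores above 0 and negative-class scores below 0 (margin and scale
    are illustrative knobs for the adjustment)."""
    pos = [-scale * (s - margin) for s, y in zip(scores, labels) if y == 1]
    neg = [scale * (s + margin) for s, y in zip(scores, labels) if y == 0]
    return (math.log(1.0 + sum(math.exp(v) for v in pos))
            + math.log(1.0 + sum(math.exp(v) for v in neg)))

well_separated = multilabel_circle_loss([5.0, -5.0, 4.0], [1, 0, 1])
confused = multilabel_circle_loss([-1.0, 1.0, 0.0], [1, 0, 1])
```

Because the log-sum-exp behaves like a soft maximum, the gradient concentrates on the hardest positive and negative scores, which is the attenuation effect the abstract refers to.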
{"title":"Design and Optimization of Loss Functions in Fine-grained Sentiment and Emotion Analysis","authors":"Shiren Ye, Ding Li, Ali Md Rinku","doi":"10.1109/ICNLP58431.2023.00037","DOIUrl":"https://doi.org/10.1109/ICNLP58431.2023.00037","url":null,"abstract":"The unbalanced distribution of category labels and the correlation between these labels tend to cause over-learning issues in deep learning models. In fine-grained sentiment analysis datasets, the correlation between category labels and the heterogeneity of tag distribution are prominent. In the deep learning model, we use the adjusted circle-loss to introduce margin and gradient attenuation in the loss function to handle the challenges caused by unbalanced label distribution and non-independence between labels. This method can be well combined with pre-trained models and adapt to various learning models and algorithms. Compared with the state-of-the-art typical models, our loss function mechanism achieves significant improvement using SemEval18 and GoeEmotions by measure of Jaccard coefficient, micro-F1, and macro-F1. It implies that our solution could work efficiently for sentiment analysis and sentiment analysis tasks.","PeriodicalId":53637,"journal":{"name":"Icon","volume":"19 1","pages":"171-176"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77707727","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-03-01 DOI: 10.1109/ICNLP58431.2023.00078
Juan Guo, Huixiao Wang, Zexin Wang, Zhixian Chang
To analyze the upper bound of end-to-end delay in time-sensitive networks, an improved calculation model of end-to-end transmission delay is derived from deterministic network calculus. The effects of GCL gate-opening time, scheduling period, overlap of transmission windows, and traffic priority on the delay are analyzed experimentally, and the worst-case end-to-end delay of Time-Triggered (TT) traffic over the whole transmission process is calculated. Finally, through comprehensive analysis of the GCL and reasonable configuration of parameters, good transmission capacity of the communication network is ensured, providing a theoretical basis for efficient deployment of the communication network and optimal allocation of network resources.
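The basic network-calculus bound that such models refine: a token-bucket flow with arrival curve alpha(t) = b + r*t served by a rate-latency node beta(t) = R * max(t - T, 0) experiences delay at most T + b/R when r <= R. A sketch with assumed link parameters; simply summing per-hop bounds is safe but looser than a pay-bursts-only-once analysis:

```python
def delay_bound(b, r, R, T):
    """Worst-case delay of a token-bucket flow alpha(t) = b + r*t at a
    rate-latency server beta(t) = R * max(t - T, 0); valid when r <= R.
    The bound is the maximal horizontal deviation: T + b / R."""
    if r > R:
        raise ValueError("arrival rate exceeds service rate: unbounded backlog")
    return T + b / R

# assumed per-hop parameters: 1500-byte burst, 1 Mbit/s flow, 100 Mbit/s
# service rate, 20 us node latency; bounds summed over a 3-hop path
hops = [(1500 * 8, 1e6, 1e8, 20e-6)] * 3
total = sum(delay_bound(*h) for h in hops)   # 3 * (20 us + 120 us) = 420 us
```

GCL gate-opening time and window overlap enter such a model through the latency term T of each node's service curve, which is why configuring the GCL directly shapes the end-to-end bound.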
{"title":"Upper Bound Analysis of TSN End-to-End Delay Based on Network Calculus","authors":"Juan Guo, Huixiao Wang, Zexin Wang, Zhixian Chang","doi":"10.1109/ICNLP58431.2023.00078","DOIUrl":"https://doi.org/10.1109/ICNLP58431.2023.00078","url":null,"abstract":"To analyze the upper bound of end-to-end delay in time-sensitive networks, an improved calculation model of end-to-end transmission delay based on network calculus is derived according to the relevant knowledge of deterministic network calculus. The effects of GCL gate opening time, scheduling period, the overlap of transmission Windows, and different priorities of traffic on the delay index were analyzed experimentally, and the end-to-end worst-case transmission delay of time triggered (TT) traffic in the whole process of traffic transmission was calculated. Finally, through comprehensive analysis of GCL and reasonable configuration of parameters, good transmission capacity of the communication network is ensured, which provides a theoretical basis for efficient deployment of the communication network and optimal allocation of network resources.","PeriodicalId":53637,"journal":{"name":"Icon","volume":"6 1","pages":"394-399"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78688639","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-03-01 DOI: 10.1109/ICNLP58431.2023.00055
Wenkai Zhang
Text matching is a fundamental task in natural language processing. To address the short, ambiguous queries in the e-commerce domain, the complexity of product titles, and the expense of manually annotated samples, this paper proposes a two-stage "vectorized retrieval + refined ranking" text matching model combining contrastive learning with course-based hard negative sampling. Using supervised data augmentation, domain pre-training, contrastive learning, and hard-example sampling to assist ranking, this work achieves an MRR@10 of 0.3890 on the test set of the 2022 "Ali Lingjie" E-Commerce Search Algorithm Competition, ranking second and demonstrating the effectiveness of the model.
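The two-stage "vectorized retrieval + refined ranking" pattern can be sketched generically: cheap cosine-similarity retrieval narrows the candidate set, then a more expensive scorer reorders the survivors. The vectors and the reranking function here are hypothetical stand-ins, not the competition system:

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def retrieve_then_rerank(query_vec, doc_vecs, rerank_fn, k=2):
    # stage 1: cheap vector retrieval keeps only the top-k candidates
    candidates = sorted(range(len(doc_vecs)),
                        key=lambda i: cosine(query_vec, doc_vecs[i]),
                        reverse=True)[:k]
    # stage 2: an expensive scorer (e.g. a cross-encoder) reorders them
    return sorted(candidates, key=rerank_fn, reverse=True)

docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
ranked = retrieve_then_rerank([1.0, 0.0], docs, rerank_fn=lambda i: i, k=2)
```

The split lets the costly model touch only k candidates instead of the whole corpus, which is what makes refined ranking affordable at search time.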
{"title":"A two-stage e-commerce search matching model incorporating contrastive learning and course-based hard negative example sampling","authors":"Wenkai Zhang","doi":"10.1109/ICNLP58431.2023.00055","DOIUrl":"https://doi.org/10.1109/ICNLP58431.2023.00055","url":null,"abstract":"Text matching is a fundamental task in natural language processing. To address the short and ambiguous search statements in e-commerce domain, the complexity of headlines and the expensive manual annotation samples, this paper proposes a two-stage \"vectorized retrieval + refined ranking\" text matching model with a mixture of contrastive learning and course-based hard negative example sampling. By using supervised learning data augmentation, domain pre-training, comparative learning and hard case sampling to assist in ranking, this work achieves an MRR@10 value of 0.3890 in the test set of the 2022 \"Ali Lingjie\" E-Commerce Search Algorithm Competition, ranking second, demonstrating the effectiveness of the model.","PeriodicalId":53637,"journal":{"name":"Icon","volume":"211 1 1","pages":"263-267"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85636946","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-03-01 DOI: 10.1109/icnlp58431.2023.00040
GuoHua Zhu, Jian Wang
In Chinese named entity recognition, traditional models suffer from word-segmentation ambiguity and single-sense word vectors, and their training results are poor. To solve this problem, a BERT-BiLSTM multi-head-attention model (PMA-CNER) is proposed to improve the accuracy of Chinese named entity recognition (CNER). The model feeds BERT word embeddings into a BiLSTM, which extracts global contextual semantic features more effectively. A multi-head attention layer is then added after the BiLSTM layer to extract multiple semantic features and overcome the BiLSTM's weakness in capturing local features. Experimental results on the CLUENER2020 and Yudu-S4K datasets show that accuracy is significantly improved, reaching 93.94% and 91.83% respectively.
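The multi-head attention layer added after the BiLSTM builds on scaled dot-product attention. A single-head sketch on plain nested lists, with purely illustrative shapes and values (a real model would use tensors and learned projections per head):

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(v - m) for v in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    """Single-head scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(K[0])
    out = []
    for q in Q:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in K]
        w = softmax(scores)
        out.append([sum(wi * v[j] for wi, v in zip(w, V)) for j in range(len(V[0]))])
    return out

# a query aligned with the first key attends almost entirely to its value
out = attention([[10.0, 0.0]], [[10.0, 0.0], [0.0, 10.0]], [[1.0, 0.0], [0.0, 1.0]])
```

Multiple such heads, each with its own projections, let the layer pick up several relations per token, complementing the BiLSTM's sequential features with content-based ones.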
{"title":"Named Entity Recognition Based on Pre-training Model and Multi-head Attention Mechanism","authors":"GuoHua Zhu, Jian Wang","doi":"10.1109/icnlp58431.2023.00040","DOIUrl":"https://doi.org/10.1109/icnlp58431.2023.00040","url":null,"abstract":"When processing Chinese named entity recognition, the traditional algorithm model have been having the ambiguity of word segmentation and the singleness of the word vector, and the training consequence of algorithm models was not well. To solve this problem, a BERT-BiLSTM Multi-Attention (PMA-CNER) model was proposed to improve the accuracy of Chinese named entity recognition (CNER). This model used BERT model to embed words based on BiLSTM model, which can extract global context semantic features more effectively. Next, a layer of Multi-head attention mechanism was added behind the BiLSTM layer, which can effectively extract multiple semantic features and overcome the shortage of BiLSTM in obtaining local features. Finally, the experimental results on the CLUSER2020 dataset and the Yudu-S4K dataset show that the accuracy rate is significantly improved, reaching 93.94% and 91.83% respectively.","PeriodicalId":53637,"journal":{"name":"Icon","volume":"36 1","pages":"187-190"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86318983","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}