Unlike the Emotion Cause Extraction (ECE) task, which relies on pre-annotated emotions in a passage, emotion-cause pair extraction (ECPE) aims to extract potential emotions and their corresponding causes from a document without the need for pre-annotation. Traditional ECPE solutions split the extraction of emotions and causes into two separate steps. However, ignoring the bidirectional dependence between emotion and cause may lose potentially useful information. In this paper, we propose a novel interactive recurrent attention network (IRAN). Our approach focuses on the bidirectional impact between emotions and causes, and extracts emotions and causes simultaneously. The information in the document can be fully exploited through multiple modeling and information extraction. Our emotion-specific transformation and distance fusion correlation adaptively focus on the emotions and the distance, and gracefully incorporate them into a distinguishable neural network attention framework. The experimental results show that our proposed model achieves better performance than other widely used models on the ECPE corpus.
{"title":"A Novel Interactive Recurrent Attention Network for Emotion-Cause Pair Extraction","authors":"Xiangyu Jia, Xinhai Chen, Qian Wan, Jie Liu","doi":"10.1145/3446132.3446195","DOIUrl":"https://doi.org/10.1145/3446132.3446195","url":null,"PeriodicalId":125388,"journal":{"name":"Proceedings of the 2020 3rd International Conference on Algorithms, Computing and Artificial Intelligence","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129705855","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Emotion classification on large amounts of Twitter data is an effective way to analyze users' mood about a given product, news item, topic, and so on. However, it is challenging to extract meaningful features from a burst of raw tweets, as emotions are subjective with limited, fuzzy boundaries. These subjective features can be expressed in different terminologies and perceptions. In this paper, we propose a hybrid approach of LDA and machine learning to predict emotions for large-scale imbalanced tweets. First, the raw tweets are preprocessed using tokenization to capture useful features without noisy information. Second, the importance of local and global features is estimated with the TF-IDF statistical technique. Third, the Latent Dirichlet Allocation (LDA) topic modeling method is used to extract topics from these features. These topics explain the concepts of related tweets, which helps classification. Fourth, the Adaptive Synthetic (ADASYN) class balancing technique is applied to oversample the data and balance each class of topic. Finally, the K-Nearest Neighbor (KNN) machine learning algorithm is applied to predict the emotions in the extracted topics. The class balancing method increases the significance of minority classes and addresses the problem of class imbalance. The proposed approach is evaluated on two different Twitter emotion datasets. The results show that this methodology outperforms popular state-of-the-art methods in terms of precision, recall, F-measure, and classification accuracy.
{"title":"Sentimental Analysis based on hybrid approach of Latent Dirichlet Allocation and Machine Learning for Large-Scale of Imbalanced Twitter Data","authors":"Nasir Jamal, Xianqiao Chen, Junaid Hussain Abro, Doniyor Tukhtakhunov","doi":"10.1145/3446132.3446413","DOIUrl":"https://doi.org/10.1145/3446132.3446413","url":null,"PeriodicalId":125388,"journal":{"name":"Proceedings of the 2020 3rd International Conference on Algorithms, Computing and Artificial Intelligence","volume":"134 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124642614","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
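The TF-IDF weighting and KNN prediction steps named in the Twitter abstract above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the toy tweets, the labels, the whitespace tokenizer, and the cosine-similarity neighbor search are all assumptions, and the LDA and ADASYN stages are deliberately left out.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: simple lowercase whitespace tokenization.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, weight = tf * log(N / df)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_predict(train_vecs, train_labels, query_vec, k=3):
    """Majority vote among the k training vectors most similar to the query."""
    ranked = sorted(
        zip(train_vecs, train_labels),
        key=lambda pair: cosine(pair[0], query_vec),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data, not taken from the paper's datasets.
tweets = [
    "love this phone so happy",
    "happy with the great service",
    "angry about the late delivery",
    "so angry terrible support",
]
labels = ["joy", "joy", "anger", "anger"]
query = "happy happy love"

# Weight the query together with the training tweets so they share one IDF.
vectors = tfidf_vectors(tweets + [query])
predicted = knn_predict(vectors[:-1], labels, vectors[-1], k=3)
print(predicted)
```

In a fuller pipeline, the neighbor search would run on the balanced topic features rather than on raw TF-IDF vectors.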
Electronic medical records (EMRs) contain a large amount of medical diagnosis information. To mine the value of this data, it is necessary to extract the attributes of the electronic medical record. Deep learning methods have been widely used in attribute extraction tasks and have achieved remarkable results on general text datasets. However, in specific medical fields, such as our electronic medical record extraction task, attribute extraction often lacks sufficient high-quality annotated data. In addition, the attributes in the corpus can be divided into two types, discriminative attributes and extractive attributes, and some attributes are strongly correlated. Modeling each attribute independently cannot use this information, which limits what the model can learn. This paper proposes a unified framework for medical record attribute extraction based on ALBERT, which uses a large general corpus as external knowledge for pre-training and fine-tuning, and adopts multi-task learning so that all attributes share the underlying encoding and are trained jointly. Experiments show that this framework greatly improves on the traditional LSTM-CRF model and performs better in practical application scenarios.
{"title":"A unified framework for attribute extraction in electronic medical records","authors":"Ming Du, Wenkun Wang, Sufen Wang, Bo Xu","doi":"10.1145/3446132.3446410","DOIUrl":"https://doi.org/10.1145/3446132.3446410","url":null,"PeriodicalId":125388,"journal":{"name":"Proceedings of the 2020 3rd International Conference on Algorithms, Computing and Artificial Intelligence","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132369081","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In recent years, leveraging the characteristics of users' historical behavior to predict click-through rates (CTRs) has become a key point of interest in studies of recommender systems. Although theoretical and experimental investigations of CTR models have increased substantially, most models focus on linear feature interaction; however, crucial user characteristics in the real world are discovered implicitly by non-linear features. In this paper, we propose a novel model that integrates the advantages of linear and non-linear feature interaction. Our deep factorization machines network with non-linear interaction for recommender systems (DFNR) model identifies non-linear feature interactions through a new non-linear interaction (NL-interaction) layer. We also incorporate a deeper multilayer perceptron (MLP) than other CTR models, which yields more accurate information about higher-order feature interactions. The MLP in the proposed model is unique because we use a residual structure to correct problems caused by a deeper network structure. Findings show that our DFNR model performs better on a CTR prediction task than other models. Results demonstrate the effectiveness of our model based on its non-linear interaction layer and deeper neural network architecture.
{"title":"Deep Factorization Machines network with Non-linear interaction for Recommender System","authors":"Chuchu Yu, Xinmei Yang, Han Jiang","doi":"10.1145/3446132.3446134","DOIUrl":"https://doi.org/10.1145/3446132.3446134","url":null,"PeriodicalId":125388,"journal":{"name":"Proceedings of the 2020 3rd International Conference on Algorithms, Computing and Artificial Intelligence","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132895638","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
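For context on the factorization-machine component named in the DFNR title above, the following sketch computes the standard second-order factorization machine interaction term in two equivalent ways. It illustrates the textbook formulation only. It is not the NL-interaction layer or the residual MLP, which the abstract does not specify, and the feature vector `x` and factor matrix `V` are made-up values.

```python
def fm_pairwise_naive(x, V):
    """Sum over i < j of <V[i], V[j]> * x[i] * x[j], computed pair by pair."""
    n = len(x)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(V[i], V[j]))
            total += dot * x[i] * x[j]
    return total


def fm_pairwise_fast(x, V):
    """Same quantity via 0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i (v_if x_i)^2]."""
    n = len(x)
    k = len(V[0])
    total = 0.0
    for f in range(k):
        linear_sum = sum(V[i][f] * x[i] for i in range(n))
        square_sum = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        total += linear_sum * linear_sum - square_sum
    return 0.5 * total


# Hypothetical inputs: three features, two latent factors per feature.
x = [1.0, 0.0, 2.0]
V = [[1.0, 2.0], [3.0, 4.0], [0.5, -1.0]]

naive = fm_pairwise_naive(x, V)
fast = fm_pairwise_fast(x, V)
print(naive, fast)
```

The second form avoids the explicit double loop over feature pairs, which is why factorization machines scale linearly in the number of features for a fixed number of latent factors.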
The paper describes the development of a corpus of an English variety, i.e. China English, in order to provide a linguistic resource for researchers in the field of China English. The Corpus of China English (CCE) was built with due consideration given to its representativeness and authenticity. It is composed of more than 13,962,102 tokens in 15,333 texts evenly divided between the following four genres: newspapers, magazines, fiction and academic writing. The texts cover a wide range of domains, such as news, finance, politics, environment, society, culture, technology, sports, education, philosophy, literature, etc. It is a helpful resource for research on China English, computational linguistics, natural language processing, corpus linguistics and English language education.
{"title":"The Design and Construction of the Corpus of China English","authors":"L. Xia, Yun Xia","doi":"10.1145/3446132.3446398","DOIUrl":"https://doi.org/10.1145/3446132.3446398","url":null,"PeriodicalId":125388,"journal":{"name":"Proceedings of the 2020 3rd International Conference on Algorithms, Computing and Artificial Intelligence","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124045150","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Underwater optical images are scarce and suffer from varying degrees of blur and color distortion, which makes the detection of underwater objects very challenging. To address the shortcomings of the original Single Shot MultiBox Detector (SSD), this paper adds a shallow object detection layer to the original SSD model to improve the network's ability to detect small objects. At the same time, the confidence loss is improved to narrow the gap in SSD's ability to detect different types of objects. The Multi-Scale Retinex with Color Restoration (MSRCR) algorithm is used to process the original images and enhance the feature information of the objects in the underwater images. The improved SSD network is trained through transfer learning to overcome the limited number of underwater images. Experimental results show that the proposed algorithm has better detection performance than the original SSD, YOLO v3 and other algorithms, which is of great significance for underwater object detection.
{"title":"Underwater Object Detection Based on Improved Single Shot MultiBox Detector","authors":"Zhongyun Jiang, Rong-Sheng Wang","doi":"10.1145/3446132.3446170","DOIUrl":"https://doi.org/10.1145/3446132.3446170","url":null,"PeriodicalId":125388,"journal":{"name":"Proceedings of the 2020 3rd International Conference on Algorithms, Computing and Artificial Intelligence","volume":"120 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120960456","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Based on a time series dataset of subjects' keyboard typing, a long short-term memory (LSTM) network model was developed to predict early-stage Parkinson's disease. The training and test results show that the area under the ROC curve (AUC) is 0.82, the accuracy (ACC) is 0.84, the precision (PRE) is 0.85, the recall (REC) is 0.98, and the F1 score is 0.90. This indicates that the LSTM prediction model can obtain high accuracy, precision and sensitivity by automatically extracting characteristics from keyboard typing time series data.
{"title":"An application of LSTM prediction model based on keystroke data","authors":"O. Min, Zhang Wei, Zhou Nian, Xie Su","doi":"10.1145/3446132.3446191","DOIUrl":"https://doi.org/10.1145/3446132.3446191","url":null,"PeriodicalId":125388,"journal":{"name":"Proceedings of the 2020 3rd International Conference on Algorithms, Computing and Artificial Intelligence","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115646282","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
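The F1 score reported in the keystroke abstract above is the harmonic mean of precision and recall. As a quick check under that standard definition, the sketch below plugs in the reported precision (0.85) and recall (0.98). The function name `f1_score` is only illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values as reported in the abstract.
f1 = f1_score(0.85, 0.98)
print(round(f1, 4))
```

This gives roughly 0.91, close to the reported 0.90. The small gap may come from rounding or from how the metrics were averaged, which the abstract does not state.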
The meaning of the same word or sentence is likely to change in different semantic contexts, which makes it hard for general-purpose translation systems to maintain stable performance across different domains. Therefore, domain adaptation is an essential research topic in Neural Machine Translation practice. In order to efficiently train translation models for different domains, in this work we take the Tibetan-Chinese general translation model as the parent model and obtain two domain-specific Tibetan-Chinese translation models with small-scale in-domain data. The empirical results indicate that the method is an effective approach for domain adaptation in low-resource scenarios, resulting in better BLEU scores as well as faster training than our general baseline models.
{"title":"Domain Adaptation for Tibetan-Chinese Neural Machine Translation","authors":"Maoxian Zhou, Jia Secha, Rangjia Cai","doi":"10.1145/3446132.3446404","DOIUrl":"https://doi.org/10.1145/3446132.3446404","url":null,"PeriodicalId":125388,"journal":{"name":"Proceedings of the 2020 3rd International Conference on Algorithms, Computing and Artificial Intelligence","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132668561","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cunchao Zhu, Guangquan Cheng, Yang Ma, Jiuyao Jiang, M. Wang, Tingfei Huang
Link prediction is an important application in complex networks. It predicts existing but undiscovered associations or possible future relationships in the network. However, real-life networks contain considerable noise. The networks we observe are incomplete or redundant, which interferes with link prediction. This paper summarizes and constructs four kinds of common noise in social networks, then analyzes the robustness of traditional link prediction methods and of methods based on network representation under different kinds and degrees of noise on multiple social networks. The experimental results confirm that algorithms using local network properties have higher link prediction accuracy, while methods based on global properties are more robust. CCS CONCEPTS • Networks∼Network performance evaluation∼Network performance analysis • Networks∼Network performance evaluation∼Network experimentation • Networks∼Network performance evaluation∼Network performance modeling
{"title":"Robustness analysis of noise network link prediction","authors":"Cunchao Zhu, Guangquan Cheng, Yang Ma, Jiuyao Jiang, M. Wang, Tingfei Huang","doi":"10.1145/3446132.3446143","DOIUrl":"https://doi.org/10.1145/3446132.3446143","url":null,"PeriodicalId":125388,"journal":{"name":"Proceedings of the 2020 3rd International Conference on Algorithms, Computing and Artificial Intelligence","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130705716","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
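To make the idea of local-property link prediction under noise in the abstract above more concrete, here is a small sketch using two classic local scores, common neighbors and the Jaccard coefficient. The toy graph is hypothetical, and deleting one observed edge is only one assumed form of noise. The abstract does not list its four noise types or the specific predictors it evaluates.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbors(adj, u, v):
    """Number of neighbors shared by u and v."""
    return len(adj.get(u, set()) & adj.get(v, set()))


def jaccard(adj, u, v):
    """Shared neighbors divided by the size of the combined neighborhood."""
    nu, nv = adj.get(u, set()), adj.get(v, set())
    union = nu | nv
    return len(nu & nv) / len(union) if union else 0.0


# Hypothetical toy network; we score the unobserved pair (a, d).
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d")]
clean = build_adjacency(edges)

# Assumed noise model: one true edge is missing from the observed network.
noisy = build_adjacency([e for e in edges if e != ("c", "d")])

clean_cn = common_neighbors(clean, "a", "d")
noisy_cn = common_neighbors(noisy, "a", "d")
clean_jc = jaccard(clean, "a", "d")
noisy_jc = jaccard(noisy, "a", "d")
print(clean_cn, noisy_cn, clean_jc, noisy_jc)
```

A single missing edge halves both scores for this pair, which illustrates why predictors built on local neighborhoods can be sensitive to noise.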
Xin Shi, Xiaoyang Zeng, Jie Wu, Mengshu Hou, Hao Zhu
Extracting valuable information from text has always been a hot research topic, and event detection is an essential subtask of information extraction. Most existing event detection methods focus only on sentence-level information and do not consider the correlation between different event types. To address these problems, in this paper we propose a novel event detection framework based on a pre-trained language model, named CFEE, that utilizes document-level information and event correlation to enhance the event detection task. To obtain event correlation, we project all event types into a shared semantic space through a Skip-gram model, where the event correlation can be represented as the distance between event embeddings. In order to capture document-level information, we utilize a bidirectional recurrent neural network to fuse the context information. Experiments on the ACE2005 dataset demonstrate that our proposed model is better than most existing methods, and also demonstrate the effectiveness of event correlation and document-level information.
{"title":"Context Event Features and Event Embedding Enhanced Event Detection","authors":"Xin Shi, Xiaoyang Zeng, Jie Wu, Mengshu Hou, Hao Zhu","doi":"10.1145/3446132.3446397","DOIUrl":"https://doi.org/10.1145/3446132.3446397","url":null,"PeriodicalId":125388,"journal":{"name":"Proceedings of the 2020 3rd International Conference on Algorithms, Computing and Artificial Intelligence","volume":"14 2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131751366","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
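The event detection abstract above represents event correlation as a distance between event embeddings. The sketch below shows one way such a comparison could look, using cosine similarity. The choice of cosine is an assumption, since the abstract does not name the distance measure. The three-dimensional vectors are invented for illustration and are not trained Skip-gram embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical embeddings for three event types (values are made up).
event_embeddings = {
    "Attack": [0.9, 0.1, 0.0],
    "Die": [0.8, 0.3, 0.1],
    "Marry": [0.0, 0.2, 0.9],
}

attack_die = cosine_similarity(event_embeddings["Attack"], event_embeddings["Die"])
attack_marry = cosine_similarity(event_embeddings["Attack"], event_embeddings["Marry"])
print(attack_die > attack_marry)
```

With these made-up vectors, "Attack" sits closer to "Die" than to "Marry", which is the kind of signal an event-correlation feature is meant to capture.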