Jingjing Wang, Fei Song, Kavita Walia, Jeffery Farber, R. Dara
2019 18th IEEE International Conference on Machine Learning and Applications (ICMLA), published 2019-12-01. DOI: 10.1109/ICMLA.2019.00228 (https://doi.org/10.1109/ICMLA.2019.00228)
Using Convolutional Neural Networks to Extract Keywords and Keyphrases: A Case Study for Foodborne Illnesses
Keywords and keyphrases are important for Natural Language Processing (NLP) applications such as document classification, information retrieval, and topic identification. They are also useful for capturing different classes of entities in content from fields such as healthcare, biology, food science, and journalism. Several approaches exist for extracting keywords and keyphrases, and deep learning approaches have achieved strong results on this task. Among deep learning approaches, however, the potential of Convolutional Neural Networks (CNNs) for extracting keywords and keyphrases has not been fully explored. In this work, we performed a comparative study on a benchmark dataset, the IEEE Xplore collection, to test the CNN's ability to generalize in selecting keywords and keyphrases. We also collected a corpus on foodborne illness outbreaks and used it to develop a CNN-based approach for identifying keywords and keyphrases related to foodborne illnesses. Results were compared with several supervised (KEA, GuidedLDA) and unsupervised (LDA) machine learning algorithms. The CNN outperformed these algorithms in selecting relevant keywords and keyphrases for foodborne illnesses. The findings also confirmed the superiority of the CNN-based algorithm over the other machine learning approaches for keyphrase extraction.