Authors: Nidaa Ghalib Ali, N. Omar
Published in: Proceedings of the 6th International Conference on Information Technology and Multimedia, 2014-11-01
DOI: 10.1109/ICIMU.2014.7066645 (https://doi.org/10.1109/ICIMU.2014.7066645)
Arabic keyphrases extraction using a hybrid of statistical and machine learning methods
Keyphrases are single-word or multi-word lexemes that concisely and accurately describe the subject, or an aspect of the subject, discussed in a document. Manually assigning keyphrases is tedious and time-consuming, especially given the proliferation of Web content. Thus, automatic keyphrase generation systems are urgently needed. This study proposes a keyphrase extraction method that combines several existing keyphrase extraction methods with machine learning approaches (linear logistic regression, linear discriminant analysis, and support vector machines). The proposed method uses the outputs of the individual keyphrase extraction methods as input features for a machine learning algorithm, which then determines whether each term is a keyphrase. Results show that the SVM algorithm achieves the best performance, with an F1-measure of 88.31%. This score is relatively high and comparable with those of previous keyphrase extraction models for the Arabic language.
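The idea of feeding the outputs of several extractors into a classifier can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not name the base extraction methods, so the two scorers (`tf_score`, `position_score`) are hypothetical stand-ins, tokens are assumed to be pre-split on whitespace with no Arabic-specific preprocessing, and only the logistic regression variant is shown, trained with plain stochastic gradient descent rather than whatever implementation the authors used.

```python
import math


def _first_match(words, tokens):
    """Return the index of the first occurrence of `words` in `tokens`, or -1."""
    n = len(words)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == words:
            return i
    return -1


def tf_score(candidate, tokens):
    """Hypothetical base scorer 1: occurrences of the phrase per document token."""
    words = candidate.split()
    n = len(words)
    hits = sum(1 for i in range(len(tokens) - n + 1) if tokens[i:i + n] == words)
    return hits / max(len(tokens), 1)


def position_score(candidate, tokens):
    """Hypothetical base scorer 2: phrases that appear earlier score higher."""
    i = _first_match(candidate.split(), tokens)
    return 0.0 if i < 0 else 1.0 - i / len(tokens)


def features(candidate, tokens):
    """Stack the base scorers' outputs into one feature vector per candidate."""
    return [tf_score(candidate, tokens), position_score(candidate, tokens)]


def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression weights with stochastic gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - yi  # gradient of the log loss with respect to z
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def predict(model, x):
    """Label a candidate 1 (keyphrase) when the linear score is non-negative."""
    w, b = model
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else 0


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

In use, one would call `features` on every candidate phrase of each labeled document, pass the resulting vectors and 0/1 labels to `train_logistic`, and evaluate held-out predictions with `f1_score`. Swapping in linear discriminant analysis or an SVM would change only the training and prediction steps, not the feature stacking.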