Title: Citation sentence identification and classification for related work summarization
Authors: D. H. Widyantoro, Imaduddin Amin
Venue: 2014 International Conference on Advanced Computer Science and Information System
Publication date: 2014-10-01
DOI: 10.1109/ICACSIS.2014.7065871
URL: https://doi.org/10.1109/ICACSIS.2014.7065871
Citation count: 19
Abstract
Scientific article summarization is an important problem because it can help researchers, particularly those starting a new research topic. In this paper, we address the problem of related work summarization from scientific papers. The summarization process comprises extracting citation sentences and then classifying the rhetorical category of each citation sentence. Citation sentence extraction is performed by combining regular expression-based patterns, a co-reference system, an evidence-based approach and an additional extraction rule. Each citation sentence is represented as a feature vector containing term frequency, sentence length, thematic word and cue phrase feature groups. Classification models are learned using Naïve Bayes, Complement Naïve Bayes and Decision Tree. Experimental results reveal that the approaches adopted for citation sentence extraction and rhetorical category classification are promising and provide the groundwork for related work summarization.
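As a rough illustration of the pipeline described in the abstract, the following Python sketch shows how regular expression-based citation sentence extraction and a simple feature representation might look. It is not the authors' implementation: the abstract does not give the actual patterns, cue phrases, or extraction rules, so the names CITATION_PATTERNS, CUE_PHRASES, split_sentences, find_citation_sentences and sentence_features are all assumptions made for illustration. The co-reference system, the evidence-based approach and the thematic word features are not modeled here.

```python
import re
from collections import Counter

# Hypothetical citation-marker patterns (assumed; the paper's patterns are not given).
CITATION_PATTERNS = [
    re.compile(r"\[\d+(?:\s*[,-]\s*\d+)*\]"),                       # [3], [1, 2], [4-6]
    re.compile(r"\([A-Z][A-Za-z-]+(?: et al\.)?,? \d{4}[a-z]?\)"),   # (Smith, 2010)
    re.compile(r"\b[A-Z][A-Za-z-]+(?: et al\.)? \(\d{4}[a-z]?\)"),   # Smith et al. (2010)
]

# Hypothetical cue phrases (assumed; chosen only to make the example concrete).
CUE_PHRASES = ("proposed", "extend", "however", "in contrast")


def split_sentences(text):
    """Naive splitter: break after ., ! or ? when an uppercase letter follows."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+(?=[A-Z])", text) if s.strip()]


def find_citation_sentences(text):
    """Return the sentences that contain at least one citation marker."""
    return [
        sentence
        for sentence in split_sentences(text)
        if any(pattern.search(sentence) for pattern in CITATION_PATTERNS)
    ]


def sentence_features(sentence):
    """Build a small feature dictionary for one citation sentence."""
    lowered = sentence.lower()
    tokens = re.findall(r"[a-z]+", lowered)  # alphabetic tokens only
    return {
        "term_frequency": Counter(tokens),
        "sentence_length": len(tokens),
        "cue_phrase_hits": sum(1 for cue in CUE_PHRASES if cue in lowered),
    }


if __name__ == "__main__":
    sample = (
        "Summarization has a long history. "
        "Teufel et al. (2002) proposed rhetorical zoning. "
        "We extend this idea [3, 4]. "
        "Our corpus is small."
    )
    for sentence in find_citation_sentences(sample):
        print(sentence, sentence_features(sentence)["sentence_length"])
```

Feature dictionaries of this kind could then be vectorized and passed to a classifier such as Naïve Bayes or a decision tree to predict the rhetorical category, although the feature definitions used in the paper may differ from the ones assumed here.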