{"title":"Ensemble Learning Approach for Clickbait Detection Using Article Headline Features","authors":"Dilip Singh Sisodia","doi":"10.28945/4279","DOIUrl":null,"url":null,"abstract":"Aim/Purpose: The aim of this paper is to propose an ensemble learners based classification model for classification clickbaits from genuine article headlines.\n\nBackground: Clickbaits are online articles with deliberately designed misleading titles for luring more and more readers to open the intended web page. Clickbaits are used to tempted visitors to click on a particular link either to monetize the landing page or to spread the false news for sensationalization. The presence of clickbaits on any news aggregator portal may lead to an unpleasant experience for readers. Therefore, it is essential to distinguish clickbaits from authentic headlines to mitigate their impact on readers’ perception.\n\nMethodology: A total of one hundred thousand article headlines are collected from news aggregator sites consists of clickbaits and authentic news headlines. The collected data samples are divided into five training sets of balanced and unbalanced data. The natural language processing techniques are used to extract 19 manually selected features from article headlines. \n\nContribution: Three ensemble learning techniques including bagging, boosting, and random forests are used to design a classifier model for classifying a given headline into the clickbait or non-clickbait. The performances of learners are evaluated using accuracy, precision, recall, and F-measures.\n\nFindings: It is observed that the random forest classifier detects clickbaits better than the other classifiers with an accuracy of 91.16 %, a total precision, recall, and f-measure of 91 %.","PeriodicalId":39754,"journal":{"name":"Informing Science","volume":"15 1","pages":"31-44"},"PeriodicalIF":0.0000,"publicationDate":"2019-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Informing Science","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.28945/4279","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 10
Abstract
Aim/Purpose: The aim of this paper is to propose an ensemble learners based classification model for classification clickbaits from genuine article headlines.
Background: Clickbaits are online articles with deliberately designed misleading titles for luring more and more readers to open the intended web page. Clickbaits are used to tempted visitors to click on a particular link either to monetize the landing page or to spread the false news for sensationalization. The presence of clickbaits on any news aggregator portal may lead to an unpleasant experience for readers. Therefore, it is essential to distinguish clickbaits from authentic headlines to mitigate their impact on readers’ perception.
Methodology: A total of one hundred thousand article headlines are collected from news aggregator sites consists of clickbaits and authentic news headlines. The collected data samples are divided into five training sets of balanced and unbalanced data. The natural language processing techniques are used to extract 19 manually selected features from article headlines.
Contribution: Three ensemble learning techniques including bagging, boosting, and random forests are used to design a classifier model for classifying a given headline into the clickbait or non-clickbait. The performances of learners are evaluated using accuracy, precision, recall, and F-measures.
Findings: It is observed that the random forest classifier detects clickbaits better than the other classifiers with an accuracy of 91.16 %, a total precision, recall, and f-measure of 91 %.
期刊介绍:
The academically peer refereed journal Informing Science endeavors to provide an understanding of the complexities in informing clientele. Fields from information systems, library science, journalism in all its forms to education all contribute to this science. These fields, which developed independently and have been researched in separate disciplines, are evolving to form a new transdiscipline, Informing Science.