Learning-to-Rank for Real-Time High-Precision Hashtag Recommendation for Streaming News
Bichen Shi, Georgiana Ifrim, N. Hurley
Proceedings of the 25th International Conference on World Wide Web, 2016-04-11
DOI: https://doi.org/10.1145/2872427.2882982
Citations: 44
Abstract
We address the problem of real-time recommendation of streaming Twitter hashtags to an incoming stream of news articles. The technical challenge can be framed as large-scale topic classification where the set of topics (i.e., hashtags) is huge and highly dynamic. Our main applications come from digital journalism, e.g., promoting original content to Twitter communities and social indexing of news to enable better retrieval and story tracking. In contrast to the state of the art, which focuses on topic modelling approaches, we propose a learning-to-rank approach for modelling hashtag relevance. This enables us to deal with the dynamic nature of the problem, since a relevance model is stable over time, while a topic model needs to be continuously retrained. We present the data collection and processing pipeline, as well as our methodology for achieving low-latency, high-precision recommendations. Our empirical results show that our method outperforms the state of the art, delivering more than 80% precision. Our techniques are implemented in a real-time system that is currently under user trial with a large news organisation.
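To illustrate why a relevance model can stay stable while the hashtag set changes, here is a minimal sketch of pointwise relevance scoring. It is not the authors' system: the two features, the `WEIGHTS` values, the `threshold` and `k` parameters, and all function names are hypothetical, chosen only to show that features describing an (article, hashtag) pair let a new hashtag be scored without retraining.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercased whitespace tokens with their counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count bags."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pair_features(article, hashtag, tweets):
    """Hypothetical features of the (article, hashtag) pair.

    Neither feature depends on the identity of the hashtag, so the
    same scoring function applies to hashtags never seen before.
    """
    article_bag = bag_of_words(article)
    tweet_bag = bag_of_words(" ".join(tweets))
    tag_word = hashtag.lstrip("#").lower()
    return [
        cosine(article_bag, tweet_bag),           # overlap with tweets using the tag
        1.0 if tag_word in article_bag else 0.0,  # tag word occurs in the article
    ]


# Assumed weights for illustration; a real model would learn these
# from labelled (article, hashtag) pairs.
WEIGHTS = [2.0, 1.0]


def recommend(article, candidates, threshold=0.5, k=3):
    """Return up to k hashtags whose score reaches the threshold, best first."""
    scored = []
    for tag, tweets in candidates.items():
        features = pair_features(article, tag, tweets)
        score = sum(w * f for w, f in zip(WEIGHTS, features))
        if score >= threshold:  # favour precision: drop weak candidates
            scored.append((score, tag))
    scored.sort(reverse=True)
    return [tag for _, tag in scored[:k]]


article = "Greek parliament votes on new bailout deal"
candidates = {
    "#greece": ["greek bailout vote tonight in parliament"],
    "#nba": ["great game last night finals"],
}
print(recommend(article, candidates))

# A hashtag that appears later is scored with the same weights.
candidates["#bailout"] = ["bailout deal talks"]
print(recommend(article, candidates))
```

The score threshold stands in for the precision-oriented filtering the abstract mentions; how the actual system trades recall for precision is not described in this abstract.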