News headlines capture the key idea of news articles published in online news media and are a rich resource for discovering news concepts and their relationships. Moreover, the temporal information associated with headlines can be used to capture the temporal dynamics of news concepts and their relationships, which facilitates the development of many time-aware news analytics applications. Existing work on news data analytics has mostly dealt with full news articles, but none of it has examined the usefulness of news headlines for news data analytics research. In this paper, we analyze the potential of news headlines for inferring interesting facts about the news world. We show how headlines can help capture the temporal dynamics of news concepts and their relationships. We introduce the notion of a Time-aware News Concept Graph to capture these dynamics and show how it opens the door to numerous news analytics applications. The results of our analysis agree with real-world facts and support the effectiveness of our approach.
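The abstract does not spell out how the Time-aware News Concept Graph is built, so the following is only a minimal sketch of one plausible reading: concepts are nodes, two concepts share an edge when they co-occur in a headline, and each edge stores a count per time bucket. The tokenizer, the stop-word list, the monthly bucket labels, and the function names `extract_concepts`, `build_time_aware_graph`, and `edge_timeline` are all assumptions made for illustration, not the authors' method.

```python
from collections import defaultdict
from itertools import combinations

# Assumption: lowercase word tokens minus a tiny stop-word list stand in for
# whatever concept extraction the paper actually uses.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "to", "for",
              "and", "as", "at", "by", "with"}


def extract_concepts(headline):
    """Return the set of candidate concepts found in one headline."""
    tokens = (word.strip(".,:;!?'\"()").lower() for word in headline.split())
    return {token for token in tokens if token and token not in STOP_WORDS}


def build_time_aware_graph(dated_headlines):
    """Map each concept pair to {time bucket: co-occurrence count}.

    dated_headlines is an iterable of (bucket, headline) pairs, where bucket
    is any hashable time label such as "2014-09" (hypothetical granularity).
    """
    graph = defaultdict(lambda: defaultdict(int))
    for bucket, headline in dated_headlines:
        concepts = sorted(extract_concepts(headline))
        # Sorting first makes every pair (a, b) satisfy a < b, so each
        # undirected edge has exactly one key.
        for a, b in combinations(concepts, 2):
            graph[(a, b)][bucket] += 1
    return graph


def edge_timeline(graph, a, b):
    """Return the per-bucket counts for the edge between concepts a and b."""
    key = tuple(sorted((a, b)))
    return dict(graph.get(key, {}))
```

Calling `edge_timeline(graph, "apple", "iphone")` on a graph built from dated headlines returns the per-bucket counts for that pair, which is the kind of signal a time-aware application could track. Raw co-occurrence counts are the simplest edge weight; a normalized association score would be a natural substitute.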
{"title":"News Headlines: What They Can Tell Us?","authors":"S. Mazumder, Bazir Bishnoi, D. Patel","doi":"10.1145/2662117.2662121","DOIUrl":"https://doi.org/10.1145/2662117.2662121","url":null,"abstract":"News headlines represent the key idea of news articles published in online news media and act as a great resource for discovering news concepts and their relationships. Moreover, the temporal information associated with the news headlines can be utilized to capture the temporal dynamics of the news concepts and their relationships which facilitates the development of many time-aware news analytics applications. Existing works on news data analytics have mostly dealt with news articles, but none of them has talked about the usefulness of news headlines in news data analytics research. In this paper, we analyze the potentiality of news headlines in inferring interesting facts of the news world. We show how news headlines can help us to capture the temporal dynamics of the news concepts and their relationships. We introduce the notion of Time-aware News Concept Graph to capture the said temporal dynamics and show how it opens the doorway of developing numerous interesting news analytics applications. The results of our analysis conform to the facts of the reality and advocate for the success of our effort.","PeriodicalId":358827,"journal":{"name":"Proceedings of the 6th IBM Collaborative Academia Research Exchange Conference (I-CARE) on I-CARE 2014 - I-CARE 2014","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133213952","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Most predictive models built for binary decision problems compute a real-valued score as an intermediate step and then apply a threshold to this score to make a final decision. Conventionally, the threshold is chosen to optimize a desired performance metric (such as accuracy, F-score, precision@k, or recall@k) on the training set. In practice, however, the same threshold often yields sub-optimal performance when applied to a test set because of drift in the test distribution. In this work we propose a method that adaptively changes the threshold so that the optimal performance achieved on the training set is maintained. The method is completely unsupervised: it fits a parametric mixture model to the test scores and chooses the threshold that optimizes the performance metric computed from the corresponding parametric approximation.
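The idea of fitting a mixture to unlabeled test scores and picking the threshold from the fitted model can be sketched as follows. The abstract only says "parametric mixture model", so several choices here are assumptions: a two-component Gaussian mixture fitted by EM, the component with the higher mean treated as the positive class, F1 as the metric, and a grid search over candidate thresholds. The function names are hypothetical.

```python
import math


def normal_pdf(x, mean, std):
    """Density of a Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def normal_sf(x, mean, std):
    """P(X > x) for a Gaussian, i.e. the mass scored above threshold x."""
    return 0.5 * math.erfc((x - mean) / (std * math.sqrt(2.0)))


def fit_two_gaussians(scores, iterations=100):
    """EM for a 1-D two-component Gaussian mixture (an assumed model).

    Returns (pi_pos, (mu_neg, sd_neg), (mu_pos, sd_pos)), where the positive
    component is taken to be the one with the higher mean.
    """
    ordered = sorted(scores)
    n = len(ordered)
    half = n // 2
    # Initialise the two means from the lower and upper halves of the scores.
    mu_neg = sum(ordered[:half]) / half
    mu_pos = sum(ordered[half:]) / (n - half)
    mean_all = sum(ordered) / n
    sd_all = max(math.sqrt(sum((s - mean_all) ** 2 for s in ordered) / n), 1e-6)
    sd_neg = sd_pos = sd_all
    pi_pos = 0.5
    for _ in range(iterations):
        # E-step: posterior probability that each score is from the positive component.
        resp = []
        for s in scores:
            p_pos = pi_pos * normal_pdf(s, mu_pos, sd_pos)
            p_neg = (1.0 - pi_pos) * normal_pdf(s, mu_neg, sd_neg)
            total = p_pos + p_neg
            resp.append(p_pos / total if total > 0 else 0.5)
        # M-step: re-estimate weights, means and standard deviations.
        n_pos = sum(resp)
        n_neg = n - n_pos
        if n_pos < 1e-9 or n_neg < 1e-9:
            break
        mu_pos = sum(r * s for r, s in zip(resp, scores)) / n_pos
        mu_neg = sum((1 - r) * s for r, s in zip(resp, scores)) / n_neg
        sd_pos = max(math.sqrt(sum(r * (s - mu_pos) ** 2
                                   for r, s in zip(resp, scores)) / n_pos), 1e-6)
        sd_neg = max(math.sqrt(sum((1 - r) * (s - mu_neg) ** 2
                                   for r, s in zip(resp, scores)) / n_neg), 1e-6)
        pi_pos = n_pos / n
    if mu_pos < mu_neg:  # keep the "higher mean = positive" convention
        mu_pos, mu_neg, sd_pos, sd_neg = mu_neg, mu_pos, sd_neg, sd_pos
        pi_pos = 1.0 - pi_pos
    return pi_pos, (mu_neg, sd_neg), (mu_pos, sd_pos)


def parametric_f1(threshold, pi_pos, neg, pos):
    """F1 implied by the fitted mixture when predicting positive above threshold."""
    tp = pi_pos * normal_sf(threshold, *pos)
    fp = (1.0 - pi_pos) * normal_sf(threshold, *neg)
    denom = tp + fp + pi_pos  # equals 2*TP + FP + FN, since TP + FN = pi_pos
    return 2.0 * tp / denom if denom > 0 else 0.0


def adapt_threshold(test_scores, grid_size=200):
    """Pick the threshold that maximises the mixture-based F1 estimate."""
    pi_pos, neg, pos = fit_two_gaussians(test_scores)
    lo, hi = min(test_scores), max(test_scores)
    grid = [lo + (hi - lo) * i / (grid_size - 1) for i in range(grid_size)]
    return max(grid, key=lambda t: parametric_f1(t, pi_pos, neg, pos))
```

No labels are used anywhere, which matches the unsupervised setting described above. If the test scores are shifted by a constant, the fitted components and the chosen threshold shift with them. Whether a Gaussian mixture suits a given model's scores is a separate question that this sketch does not address.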
{"title":"Decisions under drift: Adapting binary decision thresholds to drifts in test distribution","authors":"Sachin Kumar, V. Raykar, Priyanka Agrawal","doi":"10.1145/2662117.2662134","DOIUrl":"https://doi.org/10.1145/2662117.2662134","url":null,"abstract":"Most predictive models built for binary decision problems compute a real valued score as an intermediate step and then apply a threshold on this score to make a final decision. Conventionally, the threshold is chosen which optimizes a desired performance metric (such as accuracy, F-score, precision@k, recall@k, etc.) on the training set. However very often in practice it so happens that the same threshold when applied to a test set, results in a sub-optimal performance because of drift in test distribution. In this work we propose a method that adaptively changes the threshold such that the optimal performance achieved on the training set is maintained. The method is completely unsupervised and is based on fitting a parametric mixture model to the test scores and choosing the threshold that optimizes a performance metric based on the corresponding parametric approximation.","PeriodicalId":358827,"journal":{"name":"Proceedings of the 6th IBM Collaborative Academia Research Exchange Conference (I-CARE) on I-CARE 2014 - I-CARE 2014","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133444073","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}