{"title":"An Analysis of Quantitative Aspects in the Evaluation of Thematic Segmentation Algorithms","authors":"Maria Georgescul, Alexander Clark, S. Armstrong","doi":"10.3115/1654595.1654622","DOIUrl":null,"url":null,"abstract":"We consider here the task of linear the-matic segmentation of text documents, by using features based on word distributions in the text. For this task, a typical and often implicit assumption in previous studies is that a document has just one topic and therefore many algorithms have been tested and have shown encouraging results on artificial data sets, generated by putting together parts of different documents. We show that evaluation on synthetic data is potentially misleading and fails to give an accurate evaluation of the performance on real data. Moreover, we provide a critical review of existing evaluation metrics in the literature and we propose an improved evaluation metric.","PeriodicalId":426429,"journal":{"name":"SIGDIAL Workshop","volume":"42 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2009-07-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"38","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"SIGDIAL Workshop","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3115/1654595.1654622","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 38
Abstract
We consider here the task of linear the-matic segmentation of text documents, by using features based on word distributions in the text. For this task, a typical and often implicit assumption in previous studies is that a document has just one topic and therefore many algorithms have been tested and have shown encouraging results on artificial data sets, generated by putting together parts of different documents. We show that evaluation on synthetic data is potentially misleading and fails to give an accurate evaluation of the performance on real data. Moreover, we provide a critical review of existing evaluation metrics in the literature and we propose an improved evaluation metric.