Writing with Style: Venue Classification
Zaihan Yang, Brian D. Davison
2012 11th International Conference on Machine Learning and Applications
Published: 2012-12-12
DOI: https://doi.org/10.1109/ICMLA.2012.50
Citations: 5
Abstract
As early as the late nineteenth century, scientists began research in author attribution, mostly by identifying the writing styles of authors. Subsequent research over more than a century has repeatedly demonstrated that people tend to have distinguishable writing styles. Today we not only have more authors, but also many different kinds of publications (journals, conferences, workshops, and so on) covering different topics and requiring different writing formats. Despite successful research in author attribution, no work has examined whether publication venues are similarly distinguishable by their writing styles. Our work takes a first step toward exploring this problem. Treating it as a traditional classification task, we extract three types of writing-style-based features and carry out detailed experiments examining the impact of different features and classification techniques, as well as the influence of venue content, topics, and genres. Experiments on real data from the ACM and CiteSeer digital libraries show that our approach is effective at distinguishing venues by their writing styles.
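
To make the classification setup concrete, the following is a minimal sketch of style-based venue classification. It is illustrative only: the abstract does not name its three feature types or its classifiers, so the features used here (average word length, average sentence length, and function-word rates), the `FUNCTION_WORDS` list, and the nearest-centroid classifier are all assumptions rather than the authors' method.

```python
import re
from collections import Counter

# Hypothetical function-word list; the abstract does not specify the features used.
FUNCTION_WORDS = ("the", "of", "and", "we", "in", "to")


def style_features(text):
    """Return a small stylometric feature vector for one document."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text.lower())
    if not words or not sentences:
        return [0.0] * (2 + len(FUNCTION_WORDS))
    counts = Counter(words)
    avg_word_len = sum(len(w) for w in words) / len(words)
    avg_sent_len = len(words) / len(sentences)
    # Relative frequency of each function word in the document.
    fw_rates = [counts[w] / len(words) for w in FUNCTION_WORDS]
    return [avg_word_len, avg_sent_len] + fw_rates


def train_centroids(labeled_docs):
    """Average the feature vectors of each venue's documents.

    labeled_docs is an iterable of (venue, text) pairs.
    """
    sums, n = {}, Counter()
    for venue, text in labeled_docs:
        vec = style_features(text)
        acc = sums.setdefault(venue, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        n[venue] += 1
    return {venue: [v / n[venue] for v in acc] for venue, acc in sums.items()}


def predict_venue(centroids, text):
    """Assign the venue whose centroid is closest in squared Euclidean distance."""
    vec = style_features(text)

    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(vec, centroid))

    return min(centroids, key=lambda venue: dist(centroids[venue]))
```

In practice the features would need to be normalized to comparable scales, and a standard classifier trained on many papers per venue would replace the nearest-centroid rule; the sketch only shows how writing-style features, rather than topical content, can drive the venue prediction.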