Sai Satyanarayana Reddy Seelam, Shrawan Kumar, Chand M Gopi, Reddy T. Raghunadha
A New Term Weight Measure for Gender and Age Prediction of the Authors by Analyzing Their Written Texts
2018 IEEE 8th International Advance Computing Conference (IACC), published 2018-12-01
DOI: 10.1109/IADCC.2018.8692092 (https://doi.org/10.1109/IADCC.2018.8692092)
Citations: 2
Abstract
The Internet is growing rapidly, with huge amounts of data generated mainly through social media. Most of the text on the World Wide Web is anonymous. Recently, identifying details about the authors of anonymous text has become an active research area. Author Profiling is one such area that has attracted several researchers. It is a technique for predicting demographic characteristics of authors, such as gender, age, and location, by analyzing their written texts. Stylometry is one field that researchers draw on to discriminate between authors' writing styles. In Author Profiling approaches, researchers have proposed various types of stylistic features to distinguish authors' styles of writing. The prediction accuracies for demographic characteristics were not satisfactory when stylometric features were used, so researchers later experimented with different types of term weight measures to improve accuracy. In this work, we concentrate on two demographic characteristics: gender and age. The experiments are performed on the English-language reviews corpus of the 2014 PAN competition. We propose a new Profile-specific Supervised Term Weight measure for predicting the gender and age of the author of an anonymous text. The experimental results of the proposed measure are compared with those of different term weight measures, and the proposed measure obtained the best results for predicting gender and age.
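The abstract does not give the formula of the proposed measure, so the sketch below is not the authors' method. It only illustrates the general idea of a supervised term weight, that is, a weight that uses the profile labels of the training documents. As an assumption, it uses a relevance-frequency-style weight, tf * log2(2 + a / max(1, b)), where a and b are the numbers of training documents inside and outside the target profile that contain the term. The function names, the profile labels, and the toy documents are all hypothetical and are not drawn from the PAN 2014 corpus.

```python
import math
from collections import Counter


def doc_frequencies(docs, labels, target):
    """Count, per term, how many documents inside and outside `target` contain it."""
    inside, outside = Counter(), Counter()
    for text, label in zip(docs, labels):
        terms = set(text.lower().split())  # each document counts a term once
        (inside if label == target else outside).update(terms)
    return inside, outside


def supervised_weights(doc, docs, labels, target):
    """Weight each term of `doc` by tf * log2(2 + a / max(1, b)).

    This is an assumed, generic class-aware weight, not the paper's measure.
    """
    inside, outside = doc_frequencies(docs, labels, target)
    tf = Counter(doc.lower().split())
    return {
        term: count * math.log2(2 + inside[term] / max(1, outside[term]))
        for term, count in tf.items()
    }


# Toy, made-up training data with hypothetical profile labels.
train_docs = [
    "great hotel lovely staff",
    "lovely room lovely view",
    "cheap hotel bad staff",
    "bad room cheap price",
]
train_labels = ["profile_a", "profile_a", "profile_b", "profile_b"]

weights = supervised_weights("lovely hotel", train_docs, train_labels, "profile_a")
for term, weight in sorted(weights.items()):
    print(f"{term}: {weight:.3f}")
```

In this toy data, "lovely" appears only in profile_a documents while "hotel" appears equally in both profiles, so "lovely" receives the larger weight. An unsupervised measure based only on term and document frequencies would not make that distinction, since it ignores the labels.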