A study on the common words found in different literary Romanian corpora
A. Mitrea, A. Vlad, Octavian Hodea, R. Dragomir
2014 10th International Conference on Communications (COMM), 2014-07-28
DOI: 10.1109/ICCOMM.2014.6866729 (https://doi.org/10.1109/ICCOMM.2014.6866729)
Citations: 3
Abstract
The experimental analysis focused on a corpus of 107 literary works (novels and short stories) comprising about 12.7 million words, obtained by combining two different collections of writings of similar length. The main objective was to identify the common words at three levels: those occurring in each of the books constituting the corpus as a whole, those occurring in each of the sub-corpora organized per author, and those occurring in each sub-collection of texts.
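
The abstract does not say how the common words were extracted. One plausible reading, offered only as a minimal sketch, is that a word counts as "common" to a group of texts when it appears in every text of that group, which amounts to intersecting the vocabularies. The tokenizer, the function names `vocabulary` and `common_words`, and the toy sentences below are all assumptions made for illustration; none of them come from the paper.

```python
import re

# Assumed tokenizer: runs of Unicode letters, so Romanian diacritics are kept.
TOKEN = re.compile(r"[^\W\d_]+")

def vocabulary(text):
    """Return the set of distinct lowercase word forms in a text."""
    return set(TOKEN.findall(text.lower()))

def common_words(texts):
    """Return the words that occur in every text of the collection."""
    vocabularies = [vocabulary(t) for t in texts]
    if not vocabularies:
        return set()
    return set.intersection(*vocabularies)

# Toy stand-in data grouped per (hypothetical) author, not the paper's corpus.
books = {
    "author_a": ["el a venit de la mare", "nu a plecat de la el"],
    "author_b": ["ea a venit cu el", "de la ea a primit un dar"],
}

# Words shared by all books of each author sub-corpus.
per_author = {author: common_words(texts) for author, texts in books.items()}

# Words shared by every book in the whole corpus.
whole_corpus = common_words([t for texts in books.values() for t in texts])
```

The same `common_words` call can be applied to each sub-collection by passing in only that sub-collection's books. On a real corpus the result would depend heavily on tokenization and on whether inflected forms are treated as distinct words, choices the abstract leaves unspecified.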