Finding Sentiment Dimension in Vector Space of Movie Reviews: An Unsupervised Approach
Youngsam Kim, Hyopil Shin
The Journal of Cognitive Science, 2017-03-01
DOI: 10.17791/JCS.2017.18.1.85 (https://doi.org/10.17791/JCS.2017.18.1.85)
Citations: 4
Abstract
This study proposes an unsupervised method for finding the sentiment orientations of words in Korean movie reviews. The orientations are represented as real values on a sentiment dimension, which is derived from a high-dimensional vector space built from the movie reviews. To search for this dimension, Pointwise Mutual Information is first used to select a set of words that are close to common modifiers; phrases composed of these words often form good/bad associations (e.g., “good acting”, “terrible acting”). A neural language model (Word2Vec) is then used to calculate the pairwise similarity distances between the chosen words, and dimensionality reduction algorithms (e.g., PCA, MDS) are employed to find the axis of the sentiment orientations. Finally, the performance of the method is measured by unsupervised classification of two movie review datasets based on the orientation values. According to the results, the best accuracy reaches 66% and 76% on the two datasets, respectively.
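The two core steps described above (scoring word associations with PMI, then reducing word vectors to a single axis) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy two-dimensional vectors, and the counts are all hypothetical, and PCA is applied directly to the vectors rather than to Word2Vec similarity distances as the abstract describes.

```python
import math
import numpy as np

def pmi(count_xy, count_x, count_y, total):
    """Pointwise mutual information: log2( p(x, y) / (p(x) * p(y)) )."""
    return math.log2((count_xy * total) / (count_x * count_y))

def first_axis_scores(vectors):
    """Project word vectors onto their first principal component (PCA via SVD)."""
    X = np.asarray(vectors, dtype=float)
    X = X - X.mean(axis=0)  # center the data before PCA
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return X @ vt[0]

# Hypothetical toy vectors standing in for Word2Vec embeddings of selected words.
words = ["good", "great", "bad", "terrible"]
vectors = [[1.0, 0.2], [0.9, 0.1], [-1.0, 0.1], [-0.9, 0.2]]

# One real-valued orientation per word along the recovered axis.
scores = dict(zip(words, first_axis_scores(vectors)))
```

The sign of a principal axis is arbitrary, so which end counts as "positive" would have to be fixed separately, for example by checking where a known word such as "good" falls.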