Maria Skeppstedt, Magnus Ahltorp, Kostiantyn Kucher, Matts Lindström
{"title":"从文字云到文字雨:重温经典词云,将气候变化文本可视化","authors":"Maria Skeppstedt, Magnus Ahltorp, Kostiantyn Kucher, Matts Lindström","doi":"10.1177/14738716241236188","DOIUrl":null,"url":null,"abstract":"Word Rain is a development of the classic word cloud. It addresses some of the limitations of word clouds, in particular the lack of a semantically motivated positioning of the words, and the use of font size as a sole indicator of word prominence. Word Rain uses the semantic information encoded in a distributional semantics-based language model – reduced into one dimension – to position the words along the x-axis. Thereby, the horizontal positioning of the words reflects semantic similarity. Font size is still used to signal word prominence, but this signal is supplemented with a bar chart, as well as with the position of the words on the y-axis. We exemplify the use of Word Rain by three concrete visualization tasks, applied on different real-world texts and document collections on climate change. In these case studies, word2vec models, reduced to one dimension with t-SNE, are used to encode semantic similarity, and TF-IDF is used for measuring word prominence. We evaluate the technique further by carrying out domain expert reviews.","PeriodicalId":50360,"journal":{"name":"Information Visualization","volume":"9 1","pages":""},"PeriodicalIF":1.8000,"publicationDate":"2024-03-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"From word clouds to Word Rain: Revisiting the classic word cloud to visualize climate change texts\",\"authors\":\"Maria Skeppstedt, Magnus Ahltorp, Kostiantyn Kucher, Matts Lindström\",\"doi\":\"10.1177/14738716241236188\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Word Rain is a development of the classic word cloud. It addresses some of the limitations of word clouds, in particular the lack of a semantically motivated positioning of the words, and the use of font size as a sole indicator of word prominence. Word Rain uses the semantic information encoded in a distributional semantics-based language model – reduced into one dimension – to position the words along the x-axis. Thereby, the horizontal positioning of the words reflects semantic similarity. Font size is still used to signal word prominence, but this signal is supplemented with a bar chart, as well as with the position of the words on the y-axis. We exemplify the use of Word Rain by three concrete visualization tasks, applied on different real-world texts and document collections on climate change. In these case studies, word2vec models, reduced to one dimension with t-SNE, are used to encode semantic similarity, and TF-IDF is used for measuring word prominence. We evaluate the technique further by carrying out domain expert reviews.\",\"PeriodicalId\":50360,\"journal\":{\"name\":\"Information Visualization\",\"volume\":\"9 1\",\"pages\":\"\"},\"PeriodicalIF\":1.8000,\"publicationDate\":\"2024-03-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Information Visualization\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1177/14738716241236188\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, SOFTWARE ENGINEERING\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information Visualization","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1177/14738716241236188","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}
引用次数: 0
摘要
词雨是对经典词云的发展。它解决了词云的一些局限性问题,特别是缺乏以语义为基础的词定位,以及将字体大小作为衡量词突出度的唯一指标。字雨使用基于分布语义的语言模型中编码的语义信息(简化为一个维度)来沿 x 轴定位词语。因此,词语的水平定位反映了语义的相似性。字体大小仍用于表示单词的显著性,但这一信号通过条形图以及单词在 y 轴上的位置得到了补充。我们通过三个具体的可视化任务来示范 "词雨 "的使用,这些任务应用于不同的真实文本和有关气候变化的文档集。在这些案例研究中,用 t-SNE 将 word2vec 模型缩减到一个维度来编码语义相似性,用 TF-IDF 来测量词的显著性。我们通过进行领域专家评审来进一步评估该技术。
From word clouds to Word Rain: Revisiting the classic word cloud to visualize climate change texts
Word Rain is a development of the classic word cloud. It addresses some of the limitations of word clouds, in particular the lack of a semantically motivated positioning of the words, and the use of font size as a sole indicator of word prominence. Word Rain uses the semantic information encoded in a distributional semantics-based language model – reduced into one dimension – to position the words along the x-axis. Thereby, the horizontal positioning of the words reflects semantic similarity. Font size is still used to signal word prominence, but this signal is supplemented with a bar chart, as well as with the position of the words on the y-axis. We exemplify the use of Word Rain by three concrete visualization tasks, applied on different real-world texts and document collections on climate change. In these case studies, word2vec models, reduced to one dimension with t-SNE, are used to encode semantic similarity, and TF-IDF is used for measuring word prominence. We evaluate the technique further by carrying out domain expert reviews.
期刊介绍:
Information Visualization is essential reading for researchers and practitioners of information visualization and is of interest to computer scientists and data analysts working on related specialisms. This journal is an international, peer-reviewed journal publishing articles on fundamental research and applications of information visualization. The journal acts as a dedicated forum for the theories, methodologies, techniques and evaluations of information visualization and its applications.
The journal is a core vehicle for developing a generic research agenda for the field by identifying and developing the unique and significant aspects of information visualization. Emphasis is placed on interdisciplinary material and on the close connection between theory and practice.
This journal is a member of the Committee on Publication Ethics (COPE).