Towards Interactive Multidimensional Visualisations for Corpus Linguistics

Paul Rayson, J. Mariani, Bryce Anderson-Cooper, Alistair Baron, David Gullick, Andrew Moore, Stephen Wattam
Journal: J. Lang. Technol. Comput. Linguistics. Published: 2016-07-01. DOI: 10.21248/jlcl.31.2016.200 (https://doi.org/10.21248/jlcl.31.2016.200)

Abstract

We propose the novel application of dynamic and interactive visualisation techniques to support the iterative and exploratory investigations typical of the corpus linguistics methodology. Very large-scale text analysis is already carried out in corpus-based language analysis using methods such as frequency profiling, keywords, concordancing, collocations and n-grams. However, at present only basic visualisation methods are utilised. In this paper, we describe case studies of multiple types of key word clouds and explorer tools for collocation networks, and we compare network and language distance visualisations for online social networks. These are shown to fit better with the iterative data-driven corpus methodology, and permit some level of scalability to cope with ever-increasing corpus size and complexity. In addition, they will allow corpus linguistic methods to be used more widely in the digital humanities and social sciences, since the learning curve with visualisations is shallower for non-experts.