Developing an Incivility Dictionary for German Online Discussions – a Semi-Automated Approach Combining Human and Artificial Knowledge

IF 6.3 | Zone 1 (Literature) | Q1 COMMUNICATION | Communication Methods and Measures | Pub Date: 2023-02-05 | DOI: 10.1080/19312458.2023.2166028
Anke Stoll, L. Wilms, Marc Ziegele
Citations: 1

Developing an Incivility Dictionary for German Online Discussions – a Semi-Automated Approach Combining Human and Artificial Knowledge
ABSTRACT Incivility in online discussions has become an important issue in political communication research. Instruments and tools for the automated analysis of uncivil content, however, are rare, especially for non-English user-generated text. In this study, we present a) an extensive dictionary (DIKI - Diktionär für Inzivilität, English: Dictionary for Incivility) to detect incivility in German-language online discussions, and b) a semi-automated two-step approach that combines manual content analysis with automated keyword collection using a pre-trained word embedding model. We show that DIKI clearly outperforms comparable dictionaries that have been used as alternative instruments to measure incivility (e.g., the LIWC) as well as basic machine learning approaches to text classification. Further, we provide evidence that pre-trained word embeddings can fruitfully be employed in the explorative phase of creating dictionaries. Still, the manual evaluation of DIKI confirms that detecting complex and context-dependent forms of incivility remains challenging and that constant updates would be needed to maintain performance. Finally, the detailed documentation of the development and evaluation process of DIKI may serve as a guideline for further research. We therefore provide DIKI as a freely available instrument that will also be usable through a web interface for drag-and-drop data analysis (diki.limitedminds.org).
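The two-step idea in the abstract (manually collected seed terms, then automated keyword collection via word embeddings, followed by dictionary lookup) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration only: the toy three-dimensional vectors stand in for a real pre-trained German embedding model, the seed word and the 0.95 similarity threshold are made up, and the function names are hypothetical. This is not the authors' pipeline and the words are not taken from DIKI.

```python
import math
import re

# Toy "embeddings" (made-up values), standing in for a pre-trained
# German word embedding model. Real models have hundreds of dimensions.
TOY_VECTORS = {
    "idiot": (0.9, 0.1, 0.0),
    "trottel": (0.85, 0.15, 0.05),
    "depp": (0.8, 0.2, 0.0),
    "danke": (0.0, 0.9, 0.1),
    "wetter": (0.1, 0.1, 0.9),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def suggest_candidates(seeds, vectors, threshold=0.95):
    """Automated step: propose words whose vector is close to any seed term."""
    known_seeds = [s for s in seeds if s in vectors]
    candidates = set()
    for word, vec in vectors.items():
        if word in seeds:
            continue
        if any(cosine(vec, vectors[s]) >= threshold for s in known_seeds):
            candidates.add(word)
    return candidates


def is_uncivil(comment, dictionary):
    """Dictionary lookup: flag a comment if any token is a dictionary entry."""
    tokens = re.findall(r"\w+", comment.lower())
    return any(token in dictionary for token in tokens)


# Manual step (assumed): a human coder supplies seed terms from content analysis.
seeds = {"idiot"}
candidates = suggest_candidates(seeds, TOY_VECTORS)

# In a real workflow a human would review the candidates before accepting them;
# here they are accepted as-is to keep the sketch short.
dictionary = seeds | candidates

flag_insult = is_uncivil("Du bist ein Trottel!", dictionary)
flag_polite = is_uncivil("Danke für das schöne Wetter", dictionary)
```

A plain token lookup like this misses context-dependent incivility (irony, implicit insults), which is consistent with the limitation the abstract reports for dictionary-based detection.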
Source journal
CiteScore: 21.10
Self-citation rate: 1.80%
Articles published: 9
About the journal: Communication Methods and Measures aims to achieve several goals in the field of communication research. Firstly, it aims to bring attention to and showcase developments in both qualitative and quantitative research methodologies to communication scholars. This journal serves as a platform for researchers across the field to discuss and disseminate methodological tools and approaches. Additionally, Communication Methods and Measures seeks to improve research design and analysis practices by offering suggestions for improvement. It aims to introduce new methods of measurement that are valuable to communication scientists or enhance existing methods. The journal encourages submissions that focus on methods for enhancing research design and theory testing, employing both quantitative and qualitative approaches. Furthermore, the journal is open to articles devoted to exploring the epistemological aspects relevant to communication research methodologies. It welcomes well-written manuscripts that demonstrate the use of methods and articles that highlight the advantages of lesser-known or newer methods over those traditionally used in communication. In summary, Communication Methods and Measures strives to advance the field of communication research by showcasing and discussing innovative methodologies, improving research practices, and introducing new measurement methods.
Latest articles in this journal
JST and rJST: joint estimation of sentiment and topics in textual data using a semi-supervised approach
Using State Space Grids to Quantify and Examine Dynamics of Dyadic Conversation
Bootstrapping public entities. Domain-specific NER for public speakers
On Measurement Validity and Language Models: Increasing Validity and Decreasing Bias with Instructions
Googling Politics? Comparing Five Computational Methods to Identify Political and News-related Searches from Web Browser Histories