A Quadrilogy for (Big) Data Reliabilities
Author: K. Krippendorff
Journal: Communication Methods and Measures, vol. 15, pp. 165-189 (JCR Q1, Communication; impact factor 6.3)
Published: 2021-07-03
DOI: https://doi.org/10.1080/19312458.2020.1861592
Citation count: 2
Abstract
This paper responds to the challenge of testing the reliabilities of really big data and proposes a quadrilogy of four measures of the reliability of data, applicable quite generally. These measures grew out of the recognition that crowd-coded data contest big data scientists’ conviction that the social contexts and meanings of data become irrelevant in the face of their sheer volumes. Bigness has also challenged available inter-coder agreement coefficients and available software, which either are too restricted regarding the forms of data they accept or exceed computational limits when data become very large. In the course of tailoring Krippendorff’s alpha to very large data, the possibility emerged of dividing the concept of reliability into four separate kinds, serving different methodological aims in social research. They respectively assess the replicability of the process of generating data, the accuracy of generating data, the surrogacy of proposed theories, coders, formulas, or algorithms to serve as a substitute for human coders, and the decisiveness among several human judgements. Their mathematical relationships assure comparability. The paper develops this quadrilogy of agreement measures first for binary data, provides a link to software for computing it, and then extends it to nominal data, a first step towards further generalizations. It also proposes a computational path to estimate the confidence limits for each of these measures and the probabilities of accepting data as reliable when there is a chance of being below a tolerable level. It ends with a discussion of how to select reliability benchmarks appropriate for the quadrilogy of agreement measures.
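For orientation, the coefficient the abstract starts from, Krippendorff's alpha, can be sketched in its standard nominal-data form, alpha = 1 - D_o / D_e (observed over expected disagreement). This is the general textbook coefficient only. It is not any of the paper's four measures, nor its big-data implementation. The function name `krippendorff_alpha_nominal` and the list-of-lists input layout are assumptions made for illustration.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Standard nominal Krippendorff's alpha.

    units: list of lists; each inner list holds the labels that different
    coders assigned to one unit. None marks a missing code.
    (Hypothetical input layout, chosen for this sketch.)
    """
    # Coincidence matrix: each ordered pair of codes within a unit
    # contributes 1 / (m - 1), where m is the number of codes in that unit.
    coincidences = Counter()
    for labels in units:
        values = [v for v in labels if v is not None]
        m = len(values)
        if m < 2:
            continue  # a unit with fewer than two codes is not pairable
        for a, b in permutations(values, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    # Marginal totals n_c and grand total n of pairable codes.
    totals = Counter()
    for (a, _b), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    if n <= 1:
        raise ValueError("need at least one unit with two or more codes")

    # Disagreement counts only the off-diagonal cells (a != b).
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    if expected == 0:
        raise ValueError("alpha is undefined when all codes are identical")

    return 1.0 - observed / expected


# Two coders, ten units, binary codes.
coder_a = [0, 1, 0, 0, 0, 0, 0, 0, 1, 0]
coder_b = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0]
alpha = krippendorff_alpha_nominal([list(pair) for pair in zip(coder_a, coder_b)])
print(round(alpha, 3))
```

This sketch builds the coincidence matrix in memory one unit at a time, which is enough for small examples. The computational limits the abstract mentions for very large data are not addressed here.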
About the journal:
Communication Methods and Measures pursues several goals in the field of communication research. First, it brings developments in both qualitative and quantitative research methodologies to the attention of communication scholars. The journal serves as a platform for researchers across the field to discuss and disseminate methodological tools and approaches.
Additionally, Communication Methods and Measures seeks to improve research design and analysis practices by offering suggestions for improvement. It aims to introduce new methods of measurement that are valuable to communication scientists or enhance existing methods. The journal encourages submissions that focus on methods for enhancing research design and theory testing, employing both quantitative and qualitative approaches.
Furthermore, the journal is open to articles devoted to exploring the epistemological aspects relevant to communication research methodologies. It welcomes well-written manuscripts that demonstrate the use of methods and articles that highlight the advantages of lesser-known or newer methods over those traditionally used in communication.
In summary, Communication Methods and Measures strives to advance the field of communication research by showcasing and discussing innovative methodologies, improving research practices, and introducing new measurement methods.