Investigating Author Research Relatedness through Crowdsourcing: A Replication Study on MTurk

António Correia, Dennis Paulino, H. Paredes, D. Guimaraes, D. Schneider, Benjamim Fonseca

DOI: 10.1109/CSCWD57460.2023.10152707 (https://doi.org/10.1109/CSCWD57460.2023.10152707)
Journal: Computer Supported Cooperative Work-The Journal of Collaborative Computing, vol. 26, no. 3, pp. 77-82
Published: 2023-05-24 (journal article)
JCR: Q3, Computer Science, Interdisciplinary Applications; impact factor 2.0
Citations: 0
Abstract
Determining the relatedness of publications by detecting similarities and connections between researchers and their outputs can help science stakeholders worldwide find areas of common interest and potential collaboration. To this end, many studies have explored authorship attribution and research similarity detection using automatic approaches. Nonetheless, inferring author research relatedness from imperfect data containing errors and multiple references to the same entities is a long-standing challenge. In a previous study, we conducted an experiment in which a homogeneous crowd of volunteers contributed to a set of author name disambiguation tasks. The results showed an overall accuracy above 75%, and we also found important effects tied to the confidence level participants reported for correct answers. However, that study left many open questions about the comparative accuracy of a large, heterogeneous crowd working for monetary rewards. This paper addresses some of these unanswered questions by repeating the experiment with a crowd of 140 paid online workers recruited via the MTurk microtask crowdsourcing platform. Our replication study shows high accuracy for name disambiguation tasks based on authorship-level information and content features. These findings carry additional informative value because they also examine crowd behavior in terms of task duration and mean proportion of clicks per worker, with implications for interface and interaction design.
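As a rough illustration of how answers from a crowd of workers on such disambiguation tasks might be aggregated and scored against a gold standard, the sketch below computes a per-task majority vote (plain and confidence-weighted) and overall accuracy. This is not the authors' actual pipeline; the response schema, field names, and the confidence-weighting scheme are illustrative assumptions only.

```python
from collections import Counter, defaultdict

# Hypothetical worker responses: (worker_id, task_id, answer, self-reported confidence 1-5).
# The schema is an assumption for illustration, not the paper's data format.
responses = [
    ("w1", "t1", "same_author", 5),
    ("w2", "t1", "same_author", 3),
    ("w3", "t1", "different_author", 2),
    ("w1", "t2", "different_author", 4),
    ("w2", "t2", "different_author", 5),
    ("w3", "t2", "same_author", 1),
]

# Gold labels for each name disambiguation task (also illustrative).
gold = {"t1": "same_author", "t2": "different_author"}

def aggregate(responses, weighted=False):
    """Majority vote per task; optionally weight each vote by reported confidence."""
    votes = defaultdict(Counter)
    for _, task, answer, conf in responses:
        votes[task][answer] += conf if weighted else 1
    return {task: counter.most_common(1)[0][0] for task, counter in votes.items()}

def accuracy(predictions, gold):
    """Fraction of tasks where the aggregated answer matches the gold label."""
    return sum(predictions[t] == gold[t] for t in gold) / len(gold)

plain = aggregate(responses)
weighted = aggregate(responses, weighted=True)
print(f"Majority-vote accuracy: {accuracy(plain, gold):.2f}")
print(f"Confidence-weighted accuracy: {accuracy(weighted, gold):.2f}")
```

Weighting votes by self-reported confidence is one simple way to operationalize the confidence effects the abstract mentions; whether it helps depends on how well workers calibrate their own confidence.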
About the journal:
Computer Supported Cooperative Work (CSCW): The Journal of Collaborative Computing and Work Practices is devoted to innovative research in computer-supported cooperative work (CSCW). It provides an interdisciplinary and international forum for the debate and exchange of ideas concerning theoretical, practical, technical, and social issues in CSCW.
The CSCW Journal arose in response to the growing interest in the design, implementation, and use of technical systems (including computing, information, and communications technologies) that support people working cooperatively, and its scope continues to encompass the multifarious aspects of research within CSCW and related areas.
The CSCW Journal focuses on research oriented towards the development of collaborative computing technologies on the basis of studies of actual cooperative work practices (where ‘work’ is used in the wider sense). That is, it particularly welcomes submissions that (a) report findings from ethnographic or similar in-depth fieldwork on work practices with a view to their technological implications, (b) report empirical evaluations of the use of extant or novel technical solutions under real-world conditions, and/or (c) develop technical or conceptual frameworks for practice-oriented computing research based on previous fieldwork and evaluations.