Supporting Requesters in Writing Clear Crowdsourcing Task Descriptions Through Computational Flaw Assessment

Proceedings of the 28th International Conference on Intelligent User Interfaces. Pub Date: 2023-03-27. DOI: 10.1145/3581641.3584039

Zahra Nouri, N. Prakash, U. Gadiraju, Henning Wachsmuth


Citations: 0



Supporting Requesters in Writing Clear Crowdsourcing Task Descriptions Through Computational Flaw Assessment

Quality control is an, if not the, essential challenge in crowdsourcing. Unsatisfactory responses from crowd workers have been found to result particularly from ambiguous and incomplete task descriptions, often written by inexperienced task requesters. However, creating clear task descriptions with sufficient information is a complex process for requesters in crowdsourcing marketplaces. In this paper, we investigate the extent to which requesters can be supported effectively in this process through computational techniques. To this end, we developed a tool that enables requesters to iteratively identify and correct eight common clarity flaws in their task descriptions before deployment on the platform. The tool can be used to write task descriptions from scratch or to assess and improve the clarity of prepared descriptions. It employs machine learning-based natural language processing models, trained on real-world task descriptions, that score a given task description for the eight clarity flaws. On this basis, the requester can iteratively revise and reassess the task description until it reaches a sufficient level of clarity. In a first user study, we let requesters create task descriptions using the tool and then rate different aspects of the tool’s helpfulness. We then carried out a second user study with crowd workers, who are the ones confronted with such descriptions in practice, to rate the clarity of the created task descriptions. According to our results, 65% of the requesters rated the helpfulness of the information provided by the tool as high or very high (only 12% as low or very low). The requesters saw some room for improvement though, for example, concerning the display of bad examples. Nevertheless, 76% of the crowd workers believe that the overall clarity of the task descriptions created by the requesters using the tool improves over the initial version. In line with this, the automatically computed clarity scores of the edited task descriptions were generally higher than those of the initial descriptions, indicating that the tool reliably predicts the clarity of task descriptions in overall terms.
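The assess-and-revise cycle described in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the abstract does not name the eight clarity flaws or describe the models, so the two flaw names (`too_short`, `no_example`), the keyword heuristics standing in for the trained NLP models, the 0.5 threshold, and the `revise` callback are all hypothetical and not the authors' implementation.

```python
from typing import Callable, Dict, List

# A flaw scorer maps a task description to a score in [0, 1];
# higher means the flaw is more likely present. The paper uses trained
# NLP models; the two heuristics below are hypothetical placeholders.
FlawScorer = Callable[[str], float]


def too_short(text: str) -> float:
    """Placeholder flaw: the description has very few words."""
    return 1.0 if len(text.split()) < 10 else 0.0


def no_example(text: str) -> float:
    """Placeholder flaw: the description never mentions an example."""
    return 0.0 if "example" in text.lower() else 1.0


SCORERS: Dict[str, FlawScorer] = {
    "too_short": too_short,
    "no_example": no_example,
}


def assess(text: str, scorers: Dict[str, FlawScorer] = SCORERS) -> Dict[str, float]:
    """Score one description for every flaw."""
    return {name: scorer(text) for name, scorer in scorers.items()}


def flagged(scores: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Return the sorted names of flaws whose score reaches the threshold."""
    return sorted(name for name, score in scores.items() if score >= threshold)


def revise_until_clear(
    text: str,
    revise: Callable[[str, List[str]], str],
    threshold: float = 0.5,
    max_rounds: int = 5,
) -> str:
    """Alternate assessment and revision until no flaw is flagged."""
    for _ in range(max_rounds):
        flaws = flagged(assess(text), threshold)
        if not flaws:
            break
        # In the tool, the requester performs this revision step by hand.
        text = revise(text, flaws)
    return text
```

In this sketch the `revise` callback stands in for the human requester, who in the described tool reads the flaw scores and edits the description before reassessing it. The `max_rounds` cap is only a safeguard for the automated stand-in.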

