Scheduling Human Intelligence Tasks in Multi-Tenant Crowd-Powered Systems

D. Difallah, Gianluca Demartini, P. Cudré-Mauroux
{"title":"Scheduling Human Intelligence Tasks in Multi-Tenant Crowd-Powered Systems","authors":"D. Difallah, Gianluca Demartini, P. Cudré-Mauroux","doi":"10.1145/2872427.2883030","DOIUrl":null,"url":null,"abstract":"Micro-task crowdsourcing has become a popular approach to effectively tackle complex data management problems such as data linkage, missing values, or schema matching. However, the backend crowdsourced operators of crowd-powered systems typically yield higher latencies than the machine-processable operators, this is mainly due to inherent efficiency differences between humans and machines. This problem can be further exacerbated by the lack of workers on the target crowdsourcing platform, or when the workers are shared unequally among a number of competing requesters; including the concurrent users from the same organization who execute crowdsourced queries with different types, priorities and prices. Under such conditions, a crowd-powered system acts mostly as a proxy to the crowdsourcing platform, and hence it is very difficult to provide effiency guarantees to its end-users. Scheduling is the traditional way of tackling such problems in computer science, by prioritizing access to shared resources. In this paper, we propose a new crowdsourcing system architecture that leverages scheduling algorithms to optimize task execution in a shared resources environment, in this case a crowdsourcing platform. Our study aims at assessing the efficiency of the crowd in settings where multiple types of tasks are run concurrently. We present extensive experimental results comparing i) different multi-tenant crowdsourcing jobs, including a workload derived from real traces, and ii) different scheduling techniques tested with real crowd workers. Our experimental results show that task scheduling can be leveraged to achieve fairness and reduce query latency in multi-tenant crowd-powered systems, although with very different tradeoffs compared to traditional settings not including human factors.","PeriodicalId":20455,"journal":{"name":"Proceedings of the 25th International Conference on World Wide Web","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2016-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"36","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 25th International Conference on World Wide Web","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2872427.2883030","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 36

Abstract

Micro-task crowdsourcing has become a popular approach to effectively tackle complex data management problems such as data linkage, missing values, or schema matching. However, the backend crowdsourced operators of crowd-powered systems typically yield higher latencies than the machine-processable operators, this is mainly due to inherent efficiency differences between humans and machines. This problem can be further exacerbated by the lack of workers on the target crowdsourcing platform, or when the workers are shared unequally among a number of competing requesters; including the concurrent users from the same organization who execute crowdsourced queries with different types, priorities and prices. Under such conditions, a crowd-powered system acts mostly as a proxy to the crowdsourcing platform, and hence it is very difficult to provide effiency guarantees to its end-users. Scheduling is the traditional way of tackling such problems in computer science, by prioritizing access to shared resources. In this paper, we propose a new crowdsourcing system architecture that leverages scheduling algorithms to optimize task execution in a shared resources environment, in this case a crowdsourcing platform. Our study aims at assessing the efficiency of the crowd in settings where multiple types of tasks are run concurrently. We present extensive experimental results comparing i) different multi-tenant crowdsourcing jobs, including a workload derived from real traces, and ii) different scheduling techniques tested with real crowd workers. Our experimental results show that task scheduling can be leveraged to achieve fairness and reduce query latency in multi-tenant crowd-powered systems, although with very different tradeoffs compared to traditional settings not including human factors.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
在多租户群体动力系统中调度人类智能任务
微任务众包已经成为一种流行的方法,可以有效地解决复杂的数据管理问题,如数据链接、缺失值或模式匹配。然而,群体驱动系统的后端众包操作通常比机器可处理的操作产生更高的延迟,这主要是由于人类和机器之间固有的效率差异。如果目标众包平台上缺少工作人员,或者工作人员在多个相互竞争的请求者之间分配不均,则会进一步加剧这一问题;包括来自同一组织的并发用户,他们执行不同类型、优先级和价格的众包查询。在这种情况下,众包系统在很大程度上充当了众包平台的代理,很难为最终用户提供效率保障。在计算机科学中,调度是解决这类问题的传统方法,通过优先访问共享资源。在本文中,我们提出了一种新的众包系统架构,该架构利用调度算法在共享资源环境中优化任务执行,在这种情况下是一个众包平台。我们的研究旨在评估在多种任务同时运行的情况下人群的效率。我们提供了广泛的实验结果,比较i)不同的多租户众包工作,包括来自真实轨迹的工作量,以及ii)用真实人群工作人员测试的不同调度技术。我们的实验结果表明,在多租户人群驱动的系统中,可以利用任务调度来实现公平性并减少查询延迟,尽管与不包括人为因素的传统设置相比,需要进行非常不同的权衡。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
MapWatch: Detecting and Monitoring International Border Personalization on Online Maps Automatic Discovery of Attribute Synonyms Using Query Logs and Table Corpora Learning Global Term Weights for Content-based Recommender Systems From Freebase to Wikidata: The Great Migration GoCAD: GPU-Assisted Online Content-Adaptive Display Power Saving for Mobile Devices in Internet Streaming
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1