Measurement and Modeling of Computer Systems: Latest Articles


The design space of probing algorithms for network-performance measurement

Measurement and Modeling of Computer Systems

Pub Date : 2013-06-17 DOI: 10.1145/2465529.2465765

A. D. Jaggard, Swara Kopparty, V. Ramachandran, R. Wright

We present a framework for the design and analysis of probing methods to monitor network performance, an important technique for collecting measurements in tasks such as fault detection. We use this framework to study the interaction among numerous, possibly conflicting, optimization goals in the design of a probing algorithm. We present a rigorous definition of a probing-algorithm design problem that can apply broadly to network-measurement scenarios. We also present several metrics relevant to the analysis of probing algorithms, including probing frequency and network coverage, communication and computational overhead, and the amount of algorithm state required. We show inherent tradeoffs among optimization goals and give hardness results for achieving some combinations of optimization goals. We also consider the possibility of developing approximation algorithms for achieving some of the goals and describe a randomized approach as an alternative, evaluating it using our framework. Our work aids future development of low-overhead probing techniques and introduces principles from IP-based networking to theoretically grounded approaches for concurrent path-selection problems.


Citations: 20

Elastic paging

Measurement and Modeling of Computer Systems

Pub Date : 2013-06-17 DOI: 10.1145/2465529.2479781

Enoch Peserico

We study a generalization of the classic paging problem where memory capacity can vary over time - a property of many modern computing realities, from cloud computing to multi-core and energy-optimized processors. We show that good performance in the "classic" case provides no performance guarantees when memory capacity fluctuates: roughly speaking, moving from static to dynamic capacity can mean the difference between optimality within a factor 2 in space, time and energy, and suboptimality by an arbitrarily large factor. Surprisingly, several classic paging algorithms still perform remarkably well, maintaining that factor 2 optimality even if faced with adversarial capacity fluctuations - without taking those fluctuations into explicit account!
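The setting described above can be made concrete with a small simulation. The following is a minimal sketch, not the paper's model: the function name `lru_faults`, the per-request capacity schedule, and the rule of evicting least recently used pages whenever capacity shrinks are all illustrative assumptions.

```python
from collections import OrderedDict


def lru_faults(requests, capacities):
    """Count LRU page faults when the capacity at step t is capacities[t].

    Assumption (not from the paper): when capacity shrinks, the least
    recently used pages are evicted until the cache fits again.
    """
    if len(requests) != len(capacities):
        raise ValueError("need exactly one capacity value per request")
    cache = OrderedDict()  # keys ordered from least to most recently used
    faults = 0
    for page, cap in zip(requests, capacities):
        if cap < 1:
            raise ValueError("capacity must be at least 1")
        if page in cache:
            cache.move_to_end(page)  # hit: mark as most recently used
        else:
            faults += 1              # miss: bring the page in
            cache[page] = True
        while len(cache) > cap:      # enforce the current capacity
            cache.popitem(last=False)
    return faults
```

Running the same request sequence against a static and a shrinking capacity schedule shows how the fault count depends on the schedule, which is the quantity an analysis of this kind of problem would compare against an optimal offline policy.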


Citations: 7

Delays and mixing times in random-access networks

Measurement and Modeling of Computer Systems

Pub Date : 2013-06-17 DOI: 10.1145/2465529.2465759

N. Bouman, S. Borst, J. V. Leeuwaarden

We explore the achievable delay performance in wireless random-access networks. While relatively simple and inherently distributed in nature, suitably designed backlog-based random-access schemes provide the striking capability to match the optimal throughput performance of centralized scheduling mechanisms. The specific type of activation rules for which throughput optimality has been established may, however, yield excessive backlogs and delays. Motivated by that issue, we examine whether the poor delay performance is inherent to the basic operation of these schemes or caused by the specific kind of activation rules. We derive delay lower bounds for backlog-based activation rules, which offer fundamental insight into the cause of the excessive delays. For fixed activation rates, we obtain lower bounds indicating that delays and mixing times can also grow dramatically with the load in certain topologies.


Citations: 14

Multipath TCP algorithms: theory and design

Measurement and Modeling of Computer Systems

Pub Date : 2013-06-17 DOI: 10.1145/2465529.2466585

Qiuyu Peng, A. Elwalid, S. Low

Multi-path TCP (MP-TCP) has the potential to greatly improve application performance by using multiple paths transparently. We propose a fluid model for a large class of MP-TCP algorithms and identify design criteria that guarantee the existence, uniqueness, and stability of system equilibrium. We characterize algorithm parameters for TCP-friendliness and prove an inevitable tradeoff between responsiveness and friendliness. We discuss the implications of these properties on the behavior of existing algorithms and motivate a new design that generalizes existing algorithms. We use ns2 simulations to evaluate the proposed algorithm and illustrate its superior overall performance.


Citations: 59

Accelerating GPGPU architecture simulation

Measurement and Modeling of Computer Systems

Pub Date : 2013-06-17 DOI: 10.1145/2465529.2465540

Zhibin Yu, L. Eeckhout, Nilanjan Goswami, Tao Li, L. John, Hai Jin, Chengzhong Xu

Recently, graphics processing units (GPUs) have opened up new opportunities for speeding up general-purpose parallel applications due to their massive computational power and up to hundreds of thousands of threads enabled by programming models such as CUDA. However, due to the serial nature of existing micro-architecture simulators, these massively parallel architectures and workloads need to be simulated sequentially. As a result, simulating GPGPU architectures with typical benchmarks and input data sets is extremely time-consuming. This paper addresses the GPGPU architecture simulation challenge by generating miniature, yet representative GPGPU kernels. We first summarize the static characteristics of an existing GPGPU kernel in a profile, and analyze its dynamic behavior using the novel concept of the divergence flow statistics graph (DFSG). We subsequently use a GPGPU kernel synthesizing framework to generate a miniature proxy of the original kernel, which can reduce simulation time significantly. The key idea is to reduce the number of simulated instructions by decreasing per-thread iteration counts of loops. Our experimental results show that our approach can accelerate GPGPU architecture simulation by a factor of 88 on average and up to 589, with an average IPC relative error of 5.6%.


Citations: 13

Characterizing the impact of process variation on write endurance enhancing techniques for non-volatile memory systems

Measurement and Modeling of Computer Systems

Pub Date : 2013-06-17 DOI: 10.1145/2465529.2465755

M. Cintra, Niklas Linkewitsch

Much attention has been given recently to a set of promising non-volatile memory technologies, such as PCM, STT-MRAM, and ReRAM. These, however, have limited endurance relative to DRAM. Potential solutions to this endurance challenge exist in the form of fine-grain wear leveling techniques and aggressive error tolerance approaches. While the existing approaches to wear leveling and error tolerance are sound and demonstrate true potential, their studies have been limited in that i) they have not considered the interactions between wear leveling and error tolerance and ii) they have assumed a simple write endurance failure model where all cells fail uniformly. In this paper we perform a thorough study and characterize such interactions and the effects of more realistic non-uniform endurance models under various workloads, both synthetic and derived from benchmarks. This study shows that, for instance, variability in the endurance of cells significantly affects wear leveling and error tolerance mechanisms and the values of their tuning parameters. It also shows that these mechanisms interact in subtle ways, sometimes cancelling and sometimes boosting each other's impact on overall endurance of the device.


Citations: 14

FaRNet: fast recognition of high multi-dimensional network traffic patterns

Measurement and Modeling of Computer Systems

Pub Date : 2013-06-17 DOI: 10.1145/2465529.2465743

Ignasi Paredes-Oliva, P. Barlet-Ros, X. Dimitropoulos

Extracting knowledge from big network traffic data is a matter of foremost importance for multiple purposes ranging from trend analysis or network troubleshooting to capacity planning or traffic classification. An extremely useful approach to profile traffic is to extract and display to a network administrator the multi-dimensional hierarchical heavy hitters (HHHs) of a dataset. However, existing schemes for computing HHHs have several limitations: 1) they require significant computational overhead; 2) they do not scale to high-dimensional data; and 3) they are not easily extensible. In this paper, we introduce a fundamentally new approach for extracting HHHs based on generalized frequent item-set mining (FIM), which allows traffic data to be processed much more efficiently and scales to much higher-dimensional data than present schemes. Based on generalized FIM, we build and evaluate a traffic profiling system we call FaRNet. Our comparison with AutoFocus, the most closely related tool of a similar nature, shows that FaRNet is up to three orders of magnitude faster.
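For readers unfamiliar with hierarchical heavy hitters, the following minimal sketch illustrates the concept in a single dimension (IPv4 source prefixes at byte granularity). It uses the common bottom-up definition, in which traffic already attributed to a more specific heavy hitter is not counted again for its ancestors. It is not the FIM-based method of the paper, and the function name, input format, and wildcard notation are illustrative assumptions.

```python
from collections import Counter


def hierarchical_heavy_hitters(packets, threshold):
    """Return {prefix: residual volume} for one-dimensional HHHs.

    packets: iterable of (dotted IPv4 string, byte count) pairs.
    A prefix is reported when its volume, excluding traffic already
    reported under a more specific prefix, reaches `threshold`.
    """
    level = Counter()  # residual volume per node at the current depth
    for ip, size in packets:
        level[tuple(ip.split("."))] += size

    result = {}
    for depth in range(4, -1, -1):      # /32, /24, /16, /8, /0
        parent = Counter()
        for prefix, volume in level.items():
            if volume >= threshold:
                label = ".".join(prefix + ("*",) * (4 - depth))
                result[label] = volume  # reported: do not roll up
            elif depth > 0:
                parent[prefix[:-1]] += volume  # roll up the residual
        level = parent
    return result
```

For example, with a threshold of 50 bytes, a host sending 60 bytes is reported on its own, while two lighter hosts in sibling subnets (30 bytes each) are reported only through their shared /16 prefix.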


Citations: 3

Delivering fairness and priority enforcement on asymmetric multicore systems via OS scheduling

Measurement and Modeling of Computer Systems

Pub Date : 2013-06-17 DOI: 10.1145/2465529.2465532

J. C. Saez, Fernando Castro, D. Chaver, M. Prieto

Symmetric-ISA (instruction set architecture) asymmetric-performance multicore processors (AMPs) were shown to deliver higher performance per watt and area than symmetric CMPs for applications with diverse architectural requirements. It is therefore likely that future multicore processors will combine big, power-hungry fast cores with small, low-power slow ones. In this paper, we propose a novel thread scheduling algorithm that aims to improve the throughput-fairness trade-off on AMP systems. Our experimental evaluation, on real hardware and using scheduler implementations in a general-purpose operating system, reveals that our proposal delivers a better throughput-fairness trade-off than previous schedulers for a wide variety of multi-application workloads including single-threaded and multithreaded applications.


Citations: 6

Detecting user dissatisfaction and understanding the underlying reasons

Measurement and Modeling of Computer Systems

Pub Date : 2013-06-17 DOI: 10.1145/2465529.2465538

Å. Arvidsson, Y. Zhang

Quantifying quality of experience (QoE) for network applications is challenging as it is a subjective metric with multiple dimensions such as user expectation, satisfaction, and overall experience. Today, despite various techniques to support differentiated Quality of Service (QoS), operators still lack automated methods to translate QoS to QoE, especially for general web applications. In this work, we take the approach of identifying unsatisfactory performance by searching passive monitoring data for user-initiated early terminations of web transactions. However, early terminations by users can be caused by other factors such as loss of interest. Therefore, naively using them to represent user dissatisfaction will result in a large number of false positives. In this paper, we propose a systematic method for inferring user dissatisfaction from the set of early-termination behaviors observed in traffic traces. We conduct a comprehensive analysis of user acceptance of throughput and response time, and compare the results with the traditional MOS metric. Then we present the characteristics of early cancellation along dimensions such as the types of URLs and objects. We evaluate our approach on four data sets collected in both a wireline network and a wireless cellular network.


Citations: 1

Efficient crowdsourcing for multi-class labeling

Measurement and Modeling of Computer Systems

Pub Date : 2013-06-01 DOI: 10.1145/2465529.2465761

David R Karger, Sewoong Oh, D. Shah

Crowdsourcing systems like Amazon's Mechanical Turk have emerged as an effective large-scale human-powered platform for performing tasks in domains such as image classification, data entry, recommendation, and proofreading. Since workers are low-paid (a few cents per task) and the tasks performed are monotonous, the answers obtained are noisy and hence unreliable. To obtain reliable estimates, it is essential to utilize appropriate inference algorithms (e.g. majority voting) coupled with structured redundancy through task assignment. Our goal is to obtain the best possible trade-off between reliability and redundancy. In this paper, we consider a general probabilistic model for noisy observations in crowdsourcing systems and pose the problem of minimizing the total price (i.e. redundancy) that must be paid to achieve a target overall reliability. Concretely, we show that it is possible to obtain an answer to each task correctly with probability 1-ε as long as the redundancy per task is O((K/q) log (K/ε)), where each task can have any of K distinct answers with equal probability and q is the crowd-quality parameter defined through a probabilistic model. Further, this is effectively the best possible redundancy-accuracy trade-off any system design can achieve. Such a crisp single-parameter characterization of the (order-)optimal trade-off between redundancy and reliability has various useful operational consequences. Further, we analyze the robustness of our approach in the presence of adversarial workers and provide a bound on their influence on the redundancy-accuracy trade-off. Unlike recent prior work [GKM11, KOS11, KOS11], our result applies to non-binary (i.e. K>2) tasks. In effect, we utilize algorithms for binary tasks (with an inhomogeneous error model unlike that in [GKM11, KOS11, KOS11]) as a key subroutine to obtain answers for K-ary tasks. Technically, the algorithm is based on a low-rank approximation of the weighted adjacency matrix of a random regular bipartite graph, weighted according to the answers provided by the workers.
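The redundancy expression above can be explored numerically. The sketch below evaluates the scaling term (K/q) log(K/ε) and implements plain majority voting, the baseline inference rule the abstract mentions. It does not implement the paper's low-rank algorithm. The constant `c` hidden by the O(·) notation is unknown and is set to 1 purely for illustration, and both function names are hypothetical.

```python
import math
from collections import Counter


def redundancy_scale(K, q, eps, c=1.0):
    """Evaluate c * (K/q) * log(K/eps).

    c stands in for the constant hidden by the O(.) notation; its true
    value is not given in the abstract, so c=1.0 is only a placeholder.
    """
    if K < 2 or q <= 0 or not (0 < eps < 1):
        raise ValueError("expected K >= 2, q > 0 and 0 < eps < 1")
    return c * (K / q) * math.log(K / eps)


def majority_vote(answers):
    """Return the most frequent label; ties go to the smallest label."""
    counts = Counter(answers)
    best = max(counts.values())
    return min(label for label, n in counts.items() if n == best)
```

Under these assumptions, halving the crowd quality q doubles the scaling term, while tightening the target error ε increases it only logarithmically.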


Citations: 165
