
Latest publications from CrowdMM '12

A closer look at photographers' intentions: a test dataset
Pub Date : 2012-10-29 DOI: 10.1145/2390803.2390811
M. Lux, M. Taschwer, Oge Marques
Taking a photo is a process typically triggered by an intention. Some people want to document the progress of a task, others just want to capture the moment to re-visit the situation later on. In this contribution we present a novel, openly available dataset with 1,309 photos and annotations specifying the intentions of the photographers, which were eventually validated using Amazon Mechanical Turk.
Citations: 10
Crowdsourced user interface testing for multimedia applications
Pub Date : 2012-10-29 DOI: 10.1145/2390803.2390813
Raynor Vliegendhart, E. Dolstra, J. Pouwelse
Conducting a conventional experiment to test an application's user interface in a lab environment is a costly and time-consuming process. In this paper, we show that it is feasible to carry out A/B tests for a multimedia application, involving hundreds of workers at low cost, through Amazon's crowdsourcing platform Mechanical Turk. We let workers test user interfaces within a remote virtual machine that is embedded within the HIT, and we show that the technical issues that arise in this approach can be overcome.
Citations: 12
Crowdsourcing micro-level multimedia annotations: the challenges of evaluation and interface
Pub Date : 2012-10-29 DOI: 10.1145/2390803.2390816
Sunghyun Park, Gelareh Mohammadi, Ron Artstein, Louis-Philippe Morency
This paper presents a new evaluation procedure and tool for crowdsourcing micro-level multimedia annotations and shows that such annotations can achieve a quality comparable to that of expert annotations. We propose a new evaluation procedure, called MM-Eval (Micro-level Multimedia Evaluation), which compares fine time-aligned annotations using Krippendorff's alpha metric and introduce two new metrics to evaluate the types of disagreement between coders. We also introduce OCTAB (Online Crowdsourcing Tool for Annotations of Behaviors), a web-based annotation tool that allows precise and convenient multimedia behavior annotations, directly from Amazon Mechanical Turk interface. With an experiment using the above tool and evaluation procedure, we show that a majority vote among annotations from 3 crowdsource workers leads to a quality comparable to that of local expert annotations.
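The agreement metric at the core of MM-Eval, Krippendorff's alpha, is straightforward to compute for nominal labels. The sketch below is a minimal illustrative implementation (the function name and data layout are our own, not from the paper); MM-Eval itself compares fine time-aligned annotations, which requires alignment steps not shown here.

```python
from collections import Counter

def krippendorff_alpha_nominal(units):
    """Nominal-data Krippendorff's alpha.

    units: list of lists; each inner list holds the labels assigned to
    one unit by the coders who rated it (missing codings omitted).
    """
    o = Counter()  # coincidence matrix: o[(c, k)] pairs value c with value k
    for labels in units:
        m = len(labels)
        if m < 2:
            continue  # a unit rated by fewer than two coders yields no pairs
        for i, c in enumerate(labels):
            for j, k in enumerate(labels):
                if i != j:
                    o[(c, k)] += 1.0 / (m - 1)
    n_c = Counter()  # marginal totals per label value
    for (c, _k), w in o.items():
        n_c[c] += w
    n = sum(n_c.values())
    observed_disagreement = sum(w for (c, k), w in o.items() if c != k)
    expected = sum(v * (n - v) for v in n_c.values())
    if expected == 0:
        return 1.0  # every pairable value identical: perfect agreement
    return 1.0 - (n - 1) * observed_disagreement / expected
```

For example, two coders who agree on three of four units score `krippendorff_alpha_nominal([[1, 1], [1, 1], [2, 2], [1, 2]])`, which is roughly 0.53.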
Citations: 33
Tagging tagged images: on the impact of existing annotations on image tagging
Pub Date : 2012-10-29 DOI: 10.1145/2390803.2390807
César Moltedo, H. Astudillo, Marcelo Mendoza
Crowdsourcing has been widely used to generate metadata for multimedia resources. By presenting partially described resources to human annotators, resources are tagged, yielding better descriptions. Although significant improvements in metadata quality have been reported, as yet there is no understanding of how taggers are biased by previously acquired resource tags. We hypothesize that the number of existing annotations, which we take here to reflect the tag completeness degree, influences taggers: rather empty descriptions (initial tagging stages) encourage creating more tags, but better tags are created for fuller descriptions (later tagging stages). We explore empirically the relationship between tag quality/quantity and completeness degree by conducting a study on a set of human crowdsourcing annotators over a collection of images with different completeness degrees. Experimental results show a significant relation between completeness and image tagging. To the best of our knowledge, this study is the first to explore the impact of existing annotations on image tagging.
Citations: 5
Ground truth generation in medical imaging: a crowdsourcing-based iterative approach
Pub Date : 2012-10-29 DOI: 10.1145/2390803.2390808
A. Foncubierta-Rodríguez, H. Müller
As in many other scientific domains where computer-based tools need to be evaluated, medical imaging often requires the expensive generation of manual ground truth. For some specific tasks, medical doctors are needed to guarantee high-quality, valid results, whereas other tasks, such as the image modality classification described in this text, can be performed at sufficiently high quality by general domain experts. Crowdsourcing has recently received much attention in many domains: volunteers perform so-called human intelligence tasks for often small amounts of money, reducing the cost of creating manually annotated data sets and ground truth for evaluation tasks. On the other hand, result quality is often questioned when unknown workers are employed. Controlling task quality remains one of the main challenges of crowdsourcing approaches, as the persons performing the tasks may be interested less in result quality than in their payment. Several crowdsourcing platforms, such as the Crowdflower platform we used, allow creating interfaces and sharing them with only a limited number of known persons. This text describes the interfaces developed and the quality obtained through manual annotation by several domain experts and one medical doctor. In particular, the feedback loop of semi-automatic tools is explained: the results of an initial crowdsourcing round classifying medical images into a set of image categories were manually checked by domain experts and then used to train an automatic system that visually classified these images. The automatic classification results were in turn manually confirmed or rejected, reducing the time needed for the initial tasks. Crowdsourcing platforms allow creating a large variety of judgement interfaces. Whether used among known experts or with paid unknown workers, they increase the speed of ground truth creation and limit the amount of money to be paid.
Citations: 72
Enhancing online 3D products through crowdsourcing
Pub Date : 2012-10-29 DOI: 10.1145/2390803.2390820
Thi Phuong Nghiem, A. Carlier, Géraldine Morin, V. Charvillat
In this paper, we propose to build semantic links between a product's textual description and its corresponding 3D visualization. These links help gather knowledge about a product and ease browsing its 3D model. Our goal is to support the common behavior that, when reading textual information about a product, users naturally imagine what it looks like in real life. We generate the association between a textual description and a 3D feature from crowdsourcing. A user study of 82 people assesses the usefulness of the association for subsequent users, both for correctness and efficiency. Users are asked to identify features on 3D models; from the traces, associations leading to recommended views are derived. This information (a recommended view) is proposed to subsequent users performing the same task. Whereas the associations could simply be given by an expert, crowdsourcing offers advantages: we have inexpensive experts in the crowd as well as natural access to users' (e.g., customers') preferences and opinions.
Citations: 1
Crowdsourcing user interactions within web video through pulse modeling
Pub Date : 2012-10-29 DOI: 10.1145/2390803.2390812
M. Avlonitis, K. Chorianopoulos, David A. Shamma
Semantic video research has employed crowdsourcing techniques on social web video data sets such as comments, tags, and annotations, but these data sets require an extra effort on behalf of the user. We propose a pulse modeling method, which analyzes implicit user interactions within web video, such as rewind. In particular, we have modeled the user information-seeking behavior as a time series and the semantic regions as discrete pulses of fixed width. We constructed these pulses from user interactions with a documentary video that has a very rich visual style, with many cuts and camera angles/frames for the same scene. Next, we calculated the correlation coefficient between user pulses dynamically detected at the local maxima and the reference pulse. We have found that when people are actively seeking information in a video, their activity (these pulses) significantly matches the semantics of the video. This pulse analysis method complements previous work in content-based information retrieval and provides an additional user-based dimension for modeling the semantics of a web video.
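The pulse idea can be sketched in a few lines of NumPy: rewind events become fixed-width rectangular pulses, and the Pearson correlation coefficient against a reference pulse scores the match. All numbers below (event times, widths, durations) are invented for illustration; the paper's detection of pulses at local maxima of aggregated activity is more involved.

```python
import numpy as np

def pulse_signal(event_times, duration, width):
    """Place a fixed-width rectangular pulse (value 1) at each event time."""
    s = np.zeros(duration)
    half = width // 2
    for t in event_times:
        s[max(0, t - half): t + half + 1] = 1.0
    return s

# Hypothetical data: seconds at which users hit rewind in a 120 s clip,
# and a reference pulse marking one annotated semantic region.
duration = 120
user_pulse = pulse_signal([40, 42, 45], duration, width=10)
reference = pulse_signal([43], duration, width=10)

# Pearson correlation coefficient between the two pulse trains.
r = np.corrcoef(user_pulse, reference)[0, 1]
```

Overlapping pulses here yield a high correlation; activity far from the annotated region would drive it toward zero.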
Citations: 1
Pushing the limits of mechanical turk: qualifying the crowd for video geo-location
Pub Date : 2012-10-29 DOI: 10.1145/2390803.2390815
L. Gottlieb, Jaeyoung Choi, P. Kelm, T. Sikora, G. Friedland
In this article we review the methods we have developed for finding Mechanical Turk participants for the manual annotation of the geo-location of random videos from the web. We require high quality annotations for this project, as we are attempting to establish a human baseline for future comparison to machine systems. This task is different from a standard Mechanical Turk task in that it is difficult for both humans and machines, whereas a standard Mechanical Turk task is usually easy for humans and difficult or impossible for machines. This article discusses the varied difficulties we encountered while qualifying annotators and the steps that we took to select the individuals most likely to do well at our annotation task in the future.
Citations: 16
PodCastle and songle: crowdsourcing-based web services for spoken content retrieval and active music listening
Pub Date : 2012-10-29 DOI: 10.1145/2390803.2390805
Masataka Goto, J. Ogata, Kazuyoshi Yoshii, Hiromasa Fujihara, Matthias Mauch, Tomoyasu Nakano
In this keynote talk, we describe two crowdsourcing-based web services, PodCastle (http://en.podcastle.jp for the English version and http://podcastle.jp for the Japanese version) and Songle (http://songle.jp). PodCastle and Songle collect voluntary contributions by anonymous users in order to improve the experiences of users listening to speech and music content available on the web. These services use automatic speech-recognition and music-understanding technologies to provide content analysis results, such as full-text speech transcriptions and music scene descriptions, that let users enjoy content-based multimedia retrieval and active browsing of speech and music signals without relying on metadata. When automatic content analysis is used, however, errors are inevitable. PodCastle and Songle therefore provide an efficient error correction interface that lets users easily correct errors by selecting from a list of candidate alternatives. Through these corrections, users gain a real sense of contributing for their own benefit and that of others, and can be further motivated to contribute by seeing corrections made by other users. Our services promote the popularization and use of speech-recognition and music-understanding technologies by raising user awareness. Users can grasp the nature of these technologies simply by seeing the results obtained when they are applied to speech data and songs available on the web.
Citations: 4
Tag suggestion on youtube by personalizing content-based auto-annotation
Pub Date : 2012-10-29 DOI: 10.1145/2390803.2390819
Dominik Henter, Damian Borth, A. Ulges
We address the challenge of tag recommendation for web video clips on portals such as YouTube. In a quantitative study on 23,000 YouTube videos, we first evaluate different tag suggestion strategies employing user profiling (using tags from the user's upload history) as well as social signals (the channels a user subscribed to) and content analysis. Our results confirm earlier findings that, at least when employing users' original tags as ground truth, a history-based approach outperforms other techniques. Second, we suggest a novel approach that integrates the strengths of history-based tag suggestion with a content matching crowd-sourced from a large repository of user generated videos. Our approach performs a visual similarity matching and merges neighbors found in a large-scale reference dataset of user-tagged content with others from the user's personal history. This way, signals gained by crowd-sourcing can help to disambiguate tag suggestions, for example in cases of heterogeneous user interest profiles or non-existing user history. Our quantitative experiments indicate that such a personalized tag transfer gives strong improvements over a standard content matching, and moderate ones over a content-free history-based ranking.
Cited by: 2
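The personalized tag transfer described in the abstract above — blending tags from the uploader's history with tags from visually similar videos in a large reference set — can be sketched roughly as follows. This is a minimal illustration under assumed names and weights (`suggest_tags`, the blend weight `alpha`), not the authors' implementation; the paper's actual system also performs the visual similarity matching that produces the neighbor lists.

```python
# Hypothetical sketch: rank candidate tags by a weighted blend of two signals,
# (1) tag frequency in the user's own upload history and
# (2) tag frequency among visually similar videos from a reference dataset.
# Names and the blend weight are illustrative assumptions, not the paper's code.
from collections import Counter

def suggest_tags(history_tags, neighbor_tag_lists, alpha=0.6, top_k=5):
    """Blend history frequency and neighbor frequency into one ranking."""
    history = Counter(history_tags)
    neighbors = Counter(t for tags in neighbor_tag_lists for t in tags)
    # Normalize each signal so the blend weight alpha is meaningful.
    h_total = sum(history.values()) or 1
    n_total = sum(neighbors.values()) or 1
    candidates = set(history) | set(neighbors)
    scored = {
        t: alpha * history[t] / h_total + (1 - alpha) * neighbors[t] / n_total
        for t in candidates
    }
    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(scored, key=lambda t: (-scored[t], t))
    return ranked[:top_k]
```

With no upload history (`history_tags=[]`), the ranking falls back to pure content matching, which mirrors the abstract's point that crowd-sourced neighbors help when user history is missing.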
Journal: CrowdMM '12