Do you see what I see? Measuring the semantic differences in image-recognition services' outputs

Journal of the Association for Information Science and Technology | Pub Date: 2023-09-05 | DOI: 10.1002/asi.24827 | IF 2.8 | JCR Q2 (Computer Science, Information Systems) | Region 2 (Management)

Anton Berg, Matti Nelimarkka

Journal of the Association for Information Science and Technology, vol. 74, no. 11, pp. 1307-1324. Journal Article. https://onlinelibrary.wiley.com/doi/10.1002/asi.24827

Citations: 0

Abstract




As scholars increasingly undertake large-scale analysis of visual materials, advanced computational tools show promise for informing that process. One technique in the toolbox is image recognition, made readily accessible via Google Vision AI, Microsoft Azure Computer Vision, and Amazon's Rekognition service. However, concerns about such issues as bias factors and low reliability have led to warnings against research employing it. A systematic study of cross-service label agreement concretized such issues: using eight datasets, spanning professionally produced and user-generated images, the work showed that image-recognition services disagree on the most suitable labels for images. Beyond supporting caveats expressed in prior literature, the report articulates two mitigation strategies, both involving the use of multiple image-recognition services: Highly explorative research could include all the labels, accepting noisier but less restrictive analysis output. Alternatively, scholars may employ word-embedding-based approaches to identify concepts that are similar enough for their purposes, then focus on those labels filtered in.
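The two mitigation strategies named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the service names, the labels, the three-dimensional vectors, and the 0.95 threshold below are all invented for illustration, and a real analysis would presumably use labels returned by the actual services and a trained word-embedding model.

```python
import math

# Hypothetical label outputs for one image from three services (made-up data).
labels_by_service = {
    "service_a": {"dog", "grass", "pet"},
    "service_b": {"puppy", "lawn"},
    "service_c": {"dog", "outdoors"},
}

# Toy 3-dimensional "embeddings", invented for illustration only.
toy_embeddings = {
    "dog": (0.9, 0.1, 0.0),
    "puppy": (0.85, 0.15, 0.05),
    "pet": (0.8, 0.2, 0.1),
    "grass": (0.1, 0.9, 0.0),
    "lawn": (0.15, 0.85, 0.1),
    "outdoors": (0.1, 0.5, 0.8),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def union_labels(by_service):
    """Strategy 1 (explorative): keep every label from every service."""
    return set().union(*by_service.values())


def corroborated_labels(by_service, embeddings, threshold=0.95):
    """Strategy 2: keep a label only if some other service returned a
    label whose embedding is at least `threshold` similar to it."""
    kept = set()
    for service, labels in by_service.items():
        # Labels produced by all services except the current one.
        others = set().union(
            *(labs for name, labs in by_service.items() if name != service)
        )
        for label in labels:
            if any(
                cosine(embeddings[label], embeddings[other]) >= threshold
                for other in others
            ):
                kept.add(label)
    return kept


all_labels = union_labels(labels_by_service)
filtered = corroborated_labels(labels_by_service, toy_embeddings)
print(sorted(all_labels))
print(sorted(filtered))
```

With these toy vectors, the union keeps all six labels, while the similarity filter drops "outdoors" because no other service returned a close enough concept. The threshold controls how strict "similar enough" is and would need to be chosen for the research purpose at hand.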


Source journal

Journal of the Association for Information Science and Technology (Computer Science, Information Systems)

CiteScore: 8.30

Self-citation rate: 8.60%

Articles published: 115

About the journal: The Journal of the Association for Information Science and Technology (JASIST) is a leading international forum for peer-reviewed research in information science. For more than half a century, JASIST has provided intellectual leadership by publishing original research that focuses on the production, discovery, recording, storage, representation, retrieval, presentation, manipulation, dissemination, use, and evaluation of information and on the tools and techniques associated with these processes. The Journal welcomes rigorous work of an empirical, experimental, ethnographic, conceptual, historical, socio-technical, policy-analytic, or critical-theoretical nature. JASIST also commissions in-depth review articles (“Advances in Information Science”) and reviews of print and other media.
