Do you see what I see? Measuring the semantic differences in image-recognition services' outputs

Journal of the Association for Information Science and Technology | IF 2.8 | Zone 2 (Management) | JCR Q2 (Computer Science, Information Systems) | Pub Date: 2023-09-05 | DOI: 10.1002/asi.24827
Anton Berg, Matti Nelimarkka
Volume 74, Issue 11, pp. 1307-1324. Full text: https://onlinelibrary.wiley.com/doi/10.1002/asi.24827
Citations: 0

Abstract

As scholars increasingly undertake large-scale analysis of visual materials, advanced computational tools show promise for informing that process. One technique in the toolbox is image recognition, made readily accessible via Google Vision AI, Microsoft Azure Computer Vision, and Amazon's Rekognition service. However, concerns about such issues as bias factors and low reliability have led to warnings against research employing it. A systematic study of cross-service label agreement concretized such issues: using eight datasets, spanning professionally produced and user-generated images, the work showed that image-recognition services disagree on the most suitable labels for images. Beyond supporting caveats expressed in prior literature, the report articulates two mitigation strategies, both involving the use of multiple image-recognition services: Highly explorative research could include all the labels, accepting noisier but less restrictive analysis output. Alternatively, scholars may employ word-embedding-based approaches to identify concepts that are similar enough for their purposes, then focus on those labels filtered in.
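
The two mitigation strategies named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the service names, labels, three-dimensional vectors, and the 0.8 threshold below are invented toy values, and the helper names (`jaccard`, `union_labels`, `similar_labels`) are hypothetical. A real analysis would take labels from the services' API responses and vectors from a trained word-embedding model.

```python
import math


def jaccard(a, b):
    """Exact-match agreement between two label lists (0 = none, 1 = identical)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def union_labels(outputs):
    """Strategy 1 (explorative): keep every label from every service."""
    return set().union(*outputs.values())


def similar_labels(outputs, vectors, threshold):
    """Strategy 2: keep a label only if another service returned a label
    whose embedding is at least `threshold` similar to it."""
    kept = set()
    for service, labels in outputs.items():
        for label in labels:
            for other, other_labels in outputs.items():
                if other == service:
                    continue
                if any(cosine(vectors[label], vectors[o]) >= threshold
                       for o in other_labels):
                    kept.add(label)
                    break
    return kept


# Toy inputs (assumptions for illustration only).
outputs = {
    "service_a": ["dog", "grass"],
    "service_b": ["puppy", "lawn"],
    "service_c": ["dog", "sky"],
}
vectors = {
    "dog": [1.0, 0.0, 0.0],
    "puppy": [0.9, 0.1, 0.0],
    "grass": [0.0, 1.0, 0.0],
    "lawn": [0.1, 0.9, 0.0],
    "sky": [0.0, 0.0, 1.0],
}

if __name__ == "__main__":
    print(jaccard(outputs["service_a"], outputs["service_b"]))
    print(sorted(union_labels(outputs)))
    print(sorted(similar_labels(outputs, vectors, threshold=0.8)))
```

In this toy case the first two services share no label strings even though their labels are near-synonyms, which is the kind of disagreement an exact-match measure reports and an embedding-based filter is meant to soften. The threshold controls the trade-off the abstract describes: lowering it moves toward the noisier, all-labels analysis.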

Source journal: Journal of the Association for Information Science and Technology (JASIST)
CiteScore: 8.30
Self-citation rate: 8.60%
Articles published: 115
About the journal: The Journal of the Association for Information Science and Technology (JASIST) is a leading international forum for peer-reviewed research in information science. For more than half a century, JASIST has provided intellectual leadership by publishing original research that focuses on the production, discovery, recording, storage, representation, retrieval, presentation, manipulation, dissemination, use, and evaluation of information and on the tools and techniques associated with these processes. The Journal welcomes rigorous work of an empirical, experimental, ethnographic, conceptual, historical, socio-technical, policy-analytic, or critical-theoretical nature. JASIST also commissions in-depth review articles (“Advances in Information Science”) and reviews of print and other media.