The Photographic Pipeline of Machine Vision; or, Machine Vision's Latent Photographic Theory

Critical AI. Pub Date: 2023-10-01. DOI: 10.1215/2834703x-10734066

Nicolas Malevé, Katrina Sluis

Abstract

Despite computer vision's extensive mobilization of cameras, photographers, and viewing subjects, photography's place in machine vision remains undertheorized. This article illuminates an operative theory of photography that exists in a latent form, embedded in the tools, practices, and discourses of machine vision research and enabling the methodological imperatives of dataset production. Focusing on the development of the canonical object recognition dataset ImageNet, the article analyzes how the dataset pipeline translates the radical polysemy of the photographic image into a stable and transparent form of data that can be portrayed as a proxy of human vision. Reflecting on the prominence of the photographic snapshot in machine vision discourse, the article traces the path that made this popular cultural practice amenable to the dataset. Following the evolution from nineteenth-century scientific photography to the acquisition of massive sets of online photos, the article shows how dataset creators inherit and transform a form of “instrumental realism,” a photographic enterprise that aims to establish a generalized look from contingent instances in the pursuit of statistical truth. The article concludes with a reflection on how the latent photographic theory of machine vision we have advanced relates to the large image models built for generative AI today.
