What to do with 2.000.000 Historical Press Photos? The Challenges and Opportunities of Applying a Scene Detection Algorithm to a Digitised Press Photo Collection

M. Wevers, N. Vriend, Alexander De Bruin
Journal: TMG Journal for Media History
Publication date: 2022-06-07
Publication type: Journal Article
DOI: 10.18146/tmg.815 (https://doi.org/10.18146/tmg.815)
Citations: 3

Abstract

In 1962, Dutch celebrity Ria Kuyken was attacked by a circus bear. Cees de Boer captured this moment, for which he was awarded both a World Press Photo and the Silver Camera (Zilveren Camera). Though this photo popularised Fotopersbureau De Boer, which Cees had founded in 1945, the importance of the collection lies in its scale: approximately 2,000,000 photos of about 250,000 events, taken over sixty years and accompanied by extensive metadata. Not only major nationwide events are represented, but also small-scale, human-interest subjects, such as the shopkeeper around the corner. Our aim is not only to digitise and publish all 2,000,000 photo negatives of Fotopersbureau De Boer but also to explore how artificial intelligence can enrich this collection, benefiting both users of the archive and cultural historians studying historical photographs. One of our efforts focuses on scene detection, a method to detect the ‘scene’ represented in an image (Zhou et al., 2018). We will rely on transfer learning to adapt existing computer vision models to our collection and the needs of our users. Existing models can generate labels with high accuracy; however, these labels are ahistorical and more often than not irrelevant to our collection. We will label subsets of the images via crowdsourcing to train and improve existing models. In this way, we can add labels to the model that are relevant to our collection and absent from existing models. In this paper, we highlight the opportunities and challenges of applying artificial intelligence to a collection of historical photographs.
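
The transfer-learning step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes the simplest variant of transfer learning, in which the feature extractor of an existing model is kept frozen and only a new classification layer is trained on crowd-labelled examples. To keep the sketch runnable without external libraries, the pretrained backbone is replaced by a hypothetical stand-in function (`frozen_features`), the "images" are short lists of pixel intensities, and the labels `shop_front` and `circus` are invented for illustration.

```python
import math

# Hypothetical stand-in for the frozen feature extractor of an existing scene model.
# A real system would reuse the pretrained layers of a vision model; this fixed
# function only makes the sketch runnable without external libraries.
def frozen_features(image):
    """Map a toy 'image' (pixel intensities in [0, 1]) to a small feature vector."""
    mean = sum(image) / len(image)
    contrast = max(image) - min(image)
    return [mean, contrast, 1.0]  # the trailing 1.0 acts as a bias term


def softmax(scores):
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train_new_head(samples, labels, epochs=300, lr=0.5):
    """Train a fresh linear softmax layer; the feature extractor is never updated."""
    n_features = len(frozen_features(samples[0][0]))
    weights = [[0.0] * n_features for _ in labels]
    for _ in range(epochs):
        for image, label in samples:
            feats = frozen_features(image)
            scores = [sum(w * f for w, f in zip(row, feats)) for row in weights]
            probs = softmax(scores)
            for k, row in enumerate(weights):
                # Gradient of the cross-entropy loss with respect to class k's score.
                grad = probs[k] - (1.0 if labels[k] == label else 0.0)
                for j in range(n_features):
                    row[j] -= lr * grad * feats[j]
    return weights


def predict(weights, labels, image):
    feats = frozen_features(image)
    scores = [sum(w * f for w, f in zip(row, feats)) for row in weights]
    return labels[scores.index(max(scores))]


# Hypothetical collection-specific labels and toy crowd-labelled examples.
labels = ["shop_front", "circus"]
crowd_labelled = [
    ([0.20, 0.25, 0.20, 0.30], "shop_front"),
    ([0.10, 0.15, 0.20, 0.10], "shop_front"),
    ([0.30, 0.25, 0.35, 0.30], "shop_front"),
    ([0.00, 1.00, 0.10, 0.90], "circus"),
    ([0.10, 0.90, 0.00, 1.00], "circus"),
    ([0.05, 0.95, 0.20, 0.80], "circus"),
]

weights = train_new_head(crowd_labelled, labels)
print(predict(weights, labels, [0.20, 0.20, 0.25, 0.15]))
print(predict(weights, labels, [0.00, 0.90, 0.10, 1.00]))
```

Training only the new layer keeps the number of learned parameters small, which matters when only a subset of a large collection has been labelled. Whether the project freezes the backbone or fine-tunes it is not stated in the abstract.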