Neural Encoding for Image Recall: Human-Like Memory

arXiv - CS - Computer Vision and Pattern Recognition, Pub Date: 2024-09-18, DOI: arxiv-2409.11750

Virgile Foussereau, Robin Dumas


Abstract

Achieving human-like memory recall in artificial systems remains a challenging frontier in computer vision. Humans demonstrate a remarkable ability to recall images after a single exposure, even after being shown thousands of images. However, this capacity diminishes significantly for non-natural stimuli such as random textures. In this paper, we present a method inspired by human memory processes to bridge this gap between artificial and biological memory systems. Our approach encodes images to mimic the high-level information retained by the human brain, rather than storing raw pixel data. By adding noise to images before encoding, we introduce variability akin to the non-deterministic nature of human memory encoding. Leveraging the embedding layers of pre-trained models, we explore how different architectures encode images and how this affects memory recall. Our method reaches 97% accuracy on natural images and near-random performance (52%) on textures. We provide insights into the encoding process and its implications for machine learning memory systems, shedding light on the parallels between human and artificial intelligence memory mechanisms.
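The pipeline described in the abstract (add noise, encode, store only the embedding, then test recall) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a fixed random projection stands in for a pre-trained model's embedding layer, the images are synthetic arrays, the recall test is assumed to be a two-alternative choice between a seen and an unseen image, and all function names and the `noise_std` parameter are hypothetical.

```python
import numpy as np

def encode(image, projection, rng, noise_std=0.1):
    """Perturb the image with Gaussian noise, then return a unit-length embedding."""
    noisy = image + rng.normal(0.0, noise_std, size=image.shape)
    embedding = projection @ noisy.ravel()
    return embedding / np.linalg.norm(embedding)

def familiarity(image, memory, projection, rng, noise_std=0.1):
    """Highest cosine similarity between a fresh noisy encoding and any stored one."""
    query = encode(image, projection, rng, noise_std)
    return float(np.max(memory @ query))

def recall_accuracy(seen, unseen, projection, rng, noise_std=0.1):
    """Fraction of seen/unseen pairs in which the seen image looks more familiar."""
    # Single exposure: each seen image is encoded once; only the embedding is kept.
    memory = np.stack([encode(img, projection, rng, noise_std) for img in seen])
    hits = [
        familiarity(s, memory, projection, rng, noise_std)
        > familiarity(u, memory, projection, rng, noise_std)
        for s, u in zip(seen, unseen)
    ]
    return float(np.mean(hits))

rng = np.random.default_rng(0)
projection = rng.normal(size=(64, 256))  # assumption: stand-in for a pre-trained encoder
seen = rng.normal(size=(50, 16, 16))     # synthetic "images", not real data
unseen = rng.normal(size=(50, 16, 16))
accuracy = recall_accuracy(seen, unseen, projection, rng)
```

Because the encoder and data here are synthetic, the resulting accuracy says nothing about the figures reported in the abstract; reproducing those would require a real pre-trained model and the natural-image and texture stimuli used in the paper.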


