Classification of Confidential Images Using Neural Hash

NaUKMA Research Papers. Computer Science. Pub Date: 2023-02-24. DOI: 10.18523/2617-3808.2022.5.68-71

Olena Buchko, San Byn Nhuien


Citations: 1

Abstract

People generate a considerable amount of information with their devices: smartphones, laptops, and tablets. Users upload images to various platforms, such as social networks, messengers, web services, and other applications, which greatly endangers their personal information. User privacy has been exploited on the Internet for a long time. Interested parties lure potential customers into a trap of offers and services using information such as age, weight, nationality, religion, and preferences. Users sometimes do not recognize the sensitive information contained in personal images as dangerous to share, so owners can easily share it online without a second thought.

This article examines a neural hash algorithm for classifying images that contain confidential information and evaluates it with basic metrics. The main idea of the algorithm is to find similar images that serve as examples for defining classes. The algorithm uses hash codes, ensuring users' privacy. The evaluation of the algorithm is based on "The Visual Privacy (VISPR) Dataset". The main components of the algorithm are a neural network that generates vectors of extracted features for images and an indexed set of images (hash tables) that stores knowledge about a particular domain.

The critical aspect of the algorithm is that similar images produce colliding hash codes because their vectors of extracted features are similar. The resulting hash codes can be identical or differ by a specific Hamming distance. Multiple hash tables with different hash functions are used to increase the recall or precision of the results. The effect of an imperfect taxonomy was analyzed, which led to further filtering of abstract classes and higher overall scores.

The article also investigates the "pseudo-adaptivity" of the algorithm: the ability to classify new classes and to add new cases to existing classes that were not included in the training stages. This ability may be crucial for domains with many image instances or classes.
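The lookup described in the abstract can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation. The abstract does not name the hash function, so the sketch assumes sign-of-random-projection hashing, a common locality-sensitive choice. The class `HashIndex`, its parameters (`bits`, `tables`, `max_hamming`), the labels, and the hand-written feature vectors are all hypothetical stand-ins; in the article the vectors come from a neural network.

```python
import random
from collections import Counter, defaultdict


class HashIndex:
    """Illustrative multi-table index; names and defaults are assumptions."""

    def __init__(self, dim, bits=8, tables=4, seed=0):
        rng = random.Random(seed)
        # Each table gets its own random hyperplanes, i.e. its own hash function.
        self.planes = [
            [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]
            for _ in range(tables)
        ]
        self.tables = [defaultdict(list) for _ in range(tables)]

    @staticmethod
    def _code(planes, vec):
        # Bit i records which side of hyperplane i the feature vector falls on.
        return tuple(
            int(sum(p * x for p, x in zip(plane, vec)) >= 0) for plane in planes
        )

    def add(self, vec, label):
        # Indexing a labeled example needs no retraining: only codes are stored.
        for planes, table in zip(self.planes, self.tables):
            table[self._code(planes, vec)].append(label)

    def classify(self, vec, max_hamming=1):
        votes = Counter()
        for planes, table in zip(self.planes, self.tables):
            code = self._code(planes, vec)
            # Linear scan over buckets for brevity; a real index would probe
            # only the codes within the Hamming radius.
            for stored, labels in table.items():
                if sum(a != b for a, b in zip(code, stored)) <= max_hamming:
                    votes.update(labels)
        return votes.most_common(1)[0][0] if votes else None


# Hand-written stand-ins for feature vectors produced by the neural network.
passport = [0.9, -0.2, 0.4, 0.1]
index = HashIndex(dim=4)
index.add(passport, "passport")

# A positively scaled copy has the same signs, hence the same hash codes.
print(index.classify([2 * x for x in passport]))

# A class can be added later by indexing one example of it.
card = [-x for x in passport]
index.add(card, "credit_card")
print(index.classify(card))
```

In this sketch, adding tables or widening `max_hamming` lets more stored codes match a query, which tends to raise recall at the cost of precision. The last three lines mirror the "pseudo-adaptivity" idea: a new class becomes retrievable as soon as one labeled example is indexed.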

