Source Attribution of Online News Images by Compression Analysis

2021 IEEE International Workshop on Information Forensics and Security (WIFS). Pub Date: 2021-12-07. DOI: 10.1109/WIFS53200.2021.9648385

Michael Albright, Nitesh Menon, Kristy Roschke, Arslan Basharat


Citations: 1

Abstract

The rapid growth of online disinformation calls for new, robust digital forensics methods for validating the purported sources of multimodal news articles. We surveyed news photojournalists for insight into their workflows. A high percentage (91%) of respondents reported standardized photo publishing procedures, which we hypothesize facilitate source verification. In this work, we demonstrate that online news sites leave predictable and discernible patterns in the compression settings of the images they publish. We propose novel, simple, and very efficient algorithms that analyze image compression profiles for news source verification and identification. We evaluate the algorithms' effectiveness through extensive experiments on a newly released dataset of over 64K images from over 34K articles collected from 30 news sites. The image compression features are modeled with Naive Bayes variants or XGBoost classifiers for source attribution and verification. For these news sources, the proposed algorithms achieve 0.92–0.94 average AUC for source verification in a closed-set scenario and generalize to the open-set scenario with a reduction of only 0.0–0.04 in average AUC.
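The abstract does not specify the features or the exact Naive Bayes variants, so the following is only a minimal sketch of the general idea: treat each image's compression settings as a small set of discrete features and score a claimed source with a categorical Naive Bayes model. The feature names (`quality`, `subsampling`, `progressive`), the site labels, and the toy training data are all assumptions made for illustration and are not taken from the paper or its dataset.

```python
import math
from collections import Counter, defaultdict


class CategoricalNaiveBayes:
    """Naive Bayes over discrete features with Laplace smoothing (illustrative)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.class_counts = Counter()
        # counts[source][feature_name][value] -> number of training images
        self.counts = defaultdict(lambda: defaultdict(Counter))
        # vocab[feature_name] -> set of values seen during training
        self.vocab = defaultdict(set)

    def fit(self, samples, labels):
        for features, label in zip(samples, labels):
            self.class_counts[label] += 1
            for name, value in features.items():
                self.counts[label][name][value] += 1
                self.vocab[name].add(value)
        return self

    def log_joint(self, features, label):
        total = sum(self.class_counts.values())
        n = self.class_counts[label]
        score = math.log(n / total)  # log prior of the source
        for name, value in features.items():
            k = len(self.vocab[name]) + 1  # one extra bucket for unseen values
            c = self.counts[label][name][value]
            score += math.log((c + self.alpha) / (n + self.alpha * k))
        return score

    def posterior(self, features):
        labels = list(self.class_counts)
        logs = [self.log_joint(features, label) for label in labels]
        top = max(logs)
        exps = [math.exp(x - top) for x in logs]  # stabilized softmax
        z = sum(exps)
        return {label: e / z for label, e in zip(labels, exps)}


def verify(model, features, claimed_source):
    """Source verification: posterior probability of the claimed source."""
    return model.posterior(features).get(claimed_source, 0.0)


def identify(model, features):
    """Source identification: most probable source among the known ones."""
    post = model.posterior(features)
    return max(post, key=post.get)


# Hypothetical compression profiles; real ones would be read from image files.
train = [
    ({"quality": 85, "subsampling": "4:2:0", "progressive": True}, "site_a"),
    ({"quality": 85, "subsampling": "4:2:0", "progressive": True}, "site_a"),
    ({"quality": 80, "subsampling": "4:2:0", "progressive": True}, "site_a"),
    ({"quality": 92, "subsampling": "4:4:4", "progressive": False}, "site_b"),
    ({"quality": 92, "subsampling": "4:4:4", "progressive": False}, "site_b"),
    ({"quality": 90, "subsampling": "4:4:4", "progressive": False}, "site_b"),
]
model = CategoricalNaiveBayes().fit([f for f, _ in train], [s for _, s in train])

query = {"quality": 85, "subsampling": "4:2:0", "progressive": True}
score_a = verify(model, query, "site_a")
score_b = verify(model, query, "site_b")
best = identify(model, query)
print(best, round(score_a, 3), round(score_b, 3))
```

In this sketch, verification thresholds the posterior of the claimed source, and identification takes the most probable known source. It covers only the closed-set case; handling images from sources unseen in training (the open-set scenario described in the abstract) would need an additional rejection rule, which is not shown here.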
