{"title":"Session details: Keynote Talks","authors":"B. S. Manjunath","doi":"10.1145/3545210","DOIUrl":"https://doi.org/10.1145/3545210","url":null,"abstract":"","PeriodicalId":164949,"journal":{"name":"Proceedings of the 2022 ACM Workshop on Information Hiding and Multimedia Security","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125868224","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detector-Informed Batch Steganography and Pooled Steganalysis","authors":"Yassine Yousfi, Eli Dworetzky, J. Fridrich","doi":"10.1145/3531536.3532951","DOIUrl":"https://doi.org/10.1145/3531536.3532951","url":null,"abstract":"We study the problem of batch steganography when the senders use feedback from a steganography detector. This brings an additional level of complexity to the table due to the highly non-linear and non-Gaussian response of modern steganalysis detectors as well as the necessity to study the impact of the inevitable mismatch between senders' and Warden's detectors. Two payload spreaders are considered based on the oracle generating possible cover images. Three different pooling strategies are devised and studied for a more comprehensive assessment of security. Substantial security gains are observed with respect to previous art - the detector-agnostic image-merging sender. Close attention is paid to the impact of the information available to the Warden on security.","PeriodicalId":164949,"journal":{"name":"Proceedings of the 2022 ACM Workshop on Information Hiding and Multimedia Security","volume":"87 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124143833","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hidden in Plain Sight - Persistent Alternative Mass Storage Data Streams as a Means for Data Hiding With the Help of UEFI NVRAM and Implications for IT Forensics","authors":"Stefan Kiltz, R. Altschaffel, J. Dittmann","doi":"10.1145/3531536.3532965","DOIUrl":"https://doi.org/10.1145/3531536.3532965","url":null,"abstract":"This article presents a first study on the possibility of hiding data using the UEFI NVRAM of today's computer systems as a storage channel. Embedding and extraction of executable data as well as media data are discussed and demonstrated as a proof of concept. This is successfully evaluated using 10 different systems. This paper further explores the implications of data hiding within UEFI NVRAM for computer forensic investigations and provides forensics measures to address this new challenge.","PeriodicalId":164949,"journal":{"name":"Proceedings of the 2022 ACM Workshop on Information Hiding and Multimedia Security","volume":"77 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124701867","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Intellectual Property (IP) Protection for Deep Learning and Federated Learning Models","authors":"F. Koushanfar","doi":"10.1145/3531536.3532957","DOIUrl":"https://doi.org/10.1145/3531536.3532957","url":null,"abstract":"This talk focuses on end-to-end protection of the present and emerging Deep Learning (DL) and Federated Learning (FL) models. On the one hand, DL and FL models are usually trained by allocating significant computational resources to process massive training data. The built models are therefore considered as the owner's IP and need to be protected. On the other hand, malicious attackers may take advantage of the models for illegal usages. IP protection needs to be considered during the design and training of the DL models before the owners make their models publicly available. The tremendous parameter space of DL models allows them to learn hidden features automatically. We explore the 'over-parameterization' of DL models and demonstrate how to hide additional information within DL. Particularly, we discuss a number of our end-to-end automated frameworks over the past few years that leverage information hiding for IP protection, including: DeepSigns[5] and DeepMarks[2], the first DL watermarking and fingerprinting frameworks that work by embedding the owner's signature in the dynamic activations and output behaviors of the DL model; DeepAttest[1], the first hardware-based attestation framework for verifying the legitimacy of the deployed model via on-device attestation. We also develop a multi-bit black-box DNN watermarking scheme[3] and demonstrate spread spectrum-based DL watermarking[4]. In the context of Federated Learning (FL), we show how these results can be leveraged for the design of a novel holistic covert communication framework that allows stealthy information sharing between local clients while preserving FL convergence. We conclude by outlining the open challenges and emerging directions.","PeriodicalId":164949,"journal":{"name":"Proceedings of the 2022 ACM Workshop on Information Hiding and Multimedia Security","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128714870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identity-Referenced Deepfake Detection with Contrastive Learning","authors":"Dongyao Shen, Youjian Zhao, Chengbin Quan","doi":"10.1145/3531536.3532964","DOIUrl":"https://doi.org/10.1145/3531536.3532964","url":null,"abstract":"With current advancements in deep learning technology, it is becoming easier to create high-quality face forgery videos, causing concerns about the misuse of deepfake technology. In recent years, research on deepfake detection has become a popular topic. Many detection methods have been proposed, most of which focus on exploiting image artifacts or frequency domain features for detection. In this work, we propose using real images of the same identity as a reference to improve detection performance. Specifically, a real image of the same identity is used as a reference image and input into the model together with the image to be tested to learn the distinguishable identity representation, which is achieved by contrastive learning. Our method achieves superior performance on both FaceForensics++ and Celeb-DF with relatively little training data, and also achieves very competitive results on cross-manipulation and cross-dataset evaluations, demonstrating the effectiveness of our solution.","PeriodicalId":164949,"journal":{"name":"Proceedings of the 2022 ACM Workshop on Information Hiding and Multimedia Security","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132861488","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Few-shot Text Steganalysis Based on Attentional Meta-learner","authors":"Juan Wen, Ziwei Zhang, Y. Yang, Yiming Xue","doi":"10.1145/3531536.3532949","DOIUrl":"https://doi.org/10.1145/3531536.3532949","url":null,"abstract":"Text steganalysis is a technique to distinguish between steganographic text and normal text via statistical features. Current state-of-the-art text steganalysis models have two limitations. First, they need sufficient amounts of labeled data for training. Second, they lack the generalization ability on different detection tasks. In this paper, we propose a meta-learning framework for text steganalysis in the few-shot scenario to ensure model fast-adaptation between tasks. A general feature extractor based on BERT is applied to extract universal features among tasks, and a meta-learner based on attentional Bi-LSTM is employed to learn task-specific representations. A classifier trained on the support set calculates the prediction loss on the query set with a few samples to update the meta-learner. Extensive experiments show that our model can adapt fast among different steganalysis tasks through extremely few-shot samples, significantly improving detection performance compared with the state-of-the-art steganalysis models and other meta-learning methods.","PeriodicalId":164949,"journal":{"name":"Proceedings of the 2022 ACM Workshop on Information Hiding and Multimedia Security","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131505696","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Know Your Library: How the libjpeg Version Influences Compression and Decompression Results","authors":"Martin Benes, Nora Hofer, Rainer Böhme","doi":"10.1145/3531536.3532962","DOIUrl":"https://doi.org/10.1145/3531536.3532962","url":null,"abstract":"Introduced in 1991, libjpeg has become a well-established library for processing JPEG images. Many libraries in high-level languages use libjpeg under the hood. So far, little attention has been paid to the fact that different versions of the library produce different outputs for the same input. This may have implications on security-related applications, such as image forensics or steganalysis, where evidence is generated by tracking small, imperceptible changes in JPEG-compressed signals. This paper systematically analyses all libjpeg versions since 1998, including the forked libjpeg-turbo (in its latest version). It compares the outputs of compression and decompression operations for a range of parameter settings. We identify up to three distinct behaviors for compression and up to six for decompression.","PeriodicalId":164949,"journal":{"name":"Proceedings of the 2022 ACM Workshop on Information Hiding and Multimedia Security","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128348780","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Session details: Session 5: Security & Privacy II","authors":"Daniel Chew","doi":"10.1145/3545215","DOIUrl":"https://doi.org/10.1145/3545215","url":null,"abstract":"","PeriodicalId":164949,"journal":{"name":"Proceedings of the 2022 ACM Workshop on Information Hiding and Multimedia Security","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123002513","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Session details: Session 1: Forensics","authors":"Rainer Böhme","doi":"10.1145/3545211","DOIUrl":"https://doi.org/10.1145/3545211","url":null,"abstract":"","PeriodicalId":164949,"journal":{"name":"Proceedings of the 2022 ACM Workshop on Information Hiding and Multimedia Security","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126300191","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hiding Needles in a Haystack: Towards Constructing Neural Networks that Evade Verification","authors":"Árpád Berta, Gábor Danner, István Hegedüs, Márk Jelasity","doi":"10.1145/3531536.3532966","DOIUrl":"https://doi.org/10.1145/3531536.3532966","url":null,"abstract":"Machine learning models are vulnerable to adversarial attacks, where a small, invisible, malicious perturbation of the input changes the predicted label. A large area of research is concerned with verification techniques that attempt to decide whether a given model has adversarial inputs close to a given benign input. Here, we show that current approaches to verification have a key vulnerability: we construct a model that is not robust but passes current verifiers. The idea is to insert artificial adversarial perturbations by adding a backdoor to a robust neural network model. In our construction, the adversarial input subspace that triggers the backdoor has a very small volume, and outside this subspace the gradient of the model is identical to that of the clean model. In other words, we seek to create a \"needle in a haystack\" search problem. For practical purposes, we also require that the adversarial samples be robust to JPEG compression. Large \"needle in the haystack\" problems are practically impossible to solve with any search algorithm. Formal verifiers can handle this in principle, but they do not scale up to real-world networks at the moment, and achieving this is a challenge because the verification problem is NP-complete. Our construction is based on training a hiding and a revealing network using deep steganography. Using the revealing network, we create a separate backdoor network and integrate it into the target network. We train our deep steganography networks over the CIFAR-10 dataset. We then evaluate our construction using state-of-the-art adversarial attacks and backdoor detectors over the CIFAR-10 and the ImageNet datasets. We made the code and models publicly available at https://github.com/szegedai/hiding-needles-in-a-haystack.","PeriodicalId":164949,"journal":{"name":"Proceedings of the 2022 ACM Workshop on Information Hiding and Multimedia Security","volume":"223 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114600573","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}