A Novel Adversarial Gray-Box Attack on DCT-based Face Deepfake Detectors

Guarnera F.; Guarnera L.; Ortis A.; Battiato S.
2025-01-01

Abstract

In recent years, several techniques have been developed to detect deepfake images, with approaches that exploit analytical traces (e.g., frequency-domain features), such as those derived from the Discrete Cosine Transform (DCT), proving particularly successful. Despite their effectiveness, these detectors remain vulnerable to adversarial attacks. In this paper, we introduce a novel gray-box adversarial attack specifically designed to evade DCT-based deepfake detectors. Our method accurately tunes the AC coefficient statistics of synthetic images to closely match those of real ones, while preserving high visual quality. The attack assumes full knowledge of the DCT feature extraction process, but no access to the internal parameters of the classifiers. We evaluate the proposed method against a set of DCT-based detectors using deepfakes generated by both Generative Adversarial Networks (GANs) and Diffusion Models (DMs). Experimental results show a significant degradation in detection performance, exposing critical weaknesses in systems traditionally considered interpretable and robust. This work raises important concerns about the reliability of frequency-domain detectors in forensic and cybersecurity applications.
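The abstract refers to features built from the statistics of DCT AC coefficients. A minimal sketch of such a feature extractor is shown below; the 8×8 block size, ortho-normalized DCT, and use of per-frequency standard deviations are illustrative assumptions, not the authors' exact pipeline.

```python
import numpy as np
from scipy.fft import dctn

def dct_ac_statistics(image, block=8):
    """Per-frequency standard deviations of AC coefficients over all
    non-overlapping block-wise DCTs of a grayscale image. This is one
    common way DCT-based detectors summarize frequency statistics;
    the exact choices here (block size, norm, statistic) are assumptions."""
    h, w = image.shape
    h, w = h - h % block, w - w % block  # crop to a multiple of the block size
    coeffs = []
    for i in range(0, h, block):
        for j in range(0, w, block):
            c = dctn(image[i:i + block, j:j + block].astype(float), norm="ortho")
            coeffs.append(c.ravel())
    coeffs = np.asarray(coeffs)        # shape: (num_blocks, block*block)
    return coeffs[:, 1:].std(axis=0)   # drop the DC term, keep 63 AC statistics

# Usage: a detector would train a classifier on these 63-dimensional
# feature vectors; the gray-box attack would perturb a synthetic image
# so that its vector moves toward the statistics of real images.
features = dct_ac_statistics(np.random.default_rng(0).random((64, 64)))
```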
Adversarial imaging
DCT analysis
Deepfake images
Gray-box attack
Files in this record:
There are no files associated with this record.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11769/687430