DOI: 10.1145/3369412.3395070

Exploiting Prediction Error Inconsistencies through LSTM-based Classifiers to Detect Deepfake Videos

Published: 23 June 2020

ABSTRACT

The ability of artificial intelligence techniques to synthesize entirely new videos or to alter the facial expressions in existing ones has been convincingly demonstrated in the literature. Identifying this new class of threats, generally known as Deepfakes but comprising several distinct techniques, is a fundamental task in multimedia forensics. Such manipulated content can easily distort and undermine public opinion about a specific person or event. In this paper, a new technique for distinguishing synthetically generated portrait videos from natural ones is introduced, exploiting inconsistencies in the prediction error that arise during the re-encoding phase. In particular, features based on the inter-frame prediction error are investigated jointly with a Long Short-Term Memory (LSTM) network that learns the temporal correlation among consecutive frames. Preliminary results show that this sequence-based approach achieves promising performance in distinguishing original from manipulated videos.
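To make the described pipeline concrete, the following sketch (not the authors' code) approximates the inter-frame prediction error with a simple previous-frame residual and feeds per-frame statistics of that residual into a small LSTM classifier. The sequence length, the choice of residual statistics as features, and all function and variable names are illustrative assumptions; the paper's actual features are derived from the codec's prediction error during re-encoding.

```python
# Minimal sketch, assuming a frame-difference proxy for the inter-frame
# prediction error and a small LSTM binary classifier (natural vs. synthetic).
import cv2
import numpy as np
import tensorflow as tf

SEQ_LEN = 20            # consecutive frames per sample (assumed)
FEATURES_PER_FRAME = 2  # mean and std of the residual (assumed feature set)

def prediction_error_features(video_path, seq_len=SEQ_LEN):
    """Return a (seq_len, FEATURES_PER_FRAME) array of residual statistics."""
    cap = cv2.VideoCapture(video_path)
    prev, feats = None, []
    while len(feats) < seq_len:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY).astype(np.float32)
        if prev is not None:
            # crude stand-in for the codec's prediction error
            residual = np.abs(gray - prev)
            feats.append([residual.mean(), residual.std()])
        prev = gray
    cap.release()
    feats = np.array(feats, dtype=np.float32).reshape(-1, FEATURES_PER_FRAME)
    # pad short videos so every sample has the same sequence length
    if len(feats) < seq_len:
        pad = np.zeros((seq_len - len(feats), FEATURES_PER_FRAME), np.float32)
        feats = np.vstack([feats, pad])
    return feats

# LSTM over the per-frame feature sequence, sigmoid output for the binary label.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(SEQ_LEN, FEATURES_PER_FRAME)),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(X_train, y_train, ...)  # X_train: (num_videos, SEQ_LEN, FEATURES_PER_FRAME)
```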

• Published in

  IH&MMSec '20: Proceedings of the 2020 ACM Workshop on Information Hiding and Multimedia Security
  June 2020, 177 pages
  ISBN: 9781450370509
  DOI: 10.1145/3369412
  Copyright © 2020 ACM

  Publisher: Association for Computing Machinery, New York, NY, United States

      Publication History

      • Published: 23 June 2020


      Qualifiers

      • short-paper

      Acceptance Rates

Overall Acceptance Rate: 128 of 318 submissions, 40%
