ABSTRACT
Artificial intelligence techniques can synthesize entirely new videos or alter the facial expressions in existing ones, as has been convincingly demonstrated in the literature. Identifying this new threat, generally known as Deepfake although it comprises several different techniques, is fundamental in multimedia forensics. Such manipulated content could easily distort public opinion about a person or a specific event. This paper therefore introduces a new technique that distinguishes synthetically generated portrait videos from natural ones by exploiting inconsistencies in the prediction error produced in the re-encoding phase. In particular, features based on the inter-frame prediction error are investigated jointly with a Long Short-Term Memory (LSTM) network able to learn the temporal correlation among consecutive frames. Preliminary results show that this sequence-based approach to distinguishing original from manipulated videos achieves promising performance.
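As a rough illustration of the idea summarized above, the following sketch turns a short frame sequence into a temporally ordered sequence of prediction-error features, the kind of input a sequence model such as an LSTM could consume. It is not the authors' feature extractor: it assumes a zero-motion predictor (the previous frame) in place of a codec's motion-compensated prediction, and the names `prediction_error`, `error_features`, and `feature_sequence`, as well as the two chosen statistics, are hypothetical.

```python
from statistics import mean
from typing import List, Tuple

# A grayscale frame as rows of pixel intensities (assumed representation).
Frame = List[List[float]]


def prediction_error(prev: Frame, curr: Frame) -> Frame:
    """Residual of a zero-motion predictor: curr minus prev, pixel by pixel.

    Assumption: a real encoder would use motion-compensated prediction;
    the previous frame is used here only to keep the sketch short.
    """
    return [[c - p for p, c in zip(prow, crow)] for prow, crow in zip(prev, curr)]


def error_features(residual: Frame) -> Tuple[float, float]:
    """Summarize a residual frame as (mean absolute error, mean squared error)."""
    flat = [v for row in residual for v in row]
    return mean(abs(v) for v in flat), mean(v * v for v in flat)


def feature_sequence(frames: List[Frame]) -> List[Tuple[float, float]]:
    """One feature vector per pair of consecutive frames, in temporal order."""
    return [error_features(prediction_error(a, b)) for a, b in zip(frames, frames[1:])]


# Three tiny 2x2 frames; each consecutive pair yields one feature vector.
frames = [
    [[10.0, 10.0], [10.0, 10.0]],
    [[10.0, 12.0], [10.0, 10.0]],
    [[14.0, 12.0], [10.0, 10.0]],
]
sequence = feature_sequence(frames)
```

A sequence of N frames yields N - 1 feature vectors, so a classifier operating on them sees the temporal evolution of the residual rather than any single frame in isolation.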
- Exploiting Prediction Error Inconsistencies through LSTM-based Classifiers to Detect Deepfake Videos