Abstract
Recent progress in artificial intelligence has made it easier to edit facial movements in videos or create face substitutions, posing new challenges to fake-face detection. Although multimedia forensics offers many detection algorithms from a traditional point of view, it is increasingly hard to distinguish fake videos from real ones as updated forgery technologies make them more sophisticated and plausible. In this paper, we introduce a motion-discrepancy-based method that effectively differentiates AI-generated fake videos from real ones. The amplitude of facial motions in videos is first magnified, after which fake videos show more severe distortion or flicker than pristine videos. We pre-train a deep CNN on frames extracted from the training videos, and the output vectors of the frame sequences are used as input to an LSTM in a second training stage. Our approach is evaluated on FaceForensics++, a large fake-video dataset produced with various advanced generation technologies, and it shows performance superior to existing pixel-based fake-video forensics approaches.
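The magnification step the abstract describes rests on the Eulerian idea: band-pass each pixel's intensity over time and re-add the amplified band, so subtle motions (and forgery-induced jitter) become visible. The sketch below illustrates this on a single pixel's time series; the function name `magnify`, its parameters, and the filter choice (a difference of two exponential moving averages as a crude temporal band-pass) are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of Eulerian-style motion magnification on one pixel's
# brightness over time. A real pipeline applies this per pixel (or per
# spatial-frequency band) across video frames; here a 1-D signal stands
# in for that temporal trace.

import math


def magnify(signal, alpha=10.0, fast=0.4, slow=0.05):
    """Amplify subtle temporal variation in `signal` by roughly `alpha`.

    `fast` and `slow` are smoothing factors of two exponential moving
    averages; their difference acts as a crude temporal band-pass.
    """
    lo_fast = lo_slow = signal[0]
    out = []
    for x in signal:
        lo_fast += fast * (x - lo_fast)   # tracks quicker changes
        lo_slow += slow * (x - lo_slow)   # tracks the slow trend
        band = lo_fast - lo_slow          # band-passed motion component
        out.append(x + alpha * band)      # re-add the amplified motion
    return out


if __name__ == "__main__":
    # A pixel with a tiny 0.5-unit oscillation around brightness 128.
    sig = [128 + 0.5 * math.sin(0.8 * t) for t in range(100)]
    mag = magnify(sig)
    print(max(sig) - min(sig), max(mag) - min(mag))
```

After magnification, the oscillation's peak-to-peak swing is noticeably larger, which is what lets the subsequent CNN-LSTM stage pick up distortion and flicker that the forgery introduced but that is too subtle in the original frames.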
Acknowledgements
This work is supported in part by the Jiangsu Basic Research Programs-Natural Science Foundation under grant number BK20181407, in part by the National Natural Science Foundation of China under grant number 61672294, in part by the Six Peak Talent project of Jiangsu Province (R2016L13), the Qing Lan Project of Jiangsu Province, and the "333" project of Jiangsu Province, in part by the National Natural Science Foundation of China under grant numbers U1836208, 61502242, 61702276, U1536206, 61772283, 61602253, 61601236, and 61572258, in part by the National Key R&D Program of China under grant 2018YFB1003205, in part by NRF-2016R1D1A1B03933294, in part by the Jiangsu Basic Research Programs-Natural Science Foundation under grant numbers BK20150925 and BK20151530, in part by the Humanity and Social Science Youth Foundation of the Ministry of Education of China (15YJC870021), in part by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD) fund, and in part by the Collaborative Innovation Center of Atmospheric Environment and Equipment Technology (CICAEET) fund, China. Zhihua Xia is supported by the BK21+ program from the Ministry of Education of Korea.
Author information
Contributions
All authors contributed to the study conception and design. Material preparation, data collection, and analysis were performed by Jianwei Fei, Zhihua Xia, Peipeng Yu, and Fengjun Xiao. The evaluation was done by Jianwei Fei and Peipeng Yu. The first draft of the manuscript was written by Jianwei Fei, Zhihua Xia, and Fengjun Xiao; all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
There are no potential conflicts of interest in this work, and no human participants or animals were involved.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Fei, J., Xia, Z., Yu, P. et al. Exposing AI-generated videos with motion magnification. Multimed Tools Appl 80, 30789–30802 (2021). https://doi.org/10.1007/s11042-020-09147-3