
Exposing AI-generated videos with motion magnification

Published in Multimedia Tools and Applications.

Abstract

Recent progress in artificial intelligence has made it easier to edit facial movements in videos or substitute one face for another, posing new challenges for fake-face detection. Although multimedia forensics offers many detection algorithms from a traditional point of view, it is increasingly hard to discriminate fake videos from real ones as updated forgery technologies make them more sophisticated and plausible. In this paper, we introduce a motion-discrepancy-based method that effectively differentiates AI-generated fake videos from real ones. The amplitude of facial motion in a video is first magnified; fake videos then exhibit more severe distortion or flicker than pristine videos. We pre-train a deep CNN on frames extracted from the training videos, and in a second training stage the output vectors of the frame sequences are fed to an LSTM. Our approach is evaluated on FaceForensics++, a large fake-video dataset produced with various advanced generation technologies, and shows superior performance compared with existing pixel-based fake-video forensics approaches.
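The magnification step described above can be sketched with a minimal linear Eulerian video magnification pass: temporally band-pass each pixel's intensity signal and add the amplified band back to the original frames, so that subtle temporal artifacts become visible flicker. This is an illustrative sketch, not the paper's implementation; the function name `magnify_motion`, the band limits, and the amplification factor `alpha` are all assumed values.

```python
import numpy as np

def magnify_motion(frames, alpha=10.0, lo=0.5, hi=3.0, fps=30.0):
    """Sketch of linear Eulerian motion magnification: band-pass each
    pixel's temporal signal with an FFT mask, then add the amplified
    band back to the original frames.

    frames: float array of shape (T, H, W) with values in [0, 1].
    """
    # Temporal FFT over the frame axis, one spectrum per pixel
    spectrum = np.fft.fft(frames, axis=0)
    freqs = np.fft.fftfreq(frames.shape[0], d=1.0 / fps)
    # Keep only frequencies inside the pass band (drops the DC component)
    band = (np.abs(freqs) >= lo) & (np.abs(freqs) <= hi)
    filtered = np.fft.ifft(spectrum * band[:, None, None], axis=0).real
    # Amplify the band-passed motion signal and clip back to valid range
    return np.clip(frames + alpha * filtered, 0.0, 1.0)

# Toy clip: a static image plus a faint 1 Hz flicker that magnification exposes.
t = np.arange(60) / 30.0
clip = 0.5 + 0.005 * np.sin(2 * np.pi * 1.0 * t)[:, None, None] * np.ones((60, 8, 8))
out = magnify_motion(clip)
print(out.std() > clip.std())  # True: temporal variation is amplified
```

In the paper's pipeline this kind of magnified clip, rather than the raw one, would be fed to the per-frame CNN whose feature sequences the LSTM then classifies as real or fake.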




Acknowledgements

This work is supported in part by the Jiangsu Basic Research Programs-Natural Science Foundation under grant number BK20181407; in part by the National Natural Science Foundation of China under grant number 61672294; in part by the Six Talent Peaks Project of Jiangsu Province (R2016L13), the Qing Lan Project of Jiangsu Province, and the "333" Project of Jiangsu Province; in part by the National Natural Science Foundation of China under grant numbers U1836208, 61502242, 61702276, U1536206, 61772283, 61602253, 61601236, and 61572258; in part by the National Key R&D Program of China under grant 2018YFB1003205; in part by NRF-2016R1D1A1B03933294; in part by the Jiangsu Basic Research Programs-Natural Science Foundation under grant numbers BK20150925 and BK20151530; in part by the Humanity and Social Science Youth Foundation of the Ministry of Education of China (15YJC870021); in part by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD) fund; and in part by the Collaborative Innovation Center of Atmospheric Environment and Equipment Technology (CICAEET) fund, China. Zhihua Xia is supported by the BK21+ program of the Ministry of Education of Korea.

Author information

Authors and Affiliations

Authors

Contributions

All authors contributed to the study conception and design. Material preparation, data collection, and analysis were performed by Jianwei Fei, Zhihua Xia, Peipeng Yu, and Fengjun Xiao. The evaluation was done by Jianwei Fei and Peipeng Yu. The first draft of the manuscript was written by Jianwei Fei, Zhihua Xia, and Fengjun Xiao, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Zhihua Xia.

Ethics declarations

Conflict of interest

The authors declare no potential conflicts of interest. No human participants or animals were involved in this work.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Fei, J., Xia, Z., Yu, P. et al. Exposing AI-generated videos with motion magnification. Multimed Tools Appl 80, 30789–30802 (2021). https://doi.org/10.1007/s11042-020-09147-3


