GIAD-ST: Detecting anomalies in human monitoring based on generative inpainting via self-supervised multi-task learning

Dong, Ning; Suzuki, Einoshin

doi:10.1007/s10844-022-00722-8

Published: 30 June 2022

Volume 59, pages 733–754, (2022)

Journal of Intelligent Information Systems

Abstract

In this paper, we propose a generative inpainting-based method that detects anomalous images in human monitoring via self-supervised multi-task learning. Our previous methods employ a deep captioning model to find salient regions in an image and exploit the caption information of each region; they detect anomalies at the region level by considering the relations between overlapping regions. Here, we focus on image-level detection, which is preferable when users want an immediate alert and handle the anomalies themselves. In this setting, the previous methods can fall short because they rely on the salient regions and neglect non-overlapping regions. Moreover, they treat all regions as equally important, so their performance is easily influenced by unimportant regions. To alleviate these problems, we first employ inpainting techniques with a purpose-designed local and global loss to better capture the relation between a region and its surrounding area in an image. We then propose an attention-based Gaussian weighting anomaly score that combines all the regions according to their importance, mitigating the influence of unimportant regions. The attention mechanism exploits multi-task learning for higher accuracy. Extensive experiments on two real-world datasets demonstrate the superiority of our method over the baselines in terms of AUROC, precision, and recall. Compared with the best baseline, the AUROC improves from 0.933 to 0.989 on one dataset and from 0.911 to 0.953 on the other.
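
For readers unfamiliar with the AUROC metric reported above, the following is a minimal sketch of how it can be computed from image-level anomaly scores. It uses the standard pairwise definition (the probability that a randomly chosen anomalous image receives a higher score than a randomly chosen normal one, with ties counted as one half). The function name `auroc` and the toy scores and labels are illustrative assumptions; they are not taken from the paper or its datasets.

```python
def auroc(scores, labels):
    """Pairwise AUROC: fraction of (anomalous, normal) pairs ranked correctly.

    scores: anomaly scores, higher means more anomalous.
    labels: 1 for anomalous images, 0 for normal images.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one anomalous and one normal sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # anomalous image correctly ranked above normal one
            elif p == n:
                wins += 0.5   # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Hypothetical scores for three anomalous and three normal images.
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]
print(auroc(toy_scores, toy_labels))
```

In this toy case eight of the nine anomalous/normal pairs are ranked correctly. The quadratic loop is chosen for clarity; a rank-based implementation would be preferable for large test sets.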

Data availability

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

Notes

Hatae et al. (2020) and Fadjrimiratno et al. (2021) also considered anomalous positions, which are omitted in this paper because our target is human monitoring. We believe that such anomalies are rather uncommon in human monitoring.
Refer to https://github.com/Vious/LBAM_Pytorch/generateMask.py for more details on generating partial masks.
We adopted the standard implementation from their public code, which is available at: https://github.com/jcjohnson/densecap.
https://github.com/JiahuiYu/generative_inpainting
For simplicity, we use expressions of this form to represent the settings of a layer; e.g., K5S2P1 denotes a kernel size of 5, a stride of 2, and a padding of 1.
https://pytorch.org/
https://github.com/samet-akcay/ganomaly
https://github.com/samet-akcay/skip-ganomaly
https://github.com/Runinho/pytorch-cutpaste
https://github.com/plutoyuxie/Reconstruction-by-inpainting-for-visual-anomaly-detection
We did not use their real-time detection on an autonomous robot.
Their first target was anomalous image region detection.
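
The layer notation described in the notes above (e.g., K5S2P1) can be illustrated with a short sketch. The helper names `parse_layer_setting` and `conv_output_size` and the input size of 256 are assumptions made for illustration; they do not come from the paper. The output-size expression is the standard formula for a convolution without dilation.

```python
import re


def parse_layer_setting(expr):
    """Parse an expression such as 'K5S2P1' into kernel size, stride, and padding."""
    match = re.fullmatch(r"K(\d+)S(\d+)P(\d+)", expr)
    if match is None:
        raise ValueError(f"not a K<k>S<s>P<p> expression: {expr!r}")
    kernel, stride, padding = (int(g) for g in match.groups())
    return {"kernel": kernel, "stride": stride, "padding": padding}


def conv_output_size(size, setting):
    """Spatial output size of a convolution: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * setting["padding"] - setting["kernel"]) // setting["stride"] + 1


setting = parse_layer_setting("K5S2P1")
print(setting)
# Assumed input width of 256 pixels, for illustration only.
print(conv_output_size(256, setting))
```

Under these assumptions, a K5S2P1 layer roughly halves the spatial resolution of its input.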

References

Akcay, S., Atapour-Abarghouei, A., & Breckon, T.P. (2018). Ganomaly: Semi-supervised anomaly detection via Adversarial training. In Asian conference on computer vision, ACCV (pp. 622–637). https://doi.org/10.1007/978-3-030-20893-6_39
Akcay, S., Atapour-Abarghouei, A., & Breckon, T.P. (2019). Skip-Ganomaly: Skip connected and adversarially trained encoder-decoder anomaly detection. In International joint conference on neural network, IJCNN (pp. 1–8). https://doi.org/10.1109/IJCNN.2019.8851808
Arjovsky, M., & Bottou, L. (2017). Towards principled methods for training generative adversarial networks. In International conference on learning representations, ICLR.
Chandola, V., Banerjee, A., & Kumar, V. (2009). Anomaly detection: A survey. ACM Computing Surveys, 41(3), 1–58. https://doi.org/10.1145/1541880.1541882.
Chen, T., Zhai, X., Ritter, M., & et al. (2019). Self-supervised GANs via auxiliary rotation loss. In Conference on computer vision and pattern recognition, CVPR (pp. 12154–12163). https://doi.org/10.1109/CVPR.2019.01243
Choi, M.J., Torralba, A., & Willsky, A.S. (2012). Context models and out-of-context objects. Pattern Recognition Letters, 33(7), 853–862. https://doi.org/10.1016/j.patrec.2011.12.004.
Deguchi, Y., Takayama, D., Takano, S., & et al. (2017). Skeleton clustering by multi-robot monitoring for fall risk discovery. Journal of Intelligent Information Systems, 48(1), 75–115. https://doi.org/10.1007/s10844-015-0392-1.
Dong, N., Hatae, Y., Fadjrimiratno, M.F., & et al. (2020). Experimental evaluation of GAN-based one-class anomaly detection on office monitoring. In International symposium on methodologies for intelligent systems, ISMIS (pp. 214–224). https://doi.org/10.1007/978-3-030-59491-6_20
Dong, N., & Suzuki, E. (2021). GIAD: Generative inpainting-based anomaly detection via self-supervised learning for human monitoring. In Pacific Rim international conference on artificial intelligence, PRICAI, Part II (pp. 418–432). https://doi.org/10.1007/978-3-030-89363-7_32
Esterwood, C., & Robert, L.P. (2020). Personality in healthcare human robot interaction (H-HRI) a literature review and brief critique. In International conference on human-agent interaction, HAI (pp. 87–95). https://doi.org/10.1145/3406499.3415075
Fadjrimiratno, M.F., Hatae, Y., Matsukawa, T., & et al. (2021). Detecting anomalies from human activities by an autonomous mobile robot based on “Fast and Slow” thinking. In International joint conference on computer vision, imaging and computer graphics theory and applications, VISIGRAPP, Subvolume for VISAPP (Vol. 5 pp. 943–953). https://doi.org/10.5220/0010313509430953
Gidaris, S., Singh, P., & Komodakis, N. (2018). Unsupervised representation learning by predicting image rotations. In International conference on learning representations, ICLR.
Godard, C., Mac Aodha, O., & Brostow, G.J. (2017). Unsupervised monocular depth estimation with left-right consistency. In Conference on Computer Vision and Pattern Recognition, CVPR (pp. 270–279). https://doi.org/10.1109/CVPR.2017.699
Goodfellow, I.J., Pouget-Abadie, J., Mirza, M., & et al. (2014). Generative adversarial nets. In Neural information processing systems, NIPS (pp. 2672–2680).
Hatae, Y., Yang, Q., Fadjrimiratno, M.F., & et al. (2020). Detecting anomalous regions from an image based on deep captioning. In International joint conference on computer vision, imaging and computer graphics theory and applications, VISIGRAPP, Subvolume for VISAPP (Vol. 5 pp. 326–335). https://doi.org/10.5220/0008949603260335
Johnson, J., Karpathy, A., & Fei-Fei, L. (2016). Densecap: Fully convolutional localization networks for dense captioning. In Conference on computer vision and pattern recognition, CVPR (pp. 4565–4574). https://doi.org/10.1109/CVPR.2016.494
Kahneman, D. (2011). Thinking, fast and slow. New York: Macmillan.
Kimura, D., Chaudhury, S., Narita, M., & et al. (2020). Adversarial discriminative attention for robust anomaly detection. In Winter conference on applications of computer vision, WACV (pp. 2172–2181). https://doi.org/10.1109/WACV45572.2020.9093428
Kingma, D.P., & Ba, J. (2015). Adam: A method for stochastic optimization. In International conference on learning representations, ICLR.
Krishna, R., Zhu, Y., Groth, O., & et al. (2017). Visual Genome: Connecting language and vision using crowdsourced dense image annotations. International Journal of Computer Vision, 123(1), 32–73. https://doi.org/10.1007/s11263-016-0981-7.
Lawson, W., Bekele, E., & Sullivan, K. (2017). Finding anomalies with generative adversarial networks for a Patrolbot. In Conference on computer vision and pattern recognition, CVPR Workshops (pp. 12–13). https://doi.org/10.1109/CVPRW.2017.68
Li, C.-L., Sohn, K., Yoon, J., & et al. (2021). CutPaste: Self-supervised learning for anomaly detection and localization. In Conference on computer vision and pattern recognition, CVPR (pp. 9664–9674).
Liu, H., & Hoeber, O. (2011). A Luhn-inspired vector re-weighting approach for improving personalized web search. In International conferences on web intelligence and intelligent agent technology (pp. 301–305). https://doi.org/10.1109/WI-IAT.2011.130
Liu, Z., Nie, Y., Long, C., & et al. (2021). A hybrid video anomaly detection framework via memory-augmented flow reconstruction and flow-guided frame prediction. In International conference on computer vision, ICCV (pp. 13588–13597). https://doi.org/10.1109/ICCV48922.2021.01333
Liu, G., Reda, F.A., Shih, K.J., & et al. (2018). Image inpainting for irregular holes using partial convolutions. In European conference on computer vision, ECCV (pp. 85–100). https://doi.org/10.1007/978-3-030-01252-6_6
Liu, G., Zhang, Q., Cao, Y., & et al. (2021). Online human action recognition with spatial and temporal skeleton features using a distributed camera network. International Journal of Intelligent Systems, 36(12), 7389–7411. https://doi.org/10.1002/int.22591.
Miyato, T., Kataoka, T., Koyama, M., & et al. (2018). Spectral normalization for generative adversarial networks. In International conference on learning representations, ICLR.
Nguyen, B., Feldman, A., Bethapudi, S., & et al. (2021). Unsupervised region-based anomaly detection in brain MRI with adversarial image inpainting. In International symposium on biomedical imaging, ISBI (pp. 1127–1131). https://doi.org/10.1109/ISBI48211.2021.9434115
Oh, J., Kim, H.-I., & Park, R.-H. (2017). Context-based abnormal object detection using the fully-connected conditional random fields. Pattern Recognition Letters, 98, 16–25. https://doi.org/10.1016/j.patrec.2017.08.003.
Pang, G., Shen, C., Cao, L., & et al. (2021). Deep learning for anomaly detection: A review. ACM Computing Surveys, 54(2), 1–38. https://doi.org/10.1145/3439950.
Pathak, D., Krahenbuhl, P., Donahue, J., & et al. (2016). Context encoders: Feature learning by inpainting. In Conference on computer vision and pattern recognition, CVPR (pp. 2536–2544). https://doi.org/10.1109/CVPR.2016.278
Ravanbakhsh, M., Nabi, M., Sangineto, E., & et al. (2017). Abnormal event detection in videos using generative adversarial nets. In International conference on image processing, ICIP (pp. 1577–1581). https://doi.org/10.1109/ICIP.2017.8296547
Ravanbakhsh, M., Sangineto, E., Nabi, M., & et al. (2019). Training adversarial discriminators for cross-channel abnormal event detection in crowds. In IEEE winter conference on applications of computer vision, WACV (pp. 1896–1904). https://doi.org/10.1109/WACV.2019.00206
Sabokrou, M., Khalooei, M., Fathy, M., & et al. (2018). Adversarially learned one-class classifier for novelty detection. In Conference on computer vision and pattern recognition, CVPR (pp. 3379–3388). https://doi.org/10.1109/CVPR.2018.00356
Schlegl, T., Seeböck, P., Waldstein, S.M., & et al. (2017). Unsupervised anomaly detection with generative adversarial networks to guide marker discovery. In Information processing in medical imaging, IPMI (pp. 146–157). https://doi.org/10.1007/978-3-319-59050-9_12
Schlegl, T., Seeböck, P., Waldstein, S.M., & et al. (2019). F-AnoGAN: Fast unsupervised anomaly detection with generative adversarial networks. Medical Image Analysis, 54, 30–44. https://doi.org/10.1016/j.media.2019.01.010.
Selvaraju, R.R., Cogswell, M., Das, A., & et al. (2017). Grad-CAM: Visual explanations from deep networks via gradient-based localization. In International conference on computer vision, ICCV (pp. 618–626). https://doi.org/10.1109/ICCV.2017.74
Simonyan, K., & Zisserman, A. (2015). Very deep convolutional networks for large-scale image recognition. In International conference on learning representations, ICLR.
Singh, M., Mandal, M.K., & Basu, A. (2005). Gaussian and Laplacian of gaussian weighting functions for robust feature based tracking. Pattern Recognition Letters, 26(13), 1995–2005. https://doi.org/10.1016/j.patrec.2005.03.015.
Sultani, W., Chen, C., & Shah, M. (2018). Real-world anomaly detection in surveillance videos. In Conference on computer vision and pattern recognition, CVPR (pp. 6479–6488). https://doi.org/10.1109/CVPR.2018.00678
Wang, Z., Bovik, A.C., Sheikh, H.R., & et al. (2004). Image quality assessment: From error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4), 600–612. https://doi.org/10.1109/TIP.2003.819861.
Xie, C., Liu, S., Li, C., & et al. (2019). Image inpainting with learnable bidirectional attention maps. In International conference on computer vision, ICCV (pp. 8858–8867). https://doi.org/10.1109/ICCV.2019.00895
Yu, J., Lin, Z., Yang, J., & et al. (2018). Generative image inpainting with contextual attention. In Conference on computer vision and pattern recognition, CVPR (pp. 5505–5514). https://doi.org/10.1109/CVPR.2018.00577
Yu, J., Lin, Z., Yang, J., & et al. (2019). Free-form image inpainting with gated convolution. In International conference on computer vision, ICCV (pp. 4471–4480). https://doi.org/10.1109/ICCV.2019.00457
Zaheer, M.Z., Lee, J.-H., Astrid, M., & et al. (2020). Old is Gold: Redefining the adversarially learned one-class classifier training paradigm. In Conference on computer vision and pattern recognition, CVPR (pp. 14183–14193). https://doi.org/10.1109/CVPR42600.2020.01419
Zavrtanik, V., Kristan, M., & Skočaj, D. (2021). Reconstruction by inpainting for visual anomaly detection. Pattern Recognition, 112, 107706. https://doi.org/10.1016/j.patcog.2020.107706.
Zhang, Y., Bai, Y., Ding, M., & et al. (2020). Multi-task generative adversarial network for detecting small objects in the wild. International Journal of Computer Vision, 128(6), 1810–1828. https://doi.org/10.1007/s11263-020-01301-6.
Zhang, K., Fadjrimiratno, M.F., & Suzuki, E. (2021). Context-based anomaly detection via spatial attributed graphs in human monitoring. In International conference on neural information processing, ICONIP (pp. 450–463). https://doi.org/10.1007/978-3-030-92185-9_37
Zhang, T., Ramakrishnan, R., & Livny, M. (1996). BIRCH: An efficient data clustering method for very large databases. ACM Sigmod Record, 25(2), 103–114. https://doi.org/10.1145/233269.233324.
Zhang, M., Tseng, C., & Kreiman, G. (2020). Putting visual object recognition in context. In Conference on computer vision and pattern recognition, CVPR (pp. 12985–12994). https://doi.org/10.1109/CVPR42600.2020.01300
Zhao, H., Gallo, O., Frosio, I., & et al. (2017). Loss functions for image restoration with neural networks. IEEE Transactions on Computational Imaging, 3(1), 47–57. https://doi.org/10.1109/TCI.2016.2644865.

Acknowledgements

This work was partially supported by Japan Science and Technology Agency (JST) SPRING, Grant Number JPMJSP2136.

Funding

Japan Science and Technology Agency (JST) SPRING, Grant Number JPMJSP2136.

Author information

Authors and Affiliations

ISEE, Kyushu University, Fukuoka, 8190395, Japan
Ning Dong & Einoshin Suzuki

Corresponding authors

Correspondence to Ning Dong or Einoshin Suzuki.

Ethics declarations

Conflict of Interests

The authors have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Dong, N., Suzuki, E. GIAD-ST: Detecting anomalies in human monitoring based on generative inpainting via self-supervised multi-task learning. J Intell Inf Syst 59, 733–754 (2022). https://doi.org/10.1007/s10844-022-00722-8

Received: 17 January 2022
Revised: 07 June 2022
Accepted: 08 June 2022
Published: 30 June 2022
Issue Date: December 2022
DOI: https://doi.org/10.1007/s10844-022-00722-8
