An automatic multi-camera-based event extraction system for real soccer videos

Zhang, Kailai; Wu, Ji; Tong, Xiaofeng; Wang, Yumeng

doi:10.1007/s10044-019-00830-2

Industrial and commercial application
Published: 26 June 2019

Pattern Analysis and Applications, volume 23, pages 953–965 (2020)

Kailai Zhang¹ (ORCID: orcid.org/0000-0002-9062-0865), Ji Wu¹, Xiaofeng Tong² & Yumeng Wang²

Abstract

In this article, we propose a novel and effective multi-camera system for extracting events from soccer matches. A precise ontological definition of soccer events is still an open question. Under our definition, the events are the free kick, the corner kick, the penalty kick and the goal, because these are the representative shots for the audience. These events are important for highlight selection and sports data analysis. At present, the events, together with the associated ball and player information, are selected and labeled manually from the images, which is a heavy workload for staff. To address this problem, our system extracts the events automatically. Given soccer videos, the system first uses a local-based deep neural network to detect the ball and the players in the input images. It then handles the ball and player bounding boxes separately. Each player is labeled as one of three types (one of the two teams, or the referee), and a novel unsupervised U-encoder is designed for this labeling. For the ball, the use of multiple cameras allows us to refine the detection results: we obtain the world coordinates of the ball from the camera parameters and then reconstruct the ball trajectory and the court in a top view. Based on the reconstructed map, we obtain the soccer events by motion analysis of the ball trajectory, and then use the ball location and player classification results to display the events for each camera. Test results on real European soccer league videos show good detection and labeling performance, and the system finds all the events in the test videos. The proposed system can handle many complex cases, such as occlusion and pose variation, that occur frequently in real applications.
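
The abstract does not say how the world coordinates of the ball are computed from the camera parameters. As an illustration only, the following minimal Python sketch shows one common approach, least-squares triangulation from viewing rays. It assumes that each camera's calibration has already been used to turn a ball detection into a ray (camera center plus direction) in world coordinates. The function names `triangulate` and `top_view`, and the camera positions in the demo, are hypothetical and are not taken from the paper.

```python
from math import sqrt


def _normalize(v):
    """Return v scaled to unit length."""
    n = sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def _det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def _solve3(a, b):
    """Solve the 3x3 linear system a @ x = b with Cramer's rule."""
    d = _det3(a)
    if abs(d) < 1e-12:
        raise ValueError("rays are (nearly) parallel; cannot triangulate")
    x = []
    for k in range(3):
        ak = [row[:] for row in a]
        for i in range(3):
            ak[i][k] = b[i]
        x.append(_det3(ak) / d)
    return x


def triangulate(rays):
    """Point closest (in least squares) to all viewing rays.

    rays: iterable of (center, direction) pairs in world coordinates,
    one per camera that sees the ball (assumed input, see lead-in).
    With P = I - d d^T for each unit direction d, the normal equations
    are (sum P) x = sum (P c).
    """
    a = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for center, direction in rays:
        d = _normalize(direction)
        for i in range(3):
            for j in range(3):
                p = (1.0 if i == j else 0.0) - d[i] * d[j]
                a[i][j] += p
                b[i] += p * center[j]
    return _solve3(a, b)


def top_view(point):
    """Project a world point onto the ground plane (drop the height)."""
    return (point[0], point[1])


# Demo with made-up camera centers (metres) looking at a known ball position.
true_ball = (10.0, 20.0, 1.5)
camera_centers = [(0.0, 0.0, 10.0), (50.0, 0.0, 12.0), (25.0, -30.0, 15.0)]
demo_rays = [(c, [b - a for a, b in zip(c, true_ball)]) for c in camera_centers]
estimate = triangulate(demo_rays)
print(top_view(estimate))
```

With noisy detections the rays no longer meet in a single point, and the least-squares solution gives a compromise between them; applying `top_view` to the estimate for every frame yields a ground-plane trajectory of the kind the abstract analyzes for events.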

Notes

This work is supported by the National Key Research and Development Program of China (No. 2018YFC0116800).
The dataset is provided by the Union of European Football Associations.

Author information

Authors and Affiliations

Department of Electronic Engineering, Tsinghua University, Beijing, China
Kailai Zhang & Ji Wu
Application Research Center, Intel China Research Center, Beijing, China
Xiaofeng Tong & Yumeng Wang

Corresponding author

Correspondence to Kailai Zhang.

About this article

Cite this article

Zhang, K., Wu, J., Tong, X. et al. An automatic multi-camera-based event extraction system for real soccer videos. Pattern Anal Applic 23, 953–965 (2020). https://doi.org/10.1007/s10044-019-00830-2

Received: 26 August 2018
Accepted: 17 June 2019
Published: 26 June 2019
Issue Date: May 2020
DOI: https://doi.org/10.1007/s10044-019-00830-2
