Abstract
Video highlight detection, or wonderful clip localization, aims at automatically discovering interesting clips in untrimmed videos, which can be applied to a variety of scenarios in real world. With reference to its study, a video dataset of Wonderful Clips of Playing Basketball (WCPB) is developed in this work. The Segment-Convolutional Neural Network (S-CNN), a start-of-the-art model for temporal action localization, is adopted to localize wonderful clips and a two-stream S-CNN is designed which outperforms its former on WCPB. The WCPB dataset presents the specific meaning of wonderful clips and annotations in playing basketball and enables the measurement of performance and progress in other realistic scenarios.
This work was supported in part by National Natural Science Foundation of China under Grant 61622115 and Shanghai Engineering Research Center of Industrial Vision Perception & Intelligent Computing (17DZ2251600).
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Escorcia, V., Caba Heilbron, F., Niebles, J.C., Ghanem, B.: DAPs: deep action proposals for action understanding. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9907, pp. 768–784. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46487-9_47
Idrees, H., et al.: The THUMOS challenge on action recognition for videos “in the wild”. Comput. Vis. Image Underst. 155, 1–23 (2017)
Ji, S., Xu, W., Yang, M., Yu, K.: 3D convolutional neural networks for human action recognition. IEEE Trans. Pattern Anal. Mach. Intell. 35(1), 221–231 (2013)
Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: Proceedings of CVPR 2014, pp. 1725–1732, June 2014
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Proceedings of NIPS 2012, pp. 1097–1105, December 2012
Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: HMDB: a large video database for human motion recognition. In: Proceedings of ICCV 2011, pp. 2556–2563, November 2011
Lin, T., Zhao, X., Shou, Z.: Single shot temporal action detection. In: Proceedings of ACM MM 2017, pp. 988–996, October 2017
Niebles, J.C., Chen, C.-W., Fei-Fei, L.: Modeling temporal structure of decomposable motion segments for activity classification. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010. LNCS, vol. 6312, pp. 392–405. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-15552-9_29
Oliveira, G.L., Burgard, W., Brox, T.: Efficient deep models for monocular road segmentation. In: Proceedings of IROS 2016, pp. 4885–4891, October 2016
Rui, Y., Gupta, A., Acero, A.: Automatically extracting highlights for TV baseball programs. In: Proceedings of ACM MM 2000, pp. 105–115, October 2000
Shou, Z., Chan, J., Zareian, A., Miyazawa, K., Chang, S.F.: CDC: convolutional-de-convolutional networks for precise temporal action localization in untrimmed videos. In: Proceedings of CVPR 2017, pp. 1417–1426, July 2017
Shou, Z., Wang, D., Chang, S.F.: Temporal action localization in untrimmed videos via multi-stage CNNs. In: Proceedings of CVPR 2016, pp. 1049–1058, June 2016
Soomro, K., Roshan Zamir, A., Shah, M.: UCF101: a dataset of 101 human actions classes from videos in the wild, December 2012. CoRR abs/1212.0402
Tang, H., Kwatra, V., Sargin, M.E., Gargi, U.: Detecting highlights in sports videos: cricket as a test case. In: Proceedings of ICME 2011, pp. 1–6, July 2011
Tran, D., Bourdev, L., Fergus, R., Torresani, L., Paluri, M.: Learning spatiotemporal features with 3D convolutional networks. In: Proceedings of ICCV 2015, pp. 4489–4497, December 2015
Wang, J., Xu, C., Chng, E., Tian, Q.: Sports highlight detection from keyword sequences using HMM. In: Proceedings of ICME 2004, pp. 599–602, June 2004
Yao, T., Mei, T., Rui, Y.: Highlight detection with pairwise deep ranking for first-person video summarization. In: Proceedings of CVPR 2016. pp. 982–990, June 2016
Yow, D., Yeo, B., Yeung, M., Liu, B.: Analysis and presentation of soccer highlights from digital video. In: Proceedings of ACCV 1995, pp. 499–503, December 1995
Zhao, Y., Xiong, Y., Wang, L., Wu, Z., Tang, X., Lin, D.: Temporal action detection with structured segment networks. In: Proceedings of ICCV 2017, pp. 2933–2942, October 2017
Zolfaghari, M., Oliveira, G.L., Sedaghat, N., Brox, T.: Chained multi-stream networks exploiting pose, motion, and appearance for action classification and detection. In: Proceedings of ICCV 2017, pp. 2923–2932, October 2017
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Li, Q., Chen, L., Wang, H., Liu, X. (2020). Wonderful Clips of Playing Basketball: A Database for Localizing Wonderful Actions. In: Ro, Y., et al. MultiMedia Modeling. MMM 2020. Lecture Notes in Computer Science(), vol 11961. Springer, Cham. https://doi.org/10.1007/978-3-030-37731-1_36
Download citation
DOI: https://doi.org/10.1007/978-3-030-37731-1_36
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-37730-4
Online ISBN: 978-3-030-37731-1
eBook Packages: Computer ScienceComputer Science (R0)