Abstract
Recent advances in computer vision have delivered excellent performance on complex tasks such as behavior recognition. Accordingly, several studies have proposed datasets for behavior recognition, including sign language recognition. Researchers in many countries are working to recognize and interpret sign language automatically in order to facilitate communication with deaf people. However, no dataset yet targets recognition of the sign language used in Korea, and research on the topic is insufficient. Because sign language varies from country to country, a dataset for Korean sign language is valuable. This paper therefore proposes a dataset of videos of isolated signs from Korean sign language that can also be used for behavior recognition with deep learning. We present the Korean Sign Language (KSL) dataset, which consists of video clips of 77 Korean sign language words performed by 20 deaf people. Using this dataset, we train and evaluate deep learning networks that have recently achieved excellent performance on the behavior recognition task. We also confirm, through a deconvolution-based visualization method, that the deep learning network fully understands the characteristics of the dataset.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Yang, S., Jung, S., Kang, H., Kim, C. (2020). The Korean Sign Language Dataset for Action Recognition. In: Ro, Y., et al. MultiMedia Modeling. MMM 2020. Lecture Notes in Computer Science, vol 11961. Springer, Cham. https://doi.org/10.1007/978-3-030-37731-1_43
DOI: https://doi.org/10.1007/978-3-030-37731-1_43
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-37730-4
Online ISBN: 978-3-030-37731-1
eBook Packages: Computer Science, Computer Science (R0)