research-article

Deep Semantic Mapping for Heterogeneous Multimedia Transfer Learning Using Co-Occurrence Data

Published: 24 January 2019

Abstract

Transfer learning, which focuses on finding a favorable representation for instances of different domains based on auxiliary data, can mitigate the divergence between domains through knowledge transfer. Recently, a growing body of work on transfer learning has employed deep neural networks (DNNs) to learn more robust and higher-level feature representations that better handle cross-media disparities. However, only a few articles consider the correction and semantic matching between multi-layer heterogeneous domain networks. In this article, we propose a deep semantic mapping model for heterogeneous multimedia transfer learning (DHTL) using co-occurrence data. More specifically, we integrate the DNN with canonical correlation analysis (CCA) to derive a deep correlation subspace as the joint semantic representation for associating data across different domains. In the proposed DHTL, a multi-layer correlation matching network across domains is constructed, in which CCA is used to bridge each pair of domain-specific hidden layers. To train the network, a joint objective function is defined and the optimization procedure is presented. Once the deep semantic representation is obtained, the shared features of the source domain are transferred for task learning in the target domain. Extensive experiments on three multimedia recognition applications demonstrate that the proposed DHTL can effectively find deep semantic representations for heterogeneous domains and that it outperforms several existing state-of-the-art methods for deep transfer learning.
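To make the role of CCA more concrete, the following is a minimal sketch of plain linear CCA applied to a pair of co-occurring feature matrices, using NumPy. It is not the DHTL model from the article: the function name `canonical_correlations`, the regularization constant `reg`, and the synthetic "source" and "target" features are all assumptions introduced for illustration. In a deep variant, quantities of this kind would presumably be computed on paired hidden-layer activations rather than on raw features.

```python
import numpy as np


def canonical_correlations(X, Y, reg=1e-6):
    """Canonical correlations between paired views X (n x dx) and Y (n x dy).

    Hypothetical helper for illustration: plain linear CCA with a small
    ridge term `reg` added to each covariance for numerical stability.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)  # center each view
    Yc = Y - Y.mean(axis=0)
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)

    def inv_sqrt(S):
        # Inverse matrix square root of a symmetric positive definite matrix.
        w, V = np.linalg.eigh(S)
        return V @ np.diag(w ** -0.5) @ V.T

    # Singular values of the whitened cross-covariance are the
    # canonical correlations, returned in descending order.
    T = inv_sqrt(Sxx) @ Sxy @ inv_sqrt(Syy)
    return np.linalg.svd(T, compute_uv=False)


# Synthetic co-occurrence data (assumed): two views generated from the
# same 2-dimensional latent factor z, plus small independent noise.
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 2))
source_feats = z @ rng.normal(size=(2, 5)) + 0.1 * rng.normal(size=(500, 5))
target_feats = z @ rng.normal(size=(2, 4)) + 0.1 * rng.normal(size=(500, 4))

rho_paired = canonical_correlations(source_feats, target_feats)

# Baseline: an unrelated view of the same shape shares no latent factor.
unrelated = rng.normal(size=(500, 4))
rho_unrelated = canonical_correlations(source_feats, unrelated)

print("paired views:   ", np.round(rho_paired, 3))
print("unrelated views:", np.round(rho_unrelated, 3))
```

In this toy setting the leading canonical correlation of the paired views should be much larger than that of the unrelated views, which is the property a correlation-based objective exploits when aligning representations from two domains.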

        • Published in

          ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 15, Issue 1s
          Special Section on Deep Learning for Intelligent Multimedia Analytics and Special Section on Multi-Modal Understanding of Social, Affective and Subjective Attributes of Data
          January 2019
          265 pages
          ISSN: 1551-6857
          EISSN: 1551-6865
          DOI: 10.1145/3309769

          Copyright © 2019 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 24 January 2019
          • Accepted: 1 July 2018
          • Revised: 1 June 2018
          • Received: 1 October 2017

          Qualifiers

          • research-article
          • Research
          • Refereed
