
Information Fusion via Multimodal Hashing with Discriminant Canonical Correlation Maximization

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11663)

Abstract

In this paper, we introduce an effective information fusion method using multimodal hashing with discriminant canonical correlation maximization. As an efficient way to compute similarity between different inputs, multimodal hashing has attracted increasing attention in fast similarity search. The proposed approach not only maximizes the semantic similarity across different modalities by multimodal hashing, but also extracts discriminant representations, which maximize the within-class correlation and minimize the between-class correlation simultaneously for information fusion. Benefiting from the combination of cross-modal semantic similarity and the discriminant representation strategy, the proposed algorithm achieves improved performance. A prototype of the proposed method is implemented to demonstrate its performance on audio emotion recognition and cross-modal (text-image) fusion. Experimental results show that the proposed approach outperforms related methods in terms of accuracy.
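The combination described above — learning cross-modal projections that favor same-class correlation over between-class correlation, then binarizing the projected features into hash codes for fast Hamming-distance comparison — can be sketched roughly as follows. This is a minimal, hypothetical simplification (a discriminant-weighted CCA solved via SVD of the whitened cross-covariance, with sign-based binarization), not the paper's actual algorithm; the function names and the ±1 same-class/different-class pair weighting are illustrative assumptions.

```python
import numpy as np

def discriminant_cca(X, Y, labels, dim=2):
    """Hypothetical sketch: learn projections Wx, Wy that emphasize
    correlation between same-class cross-modal pairs (+1 weight) and
    penalize different-class pairs (-1 weight).
    X: (dx, n) and Y: (dy, n) hold one sample per column."""
    n = X.shape[1]
    X = X - X.mean(axis=1, keepdims=True)
    Y = Y - Y.mean(axis=1, keepdims=True)
    # Pair-weighting matrix: +1 for same-class pairs, -1 otherwise.
    L = np.where(labels[:, None] == labels[None, :], 1.0, -1.0)
    Cxy = X @ L @ Y.T / n                       # discriminant cross-covariance
    Cxx = X @ X.T / n + 1e-6 * np.eye(X.shape[0])  # regularized auto-covariances
    Cyy = Y @ Y.T / n + 1e-6 * np.eye(Y.shape[0])
    # Standard CCA recipe: SVD of the whitened cross-covariance.
    iLx = np.linalg.inv(np.linalg.cholesky(Cxx))
    iLy = np.linalg.inv(np.linalg.cholesky(Cyy))
    U, _, Vt = np.linalg.svd(iLx @ Cxy @ iLy.T)
    Wx = iLx.T @ U[:, :dim]
    Wy = iLy.T @ Vt[:dim].T
    return Wx, Wy

def hash_codes(W, X):
    """Binarize projected features into hash bits via their sign."""
    Z = W.T @ (X - X.mean(axis=1, keepdims=True))
    return (Z > 0).astype(np.uint8)
```

Once both modalities are mapped into the shared binary space, cross-modal similarity reduces to counting matching bits (Hamming similarity), which is what makes hashing-based fusion fast at retrieval time.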



Author information

Correspondence to Lei Gao.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Gao, L., Guan, L. (2019). Information Fusion via Multimodal Hashing with Discriminant Canonical Correlation Maximization. In: Karray, F., Campilho, A., Yu, A. (eds) Image Analysis and Recognition. ICIAR 2019. Lecture Notes in Computer Science, vol 11663. Springer, Cham. https://doi.org/10.1007/978-3-030-27272-2_7


  • DOI: https://doi.org/10.1007/978-3-030-27272-2_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-27271-5

  • Online ISBN: 978-3-030-27272-2

  • eBook Packages: Computer Science (R0)
