Integrating Domain Knowledge: Using Hierarchies to Improve Deep Classifiers

Brust, Clemens-Alexander; Denzler, Joachim

doi:10.1007/978-3-030-41404-7_1

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12046)

Included in the following conference series:

Asian Conference on Pattern Recognition

Abstract

One of the most prominent problems in machine learning in the age of deep learning is the availability of sufficiently large annotated datasets. For specific domains, e.g. animal species, a long-tail distribution means that some classes are observed and annotated insufficiently. Additional labels can be prohibitively expensive, e.g. because domain experts need to be involved. However, more information is available that, to the best of our knowledge, is not yet exploited accordingly.

In this paper, we propose to make use of preexisting class hierarchies like WordNet to integrate additional domain knowledge into classification. We encode the properties of such a class hierarchy into a probabilistic model. From there, we derive a novel label encoding and a corresponding loss function. On the ImageNet and NABirds datasets, our method offers relative improvements in accuracy over the baseline of $10.4\%$ and $9.6\%$, respectively. After less than a third of the training time, it already matches the baseline's fine-grained recognition performance. Both results show that our suggested method is efficient and effective.
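The abstract does not spell out the label encoding or the loss, so the following is only a minimal sketch of one way a hierarchy-aware encoding could look. It assumes that a label is encoded as the class plus all of its ancestors, and that a node contributes to the loss only when it is a root or one of its parents is positive. The toy hierarchy and the names `PARENTS`, `encode`, `loss_mask`, and `masked_bce` are hypothetical and are not taken from the paper.

```python
import math

# Hypothetical toy hierarchy (child -> parents); not taken from the paper.
PARENTS = {
    "animal": [],
    "bird": ["animal"],
    "dog": ["animal"],
    "sparrow": ["bird"],
    "eagle": ["bird"],
}
NODES = sorted(PARENTS)  # fixed order of the output units


def ancestors(node):
    """Return the node together with all of its ancestors."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def encode(label):
    """Multi-hot target: 1 for the label and every ancestor, else 0."""
    positive = ancestors(label)
    return [1.0 if n in positive else 0.0 for n in NODES]


def loss_mask(label):
    """Assumed rule: a node is trained if it is a root or a parent is positive."""
    positive = ancestors(label)
    return [
        1.0 if not PARENTS[n] or any(p in positive for p in PARENTS[n]) else 0.0
        for n in NODES
    ]


def masked_bce(probs, label, eps=1e-7):
    """Binary cross-entropy summed over the unmasked nodes only."""
    target, mask = encode(label), loss_mask(label)
    total = 0.0
    for p, t, m in zip(probs, target, mask):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= m * (t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total
```

With this toy hierarchy, the label `"dog"` leaves `"eagle"` and `"sparrow"` out of the loss, because their parent `"bird"` is not positive. The design intent of such a mask is that each output only has to model a decision conditioned on its parent being present.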

Author information

Authors and Affiliations

Computer Vision Group, Friedrich Schiller University Jena, Jena, Germany
Clemens-Alexander Brust & Joachim Denzler
Michael Stifel Center Jena, Jena, Germany
Joachim Denzler

Corresponding author

Correspondence to Clemens-Alexander Brust.

Editor information

Editors and Affiliations

University of Malaya, Kuala Lumpur, Malaysia
Shivakumara Palaiahnakote
Consiglio Nazionale delle Ricerche, ICAR, Naples, Italy
Gabriella Sanniti di Baja
Chinese Academy of Sciences, Beijing, China
Liang Wang
Auckland University of Technology, Auckland, New Zealand
Wei Qi Yan

About this paper

Cite this paper

Brust, C.-A., Denzler, J. (2020). Integrating Domain Knowledge: Using Hierarchies to Improve Deep Classifiers. In: Palaiahnakote, S., Sanniti di Baja, G., Wang, L., Yan, W. (eds) Pattern Recognition. ACPR 2019. Lecture Notes in Computer Science, vol 12046. Springer, Cham. https://doi.org/10.1007/978-3-030-41404-7_1

DOI: https://doi.org/10.1007/978-3-030-41404-7_1
Published: 23 February 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-41403-0
Online ISBN: 978-3-030-41404-7
eBook Packages: Computer Science; Computer Science (R0)