Learning Visual Dictionaries from Class-Specific Superpixel Segmentation

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11678)

Abstract

Visual dictionaries (Bag of Visual Words, BoVW) can be a very powerful technique for image description whenever only a small number of training images is available, making them an attractive alternative to deep learning techniques. Nevertheless, models for BoVW learning are usually unsupervised and rely on the same set of visual words for all images in the training set. We present a method that works with small supervised training sets. It first generates superpixels from multiple images of the same class, for interest point detection, and then builds one visual dictionary per class. We show that the detected interest points can be more relevant than those from traditional strategies (e.g., grid sampling) in the context of a given application: the classification of intestinal parasite images. The study uses three image datasets with a total of 15 different species of parasites, plus a highly diverse impurity class, which makes the problem difficult because it contains examples similar to all the remaining parasite classes.
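To make the pipeline described in the abstract concrete, the following is a minimal Python sketch of the general idea: detect interest points at superpixel centroids, pool descriptors per class, cluster them into one visual dictionary per class, and encode an image as concatenated per-class BoVW histograms. This is an illustrative approximation, not the authors' implementation; SLIC superpixels, mean/std color descriptors, k-means codebooks, and all function names and parameters below are assumptions chosen for brevity.

```python
# Illustrative sketch of class-specific BoVW dictionaries from superpixel-based
# interest points. NOT the authors' implementation: SLIC, mean/std color
# descriptors, and k-means codebooks are stand-ins chosen for brevity.
import numpy as np
from skimage.segmentation import slic
from sklearn.cluster import KMeans


def superpixel_interest_points(image, n_segments=50):
    """Return one interest point (row, col) per superpixel centroid."""
    labels = slic(image, n_segments=n_segments, compactness=10.0)
    points = []
    for lab in np.unique(labels):
        rows, cols = np.nonzero(labels == lab)
        points.append((int(rows.mean()), int(cols.mean())))
    return points


def describe(image, points, radius=8):
    """Toy 6-D descriptor: per-channel mean and std of a patch around each point."""
    h, w = image.shape[:2]
    descs = []
    for r, c in points:
        r0, r1 = max(0, r - radius), min(h, r + radius)
        c0, c1 = max(0, c - radius), min(w, c + radius)
        patch = image[r0:r1, c0:c1].reshape(-1, image.shape[2])
        descs.append(np.concatenate([patch.mean(0), patch.std(0)]))
    return np.asarray(descs)


def learn_class_dictionary(images_of_one_class, n_words=32):
    """Cluster descriptors pooled from all images of one class into visual words."""
    descs = np.vstack([describe(img, superpixel_interest_points(img))
                       for img in images_of_one_class])
    return KMeans(n_clusters=n_words, n_init=10, random_state=0).fit(descs)


def bovw_histogram(image, dictionary):
    """Encode an image as a normalized histogram of visual-word assignments."""
    descs = describe(image, superpixel_interest_points(image))
    words = dictionary.predict(descs)
    hist = np.bincount(words, minlength=dictionary.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy data: two "classes" of random RGB images standing in for parasite classes.
    train = {c: [rng.random((64, 64, 3)) for _ in range(3)] for c in ("class_a", "class_b")}
    dictionaries = {c: learn_class_dictionary(imgs) for c, imgs in train.items()}
    test_image = rng.random((64, 64, 3))
    # Concatenating per-class histograms yields the final image descriptor.
    feature = np.concatenate([bovw_histogram(test_image, d) for d in dictionaries.values()])
    print(feature.shape)  # (number of classes * n_words,)
```

In the paper itself, the superpixels are generated jointly from multiple training images of the same class before interest point detection, which is what makes the resulting visual words class-specific; the k-means codebook above is only a simple stand-in for that step.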

Acknowledgments

The authors thank FAPESP (grants 2014/12236-1 and 2017/03940-5) and CNPq (grant 303808/2018-7) for the financial support.

Author information

Corresponding author

Correspondence to César Castelo-Fernández.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Castelo-Fernández, C., Falcão, A.X. (2019). Learning Visual Dictionaries from Class-Specific Superpixel Segmentation. In: Vento, M., Percannella, G. (eds.) Computer Analysis of Images and Patterns. CAIP 2019. Lecture Notes in Computer Science, vol. 11678. Springer, Cham. https://doi.org/10.1007/978-3-030-29888-3_14

  • DOI: https://doi.org/10.1007/978-3-030-29888-3_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-29887-6

  • Online ISBN: 978-3-030-29888-3

  • eBook Packages: Computer Science, Computer Science (R0)
