Research article · DOI: 10.1145/2986035.2986042

Food Image Recognition Using Very Deep Convolutional Networks

Published: 16 October 2016

ABSTRACT

We evaluated the effectiveness of a deep-learning approach to food image classification based on the specifications of Google's image recognition architecture Inception. The architecture is a deep convolutional neural network (DCNN) with a depth of 54 layers. In this study, we fine-tuned this architecture to classify food images from three well-known food image datasets: ETH Food-101, UEC FOOD 100, and UEC FOOD 256. On these datasets we achieved top-1 accuracies of 88.28%, 81.45%, and 76.17%, respectively, and top-5 accuracies of 96.88%, 97.27%, and 92.58%. To the best of our knowledge, these results significantly improve on the best published results for the same datasets, while requiring less computational power, since the number of parameters and the computational complexity are much smaller than the competitors'. For this reason, even though the network is still rather large, the deep network based on this architecture appears to be at least closer to the requirements of mobile systems.
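The top-1 and top-5 figures reported above measure whether the true class is the model's single highest-scoring prediction, or among its five highest-scoring predictions. A minimal NumPy sketch of this metric (not the authors' evaluation code; the scores and labels below are illustrative):

```python
import numpy as np

def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes.

    scores: (n_samples, n_classes) array of per-class scores.
    labels: (n_samples,) array of true class indices.
    """
    # Indices of the k largest scores in each row (order within the k is irrelevant).
    top_k = np.argpartition(scores, -k, axis=1)[:, -k:]
    hits = (top_k == labels[:, None]).any(axis=1)
    return hits.mean()

# Toy example with 3 samples and 3 classes.
scores = np.array([[0.1, 0.5, 0.4],
                   [0.7, 0.2, 0.1],
                   [0.2, 0.3, 0.5]])
labels = np.array([1, 0, 1])
print(top_k_accuracy(scores, labels, 1))  # 2 of 3 samples correct at top-1
print(top_k_accuracy(scores, labels, 2))  # all 3 within the top 2
```

Top-1 accuracy is ordinary classification accuracy; top-5 is the standard ImageNet-style relaxation used when classes are visually similar, as many food categories are.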


Published in

MADiMa '16: Proceedings of the 2nd International Workshop on Multimedia Assisted Dietary Management
October 2016
102 pages
ISBN: 9781450345200
DOI: 10.1145/2986035

Copyright © 2016 ACM


            Publisher

            Association for Computing Machinery

            New York, NY, United States


Acceptance Rates

MADiMa '16 paper acceptance rate: 7 of 14 submissions, 50%. Overall acceptance rate: 16 of 24 submissions, 67%.
