ABSTRACT
We evaluated the effectiveness of a deep-learning approach to food image classification based on the specifications of Google's image recognition architecture Inception. The architecture is a deep convolutional neural network (DCNN) with a depth of 54 layers. In this study, we fine-tuned this architecture to classify food images from three well-known food image datasets: ETH Food-101, UEC FOOD 100, and UEC FOOD 256. On these datasets we achieved, respectively, top-1 accuracies of 88.28%, 81.45%, and 76.17%, and top-5 accuracies of 96.88%, 97.27%, and 92.58%. To the best of our knowledge, these results significantly improve on the best published results for the same datasets, while requiring less computational power, since the number of parameters and the computational complexity are much smaller than those of competing architectures. For this reason, although still rather large, a deep network based on this architecture appears at least closer to meeting the requirements of mobile systems.
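The top-1 and top-5 figures above follow the standard top-k accuracy metric: a prediction counts as correct if the true class is among the k highest-scoring classes. The following is a minimal, generic sketch of that metric (not the authors' evaluation code); the toy scores and labels are illustrative assumptions.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes.

    scores: list of per-class score vectors, one per sample
    labels: list of true class indices, one per sample
    """
    hits = 0
    for row, label in zip(scores, labels):
        # indices of the k largest scores, highest first
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label in top_k:
            hits += 1
    return hits / len(labels)


# Toy example: 3 samples, 4 classes (hypothetical softmax outputs)
scores = [
    [0.1, 0.7, 0.1, 0.1],  # true class 1: top-1 hit
    [0.4, 0.3, 0.2, 0.1],  # true class 1: top-1 miss, top-2 hit
    [0.2, 0.2, 0.5, 0.1],  # true class 3: miss at k=1 and k=2
]
labels = [1, 1, 3]

top1 = top_k_accuracy(scores, labels, 1)
top2 = top_k_accuracy(scores, labels, 2)
```

With 101 classes (ETH Food-101) the gap between top-1 and top-5 reflects how often the correct dish is a close runner-up rather than the single best guess.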
Index Terms
- Food Image Recognition Using Very Deep Convolutional Networks