Retrieving real world clothing images via multi-weight deep convolutional neural networks

Cluster Computing

Abstract

Clothing images are abundantly available on the Internet, especially on e-commerce platforms. Retrieving those images is important for commercial and social applications and has recently received tremendous attention from communities such as multimedia processing and computer vision. However, the large variations in the appearance and style of clothing, together with the large number of categories and attributes, make the problem challenging. Furthermore, for real-world images, the labels provided by shop retailers on webpages are largely erroneous or incomplete, and the imbalance among image categories hinders effective learning. To overcome these problems, in this paper we adopt a multi-task deep learning framework to learn the representation, and we propose multi-weight deep convolutional neural networks for imbalanced learning. The topology of this network contains two groups of layers: shared layers at the bottom and task-dependent layers at the top. Furthermore, category-relevant parameters are incorporated to regularize the backward gradients for categories. A mathematical proof shows the relationship of this approach to regulating the learning rates. Experiments demonstrate that our proposed joint framework and multi-weight neural networks can effectively learn robust representations and achieve better performance.
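The link between category-relevant gradient weights and learning rates can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything below is an assumption made for illustration: a softmax cross-entropy loss, logits treated directly as the trainable parameters, plain SGD, and hypothetical names (`category_weight`, `grad_logits`, `sgd_step`). Under those assumptions, scaling the backward gradient of a sample by its category weight yields the same update as scaling the learning rate for that sample.

```python
import math


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def grad_logits(logits, label, weight=1.0):
    """Gradient of weight * cross_entropy(softmax(logits), label) w.r.t. the logits."""
    probs = softmax(logits)
    return [
        weight * (p - (1.0 if k == label else 0.0))
        for k, p in enumerate(probs)
    ]


def sgd_step(params, grad, lr):
    """One plain SGD update."""
    return [p - lr * g for p, g in zip(params, grad)]


# Hypothetical per-category weights (assumption): the rare category 2
# receives a larger weight than the frequent categories 0 and 1.
category_weight = {0: 1.0, 1: 1.0, 2: 4.0}

logits = [0.5, -0.2, 0.1]  # toy parameters, one logit per category
label = 2                  # the sample belongs to the rare category
lr = 0.1                   # base learning rate

# (a) Scale the backward gradient by the category weight, keep the base rate.
weighted = sgd_step(
    logits, grad_logits(logits, label, category_weight[label]), lr
)

# (b) Keep the gradient unweighted, scale the learning rate instead.
rescaled = sgd_step(
    logits, grad_logits(logits, label), lr * category_weight[label]
)

print(weighted)
print(rescaled)
```

Both updates agree up to floating-point rounding, which is the sense in which a per-category gradient weight behaves like a per-category learning rate in this simplified setting. With momentum or adaptive optimizers the two are no longer interchangeable.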

Acknowledgements

This work was partially supported by the National Natural Science Foundation of China (Nos. 61273365, 61472046, and 61472048) and the Discipline Building Plan in 111 Base (No. B08004). The authors thank Prof. Chuan Shi at Beijing University of Posts and Telecommunications for reading the draft of this paper and for giving helpful comments. The authors would also like to thank the editor and the anonymous reviewers for useful comments and suggestions that allowed them to improve the final version of this paper.

Author information

Corresponding author

Correspondence to Ruifan Li.

About this article

Cite this article

Li, R., Feng, F., Ahmad, I. et al. Retrieving real world clothing images via multi-weight deep convolutional neural networks. Cluster Comput 22 (Suppl 3), 7123–7134 (2019). https://doi.org/10.1007/s10586-017-1052-8

  • DOI: https://doi.org/10.1007/s10586-017-1052-8
