DOI: 10.1145/3195970.3196009
Research article · Public Access

CMP-PIM: an energy-efficient comparator-based processing-in-memory neural network accelerator

Published: 24 June 2018

ABSTRACT

In this paper, an energy-efficient, high-speed comparator-based processing-in-memory accelerator (CMP-PIM) is proposed to efficiently execute a novel hardware-oriented comparator-based deep neural network called CMPNET. Inspired by the local binary pattern feature-extraction method combined with depthwise separable convolution, we first modify the conventional Convolutional Neural Network (CNN) algorithm by replacing the computationally intensive multiplications in convolution layers with simpler and more efficient comparison and addition operations. We then propose CMP-PIM, which employs parallel computational memory sub-arrays built on SOT-MRAM as its fundamental processing units. We compare the performance of the CMP-PIM accelerator on different data-sets with recent CNN accelerator designs. With comparable inference accuracy on the SVHN data-set, CMP-PIM achieves ∼94× and 3× better energy efficiency than CNN and Local Binary CNN (LBCNN) accelerators, respectively. It also achieves a 4.3× speed-up over the CNN baseline with an identical network configuration.
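The abstract only outlines how multiplication is replaced by comparison and addition; the NumPy sketch below is a rough illustration of that idea, not the paper's actual CMPNET layer. It replaces the multiplications of a depthwise stage with comparisons against the patch centre and sign-selected additions (in the spirit of local binary patterns), then applies a learned pointwise (1×1) stage, mirroring the depthwise separable structure mentioned above. All function names, shapes, and the fixed ±1 pattern are illustrative assumptions.

```python
# Minimal sketch of a comparator-based "convolution" inspired by the abstract:
# the depthwise stage uses only comparisons and additions (no multiplies),
# followed by a learned pointwise (1x1) stage. Names are hypothetical.
import numpy as np

def comparator_depthwise(x, signs, k=3):
    """Slide a k x k window over each channel of x (C, H, W).

    Each neighbour is compared with the window centre (comparator -> 0/1),
    and the bits are accumulated under a fixed +/-1 pattern `signs`
    of shape (C, k, k), i.e. selected for addition or subtraction only.
    """
    C, H, W = x.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)), mode="edge")
    out = np.zeros_like(x, dtype=np.float32)
    for i in range(H):
        for j in range(W):
            patch = xp[:, i:i + k, j:j + k]                 # (C, k, k)
            centre = patch[:, pad, pad][:, None, None]
            bits = (patch >= centre).astype(np.float32)     # comparator output
            # +/-1 pattern only chooses add or subtract, so no multiplication
            out[:, i, j] = np.where(signs > 0, bits, -bits).sum(axis=(1, 2))
    return out

def pointwise(x, w):
    """Learned 1x1 convolution mixing channels: (C_out, C_in) applied to (C_in, H, W)."""
    return np.tensordot(w, x, axes=([1], [0]))

# Toy usage with random data and a random, fixed +/-1 comparator pattern.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16, 16)).astype(np.float32)    # 8-channel input
signs = rng.choice([-1.0, 1.0], size=(8, 3, 3))             # fixed, not learned
w_pw = rng.standard_normal((16, 8)).astype(np.float32)      # learned 1x1 weights
y = pointwise(comparator_depthwise(x, signs), w_pw)
print(y.shape)                                               # (16, 16, 16)
```

The point of the sketch is only the data flow: every multiply in the depthwise stage is gone, replaced by comparisons and add/subtract selection, which is the kind of operation the abstract says CMP-PIM executes inside parallel SOT-MRAM computational sub-arrays.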

References

1. L. Cavigelli et al., "Accelerating real-time embedded scene labeling with convolutional networks," in 52nd ACM/IEEE Design Automation Conference (DAC). IEEE, 2015, pp. 1--6.
2. R. Andri et al., "YodaNN: An architecture for ultra-low power binary-weight CNN acceleration," IEEE TCAD, 2017.
3. S. Zhou et al., "DoReFa-Net: Training low bitwidth convolutional neural networks with low bitwidth gradients," arXiv preprint arXiv:1606.06160, 2016.
4. M. Rastegari et al., "XNOR-Net: ImageNet classification using binary convolutional neural networks," in European Conference on Computer Vision. Springer, 2016, pp. 525--542.
5. T. P. Weldon et al., "Efficient Gabor filter design for texture segmentation," Pattern Recognition, vol. 29, no. 12, pp. 2005--2015, 1996.
6. T. Ahonen et al., "Face description with local binary patterns: Application to face recognition," IEEE TPAMI, vol. 28, pp. 2037--2041, 2006.
7. F. Juefei-Xu et al., "Local binary convolutional neural networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 19--28.
8. S. S. Sarwar et al., "Gabor filter assisted energy efficient fast learning convolutional neural networks," arXiv preprint arXiv:1705.04748, 2017.
9. T. Tang et al., "Binary convolutional neural network on RRAM," in 22nd ASP-DAC. IEEE, 2017, pp. 782--787.
10. S. Li et al., "Pinatubo: A processing-in-memory architecture for bulk bitwise operations in emerging non-volatile memories," in 53rd ACM/IEEE Design Automation Conference (DAC). IEEE, 2016.
11. P. Chi et al., "PRIME: A novel processing-in-memory architecture for neural network computation in ReRAM-based main memory," in ISCA. IEEE Press, 2016.
12. Y. Kim et al., "Write-optimized reliable design of STT MRAM," in Proceedings of the 2012 ACM/IEEE ISLPED. ACM, 2012.
13. G. Prenat et al., "Beyond STT-MRAM, spin orbit torque RAM SOT-MRAM for high speed and high reliability applications," in Spintronics-based Computing. Springer, 2015.
14. L. Sifre and S. Mallat, "Rigid-motion scattering for image classification," Ph.D. dissertation, 2014.
15. A. G. Howard et al., "MobileNets: Efficient convolutional neural networks for mobile vision applications," arXiv preprint arXiv:1704.04861, 2017.
16. F. Chollet, "Xception: Deep learning with depthwise separable convolutions," arXiv preprint arXiv:1610.02357, 2016.
17. K. He et al., "Deep residual learning for image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770--778.
18. C.-F. Pai et al., "Spin transfer torque devices utilizing the giant spin Hall effect of tungsten," Applied Physics Letters, 2012.
19. S. Aga et al., "Compute caches," in 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA). IEEE, 2017, pp. 481--492.
20. S. Jeloka et al., "A 28 nm configurable memory (TCAM/BCAM/SRAM) using push-rule 6T bit cell enabling logic-in-memory," IEEE Journal of Solid-State Circuits, vol. 51, no. 4, pp. 1009--1021, 2016.
21. R. Collobert et al., "Torch7: A MATLAB-like environment for machine learning," in BigLearn, NIPS Workshop, 2011.
22. K. Simonyan et al., "Very deep convolutional networks for large-scale image recognition," arXiv preprint arXiv:1409.1556, 2014.
23. C. Szegedy et al., "Going deeper with convolutions," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 1--9.
24. NCSU EDA FreePDK45, 2011. [Online]. Available: http://www.eda.ncsu.edu/wiki/FreePDK45:Contents
25. Z. He et al., "High performance and energy-efficient in-memory computing architecture based on SOT-MRAM," in NANOARCH. IEEE, 2017, pp. 97--102.
26. X. Dong et al., "NVSim: A circuit-level performance, energy, and area model for emerging non-volatile memory," in Emerging Memory Technologies. Springer, 2014, pp. 15--50.

Published in

DAC '18: Proceedings of the 55th Annual Design Automation Conference
June 2018, 1089 pages
ISBN: 9781450357005
DOI: 10.1145/3195970

Copyright © 2018 ACM

Publisher

Association for Computing Machinery, New York, NY, United States




Acceptance Rates

Overall Acceptance Rate: 1,770 of 5,499 submissions, 32%
