DOI: 10.1145/3508352.3549478

research-article

NASA: Neural Architecture Search and Acceleration for Hardware Inspired Hybrid Networks

Published: 22 December 2022

ABSTRACT

Multiplication is arguably the most cost-dominant operation in modern deep neural networks (DNNs), limiting their achievable efficiency and thus their wider deployment in resource-constrained applications. To tackle this limitation, pioneering works have developed handcrafted multiplication-free DNNs, which require expert knowledge and time-consuming manual iteration, calling for fast development tools. To this end, we propose NASA, a Neural Architecture Search and Acceleration framework that enables automated development of multiplication-reduced DNNs and integrates a dedicated multiplication-reduced accelerator to boost their achievable efficiency. Specifically, NASA adopts neural architecture search (NAS) spaces that augment state-of-the-art ones with hardware-inspired multiplication-free operators such as shift and adder, and pairs them with a novel progressive pretrain strategy (PGP) and customized training recipes to automatically search for optimal multiplication-reduced DNNs. On top of that, NASA develops a dedicated accelerator featuring a chunk-based template and an auto-mapper tailored to the DNNs delivered by NASA's search engine, better leveraging their algorithmic properties to boost hardware efficiency. Experimental results and ablation studies consistently validate the advantages of NASA's algorithm-hardware co-design framework in terms of achievable accuracy-efficiency tradeoffs. Code is available at https://github.com/shihuihong214/NASA.
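
To make the abstract's key ingredients concrete, below is a minimal PyTorch sketch of the two hardware-inspired multiplication-free operators it names (shift and adder) and of a differentiable, Gumbel-softmax-relaxed choice among candidate operators, as commonly used in differentiable NAS. The operator formulations follow the published DeepShift and AdderNet designs; ShiftConv2d, AdderConv2d, HybridBlock, and all hyperparameters are illustrative assumptions, not NASA's released implementation (see the linked repository for that).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ShiftConv2d(nn.Module):
    """Convolution whose weights are rounded to signed powers of two, so each
    multiplication reduces to a bit-shift (plus a sign flip) in hardware,
    DeepShift-style. Inference-only sketch: training the rounded weights
    would additionally need a straight-through estimator."""

    def __init__(self, in_ch, out_ch, k, stride=1, padding=0):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, k, k) * 0.1)
        self.stride, self.padding = stride, padding

    def forward(self, x):
        sign = torch.sign(self.weight)
        # Round |w| to the nearest power of two: w ~ sign * 2^p.
        p = torch.round(torch.log2(self.weight.abs().clamp_min(1e-8)))
        w_shift = sign * torch.pow(2.0, p)
        return F.conv2d(x, w_shift, stride=self.stride, padding=self.padding)


class AdderConv2d(nn.Module):
    """AdderNet-style convolution: replaces each inner product with a negative
    l1 distance, so the operator needs only additions and subtractions."""

    def __init__(self, in_ch, out_ch, k, stride=1, padding=0):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, k, k) * 0.1)
        self.k, self.stride, self.padding = k, stride, padding

    def forward(self, x):
        n, _, h, w = x.shape
        # Sliding patches: (N, C*k*k, L), one column per output position.
        cols = F.unfold(x, self.k, stride=self.stride, padding=self.padding)
        wf = self.weight.view(self.weight.size(0), -1)  # (out_ch, C*k*k)
        # Y(o, l) = -sum_i |cols(i, l) - wf(o, i)|  (memory-heavy but clear).
        out = -(cols.unsqueeze(1) - wf[None, :, :, None]).abs().sum(dim=2)
        h_out = (h + 2 * self.padding - self.k) // self.stride + 1
        w_out = (w + 2 * self.padding - self.k) // self.stride + 1
        return out.view(n, -1, h_out, w_out)


class HybridBlock(nn.Module):
    """Toy supernet block for a hybrid search space: a differentiable,
    Gumbel-softmax-relaxed choice among conv, shift, and adder candidates."""

    def __init__(self, ch, k=3):
        super().__init__()
        pad = k // 2
        self.ops = nn.ModuleList([
            nn.Conv2d(ch, ch, k, padding=pad),
            ShiftConv2d(ch, ch, k, padding=pad),
            AdderConv2d(ch, ch, k, padding=pad),
        ])
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))  # arch logits

    def forward(self, x, tau=1.0):
        gates = F.gumbel_softmax(self.alpha, tau=tau)  # soft one-hot over ops
        return sum(g * op(x) for g, op in zip(gates, self.ops))


if __name__ == "__main__":
    x = torch.randn(2, 8, 16, 16)
    y = HybridBlock(ch=8)(x)
    print(y.shape)  # torch.Size([2, 8, 16, 16])
```

In a real supernet the architecture logits would be trained jointly with the operator weights, and the highest-scoring operator per block kept at derivation time; the sketch only shows the forward relaxation.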


Published in

ICCAD '22: Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design
October 2022, 1467 pages
ISBN: 9781450392174
DOI: 10.1145/3508352

          Copyright © 2022 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 457 of 1,762 submissions, 26%

