ABSTRACT
Multiplication is arguably the most cost-dominant operation in modern deep neural networks (DNNs), limiting their achievable efficiency and thus their wider deployment in resource-constrained applications. To tackle this limitation, pioneering works have developed handcrafted multiplication-free DNNs, which require expert knowledge and time-consuming manual iteration, calling for fast development tools. To this end, we propose a Neural Architecture Search and Acceleration framework dubbed NASA, which enables automated development of multiplication-reduced DNNs and integrates a dedicated multiplication-reduced accelerator to boost their achievable efficiency. Specifically, NASA adopts neural architecture search (NAS) spaces that augment a state-of-the-art space with hardware-inspired multiplication-free operators, such as shift and adder, armed with a novel progressive pretrain strategy (PGP) together with customized training recipes, to automatically search for optimal multiplication-reduced DNNs. On top of that, NASA develops a dedicated accelerator, which advocates a chunk-based template and an auto-mapper tailored to the DNNs delivered by NASA's search engine, to better leverage their algorithmic properties for boosting hardware efficiency. Experimental results and ablation studies consistently validate the advantages of NASA's algorithm-hardware co-design framework in terms of achievable accuracy and efficiency tradeoffs. Codes are available at https://github.com/shihuihong214/NASA.
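To illustrate the kind of multiplication-free operators the search space draws on, the sketch below emulates, in plain Python, the two arithmetic primitives named in the abstract: a shift operator, where each weight is constrained to a signed power of two so every multiply reduces to a bit-shift, and an AdderNet-style adder operator, which replaces the inner product with a negative L1 distance computed from additions and subtractions only. This is a minimal pedagogical sketch, not the paper's implementation; the function names `shift_mac` and `adder_mac` are our own.

```python
def shift_mac(x, exps, signs):
    """Multiply-accumulate with shift-type weights.

    Each weight is a signed power of two (sign * 2**exp), so on hardware
    the per-element multiply becomes a bit-shift plus a sign flip.
    Here the shift is emulated with 2 ** e for clarity.
    """
    return sum(s * xi * (2 ** e) for xi, e, s in zip(x, exps, signs))


def adder_mac(x, w):
    """AdderNet-style multiply-free accumulate.

    The inner product is replaced by the negative L1 distance between
    inputs and weights, which needs only additions and subtractions.
    """
    return -sum(abs(xi - wi) for xi, wi in zip(x, w))
```

A conventional convolution kernel would apply one of these per output position; a hybrid network mixes them with ordinary multiplicative layers, which is precisely the operator-assignment choice that NASA's search automates.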
NASA: Neural Architecture Search and Acceleration for Hardware Inspired Hybrid Networks