
Accuracy vs. Efficiency: Achieving Both through FPGA-Implementation Aware Neural Architecture Search

Published: 02 June 2019

ABSTRACT

A fundamental question lies in almost every application of deep neural networks: what is the optimal neural architecture for a given data set? Recently, several Neural Architecture Search (NAS) frameworks have been developed that use reinforcement learning or evolutionary algorithms to search for a solution. However, most of them take a long time to find the optimal architecture due to the huge search space and the lengthy training needed to evaluate each candidate. In addition, most of them optimize for accuracy only and do not take into consideration the hardware that will be used to implement the architecture. This can lead to latencies far beyond specification, rendering the resulting architectures useless. To address both issues, in this paper we use Field Programmable Gate Arrays (FPGAs) as a vehicle to present a novel hardware-aware NAS framework, namely FNAS, which provides an optimal neural architecture whose latency is guaranteed to meet the specification. In addition, with a performance abstraction model that analyzes the latency of neural architectures without training them, our framework can quickly prune architectures that do not satisfy the specification, leading to higher search efficiency. Experimental results on a common data set, ImageNet, show that in cases where the state of the art generates architectures with latencies 7.81× longer than the specification, the architectures from FNAS meet the specification with less than 1% accuracy loss. Moreover, FNAS achieves up to 11.13× speedup for the search process. To the best of the authors' knowledge, this is the first hardware-aware NAS.
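To make the pruning mechanism concrete, the sketch below illustrates the general idea under stated assumptions. It is not the authors' FNAS code: the random sampler stands in for their reinforcement-learning controller, and the latency model (a crude MAC-count-over-DSP-throughput estimate with made-up LATENCY_SPEC_MS, dsp_count, and clock_mhz values) stands in for the paper's performance abstraction model. The structural point it shows is that a candidate whose estimated latency exceeds the specification is rejected before any training is spent on it.

    # Minimal sketch of specification-driven pruning in a hardware-aware NAS loop.
    # Hypothetical stand-in, not the authors' FNAS implementation: the sampler,
    # latency model, and all parameter values below are illustrative assumptions.
    import random
    from dataclasses import dataclass

    LATENCY_SPEC_MS = 10.0  # assumed latency specification

    @dataclass
    class Layer:
        out_channels: int
        kernel_size: int

    def sample_architecture(num_layers=4):
        # Random sampling stands in for the RL controller that proposes candidates.
        return [Layer(random.choice([16, 32, 64]), random.choice([1, 3, 5]))
                for _ in range(num_layers)]

    def estimated_latency_ms(arch, in_channels=3, feature_size=32,
                             dsp_count=220, clock_mhz=100.0):
        # Crude analytical model: total MAC operations divided by DSP throughput,
        # assuming one MAC per DSP per cycle. The paper's performance abstraction
        # model is more detailed; only the no-training property matters here.
        total_macs, c_in = 0, in_channels
        for layer in arch:
            total_macs += (feature_size ** 2 * layer.kernel_size ** 2
                           * c_in * layer.out_channels)
            c_in = layer.out_channels
        cycles = total_macs / dsp_count
        return cycles / (clock_mhz * 1e6) * 1e3

    def search(budget=100):
        kept, pruned = [], 0
        for _ in range(budget):
            arch = sample_architecture()
            if estimated_latency_ms(arch) > LATENCY_SPEC_MS:
                pruned += 1        # rejected before any training is spent on it
                continue
            kept.append(arch)      # a full framework would now train and score it
        print(f"kept {len(kept)} candidates; pruned {pruned} without training")

    if __name__ == "__main__":
        search()

Because an analytical estimate like this costs microseconds while training a candidate takes hours, pruning at this stage is where a hardware-aware search recovers most of its efficiency.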


Published in

DAC '19: Proceedings of the 56th Annual Design Automation Conference 2019
June 2019
1378 pages
ISBN: 9781450367257
DOI: 10.1145/3316781

Copyright © 2019 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States



    Acceptance Rates

Overall Acceptance Rate: 1,770 of 5,499 submissions, 32%

