ABSTRACT
In this paper, we present our state-of-the-art approximation techniques, which cover the main pillars of approximate computing research. Our analysis considers both static and reconfigurable approximation techniques, as well as operation-specific approximate components (e.g., multipliers) and generalized approximate high-level synthesis approaches. As our application target, we discuss the improvements that such techniques bring to machine learning and neural networks. In addition to the conventionally analyzed performance and energy gains, we also evaluate the improvements that approximate computing brings to the operating temperature.
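To make the notion of an operation-specific approximate component concrete, the following is a minimal software sketch of one well-known approximate multiplier, Mitchell's logarithmic multiplication. It is an illustration only and is not claimed to be the design evaluated in this paper; the function name `mitchell_multiply` and the restriction to non-negative integers are assumptions made for the example.

```python
def mitchell_multiply(a: int, b: int) -> int:
    """Approximate a * b for non-negative integers (illustrative sketch).

    Writes each operand as 2^k * (1 + x) with 0 <= x < 1, approximates
    log2(1 + x) by x, adds the two approximate logarithms, and converts
    the sum back with the same linear approximation.
    """
    if a < 0 or b < 0:
        raise ValueError("this sketch only handles non-negative operands")
    if a == 0 or b == 0:
        return 0

    # Position of the leading one: the integer part of log2.
    k1 = a.bit_length() - 1
    k2 = b.bit_length() - 1

    # Bits below the leading one: the fractional parts x1 and x2,
    # kept as integers (x1 = f1 / 2^k1, x2 = f2 / 2^k2).
    f1 = a - (1 << k1)
    f2 = b - (1 << k2)

    # (x1 + x2) expressed on the common scale 2^(k1 + k2).
    scale = 1 << (k1 + k2)
    frac_sum = (f1 << k2) + (f2 << k1)

    if frac_sum < scale:
        # x1 + x2 < 1: result is 2^(k1+k2) * (1 + x1 + x2)
        return scale + frac_sum
    # x1 + x2 >= 1: result is 2^(k1+k2+1) * (x1 + x2)
    return 2 * frac_sum


if __name__ == "__main__":
    for a, b in [(4, 8), (5, 6), (3, 3)]:
        print(a, b, a * b, mitchell_multiply(a, b))
```

The approximation replaces the partial-product array with shifts and a single addition, which is where the hardware savings come from. In exchange, the result never exceeds the exact product and can underestimate it by up to roughly 11% (for example, 3 × 3 yields 8).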
Index Terms
- Approximate Computing for ML: State-of-the-art, Challenges and Visions