DOI: 10.1145/3061639.3062337
research-article
Public Access

Ultra-Efficient Processing In-Memory for Data Intensive Applications

Published: 18 June 2017

ABSTRACT

Recent years have witnessed rapid growth in the Internet of Things (IoT). This network of billions of devices generates and exchanges huge amounts of data. Limited cache capacity and memory bandwidth make transferring and processing such data on traditional CPUs and GPUs highly inefficient, in terms of both energy consumption and delay. However, many IoT applications are statistical at heart and can tolerate some inaccuracy in their computation. This allows designers to reduce processing complexity by approximating results to a desired accuracy. In this paper, we propose an ultra-efficient approximate processing in-memory architecture, called APIM, which exploits the analog characteristics of non-volatile memories to support addition and multiplication inside the crossbar memory, while storing the data. The proposed design eliminates the overhead of transferring data to the processor by virtually bringing the processor inside the memory. APIM dynamically configures the precision of computation for each application in order to tune the level of accuracy at runtime. Our experimental evaluation running six general OpenCL applications shows that the proposed design achieves up to 20x performance improvement and 480x improvement in energy-delay product while ensuring acceptable quality of service. In exact mode, it achieves 28x energy savings and 4.8x speedup compared to state-of-the-art GPU cores.
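As a rough illustration of the two ideas in the abstract, the Python sketch below shows (a) how lowering operand precision trades accuracy for simpler arithmetic and (b) how an energy saving and a speedup combine into an energy-delay product (EDP) improvement. The abstract does not describe APIM's actual in-memory circuits, so the operand truncation used here is a generic approximation technique chosen for illustration only, and the function names `approx_multiply`, `relative_error`, and `edp_improvement` are hypothetical.

```python
def approx_multiply(a: int, b: int, dropped_bits: int) -> int:
    """Multiply two non-negative integers after zeroing their low-order bits.

    Generic operand truncation, assumed here for illustration; it is NOT
    taken from the paper. dropped_bits = 0 gives the exact product.
    """
    mask = ~((1 << dropped_bits) - 1)  # clears the lowest `dropped_bits` bits
    return (a & mask) * (b & mask)


def relative_error(exact: int, approx: int) -> float:
    """Relative error of an approximate result against a non-zero exact one."""
    return abs(exact - approx) / abs(exact)


def edp_improvement(energy_saving: float, speedup: float) -> float:
    """EDP is energy times delay, so the two improvement factors multiply."""
    return energy_saving * speedup


if __name__ == "__main__":
    exact = approx_multiply(200, 100, 0)
    for bits in (0, 2, 4):
        approx = approx_multiply(200, 100, bits)
        print(bits, approx, round(relative_error(exact, approx), 4))

    # Exact-mode figures quoted in the abstract: 28x energy, 4.8x speedup.
    print(round(edp_improvement(28, 4.8), 1))
```

Dropping more bits lowers the cost of each operation in a real design but raises the error, which is the kind of knob a runtime precision setting would turn. Under the same multiplication rule, the exact-mode figures in the abstract correspond to an EDP improvement of roughly 134x.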

      • Published in

        DAC '17: Proceedings of the 54th Annual Design Automation Conference 2017
        June 2017
        533 pages
        ISBN: 9781450349277
        DOI: 10.1145/3061639

        Copyright © 2017 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Qualifiers

        • research-article
        • Research
        • Refereed limited

        Acceptance Rates

        Overall Acceptance Rate: 1,770 of 5,499 submissions, 32%
