ABSTRACT
This paper introduces Bitwise, a compiler that minimizes the bitwidth (the number of bits used to represent each operand) for both integers and pointers in a program. By propagating static information both forward and backward in the program dataflow graph, Bitwise frees the programmer from declaring bitwidth invariants in cases where the compiler can determine bitwidths automatically. Because loop instructions comprise the bulk of dynamically executed instructions, Bitwise incorporates sophisticated loop analysis techniques for identifying bitwidths. We find a rich opportunity for bitwidth reduction in modern multimedia and streaming application workloads. For new architectures that support sub-word data types, we expect that our bitwidth reductions will save power and increase processor performance.
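As a rough illustration of forward and backward propagation, the sketch below tracks a closed integer interval per value and derives a bitwidth from it. It is a minimal sketch and not Bitwise's actual implementation: the names `Range`, `forward_add`, and `backward_index` are hypothetical, and the backward rule assumes that every array access in the program is in bounds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """Closed integer interval [lo, hi] that a value may take (hypothetical helper)."""
    lo: int
    hi: int

    def bits(self) -> int:
        """Minimum number of bits needed to represent every value in the interval."""
        if self.lo >= 0:
            # Unsigned encoding: enough bits for the largest value, at least one.
            return max(1, self.hi.bit_length())
        # Two's complement encoding: magnitude bits plus one sign bit.
        return max(self.hi.bit_length(), (-self.lo - 1).bit_length()) + 1


def forward_add(a: Range, b: Range) -> Range:
    """Forward rule: the range of a + b follows from the ranges of its operands."""
    return Range(a.lo + b.lo, a.hi + b.hi)


def backward_index(index: Range, array_len: int) -> Range:
    """Backward rule: a use as an array index narrows the index's own range.

    Assumption: the program never indexes out of bounds.
    """
    return Range(max(index.lo, 0), min(index.hi, array_len - 1))


# A variable declared as a 32-bit signed integer starts with the full range.
i = Range(-2**31, 2**31 - 1)
declared_bits = i.bits()

# Backward: `a[i]` with a 100-element array implies 0 <= i <= 99.
i = backward_index(i, 100)

# Forward: `j = i + 1` inherits a narrow range from the narrowed `i`.
j = forward_add(i, Range(1, 1))

print(declared_bits, i.bits(), j.bits())
```

In this sketch, a single array access is enough to shrink the index from its declared width to a handful of bits, and the narrowed range then flows forward into values computed from it.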
This paper also applies our analysis to silicon compilation, the translation of programs into custom hardware, to realize the full benefits of bitwidth reduction. We describe our integration of Bitwise with the DeepC Silicon Compiler. By taking advantage of bitwidth information during architectural synthesis, we reduce silicon real estate by 15-86%, improve clock speed by 3-249%, and reduce power by 46-73%. The next era of general-purpose and reconfigurable architectures should strive to capture a portion of these gains.
Index Terms
- Bitwidth analysis with application to silicon compilation