research-article

Optimizing techniques for saturated arithmetic with first-order linear recurrence

Authors:
Weihua Zhang, Lili Liu, Chen Zhang, Hongjiong Zhang, Binyu Zang, Chuanqi Zhu

Fudan University, Shanghai, China

SAC '09: Proceedings of the 2009 ACM symposium on Applied Computing, March 2009, Pages 1883–1889. https://doi.org/10.1145/1529282.1529704

Published: 08 March 2009

ABSTRACT

Saturated arithmetic is a typical operation in multimedia applications, and most multimedia extensions in the instruction set architecture (ISA) of modern processors provide saturation instructions for it. Extensive research has therefore focused on how to use saturation instructions to optimize programs. Previous algorithms mainly target purely saturated arithmetic; however, saturated arithmetic is often mingled with first-order linear recurrence (FOLR) in real-life applications. When an FOLR pattern appears in a program, previous algorithms fail to identify the saturated arithmetic.

In fact, saturated arithmetic with FOLR (SAWF) is a new and significant pattern; in particular, SAWF with a coefficient of one is frequently used in multimedia applications. A method for vectorizing this pattern efficiently is therefore needed. This paper discusses how to vectorize SAWF, explores an efficient method for vectorizing SAWF with a coefficient of one, evaluates that method, and implements the optimizing technique as a library. Implementing it as a library makes it easier for compilers to exploit. Experimental results show that the optimizing technique achieves a speedup of 1.19 to 1.46 on a Pentium IV processor. The optimizing techniques in this paper can also be used to develop a library for SAWF, so that programmers can benefit even without changing the compiler.
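To make the pattern concrete, here is a minimal scalar sketch of what a SAWF loop with a coefficient of one might look like. The abstract does not give the exact form, so the recurrence x[i] = sat(x[i-1] + b[i]), the signed 16-bit element type, and the names `sat_add_i16` and `sawf_coeff_one` are all assumptions made for illustration; this is a reference loop, not the vectorization method of the paper.

```c
#include <stddef.h>
#include <stdint.h>

/* Saturating signed 16-bit add: the sum is computed in 32 bits and
 * clamped to [INT16_MIN, INT16_MAX] instead of wrapping around. */
static int16_t sat_add_i16(int16_t a, int16_t b) {
    int32_t s = (int32_t)a + (int32_t)b;
    if (s > INT16_MAX) return INT16_MAX;
    if (s < INT16_MIN) return INT16_MIN;
    return (int16_t)s;
}

/* Assumed SAWF form with coefficient one (hypothetical helper):
 *   x[i] = sat(x[i-1] + b[i]), with x[-1] taken to be x0.
 * Each iteration depends on the previous clamped result, which is the
 * first-order linear recurrence part of the pattern. */
static void sawf_coeff_one(int16_t *x, const int16_t *b, size_t n, int16_t x0) {
    int16_t prev = x0;
    for (size_t i = 0; i < n; ++i) {
        prev = sat_add_i16(prev, b[i]);
        x[i] = prev;
    }
}
```

Because every output feeds the next iteration after clamping, the loop cannot simply be rewritten as a clamped prefix sum: in general sat(sat(a + b) + c) differs from sat(a + b + c). This loop-carried dependence is what separates the pattern from purely saturated arithmetic.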

Index Terms

Optimizing techniques for saturated arithmetic with first-order linear recurrence
1. Software and its engineering
  1. Software notations and tools
    1. Compilers

Published in
SAC '09: Proceedings of the 2009 ACM symposium on Applied Computing
March 2009
2347 pages
ISBN:9781605581668
DOI:10.1145/1529282
Conference Chairs:
Sung Y. Shin, South Dakota State University, United States
Sascha Ossowski, University Rey Juan Carlos, Spain
Copyright © 2009 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
FOLR
SAWF
SIMD
optimization
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 1,650 of 6,669 submissions, 25%