Pipelining flat CORDIC based trigonometric function generators

doi:10.1016/S0026-2692(01)00107-0

Microelectronics Journal

Volume 33, Issues 1–2, 2 January 2002, Pages 77-89

https://doi.org/10.1016/S0026-2692(01)00107-0

Abstract

Despite refinements of the CORDIC algorithm such as redundant arithmetic and higher-radix techniques, its iterative nature remains the major bottleneck to further optimization of circuit latency and performance. A technique known as flat CORDIC, in which the conventional X and Y recurrences are successively substituted to express the final vectors in terms of the initial vectors, can be used to eliminate the iterative process. This paper presents techniques for the VLSI-efficient implementation of a pipelined 16-bit flat CORDIC based sine–cosine generator. Three possible schemes for pipelining the 16-bit flat CORDIC design are presented to demonstrate the suitability of the proposed method for high-throughput implementations. The 16-bit architecture has been synthesized with a 0.35 μm CMOS process library using Synopsys. Finally, a detailed comparison with other major contributions shows that flat CORDIC based sine–cosine generators are, on average, 30% faster and occupy some 30% less silicon area.

Section snippets

The Co-ordinate rotation digital computer (CORDIC)

CORDIC (Coordinate Rotation DIgital Computer) is an iterative algorithm for computing trigonometric, hyperbolic and transcendental functions in a computationally efficient manner. In CORDIC, the functions are computed by simple shift and add operations [1]. The CORDIC algorithm finds numerous applications where hardware acceleration is warranted. Digital signal processing, image processing, singular value decomposition, matrix triangularization, robot kinematics and neural networks are some of the areas,

The generalized flat CORDIC equation

In an N-bit flat CORDIC, the final values X_N and Y_N are expressed in terms of X_0 and Y_0 for the parallel implementation of the CORDIC algorithm. This is achieved by successive substitutions of X_i and Y_i in the basic CORDIC equations. Here, we will derive the final equations of the flat CORDIC from the basic CORDIC equations.

The CORDIC algorithm for the rotation mode in the circular co-ordinate system is characterized by the following basic equations:

$$X_{i+1} = X_{i} - s_{i} Y_{i} 2^{-i}$$

$$Y_{i+1} = Y_{i} + s_{i} X_{i} 2^{-i}$$

where, X_i and Y_i
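The two recurrences above can be checked numerically. The following Python sketch is not the paper's architecture. It assumes the standard angle recurrence z_{i+1} = z_i - s_i·atan(2^-i) to choose the signs s_i (the snippet above is cut off before it defines them), a start vector (X_0, Y_0) = (1, 0), and hypothetical function names. It first runs the iterative recurrences, then evaluates the same result in "flat" form, directly from X_0 and Y_0, by expanding the product of the factors (1 + j·s_i·2^-i) into sums of products of the shift terms.

```python
import math
from itertools import combinations

N = 16  # number of micro-rotations (assumed equal to the word length)

def cordic_gain(n=N):
    """Constant scale factor that undoes the growth of the micro-rotations."""
    return math.prod(1.0 / math.sqrt(1.0 + 2.0 ** (-2 * i)) for i in range(n))

def cordic_iterative(theta, n=N):
    """Rotation-mode CORDIC; returns (x_n, y_n, signs) before gain correction."""
    x, y, z = 1.0, 0.0, theta
    signs = []
    for i in range(n):
        s = 1 if z >= 0 else -1           # direction of the i-th micro-rotation
        x, y = x - s * y * 2.0 ** -i, y + s * x * 2.0 ** -i
        z -= s * math.atan(2.0 ** -i)     # standard angle recurrence (assumed)
        signs.append(s)
    return x, y, signs

def cordic_flat(signs, x0=1.0, y0=0.0):
    """Evaluate X_N and Y_N directly from X_0 and Y_0.

    With t_i = s_i * 2^-i, the product of (1 + j*t_i) equals the sum over k
    of j^k * e_k, where e_k is the sum of all k-fold products of distinct t_i.
    No intermediate X_i or Y_i is computed.
    """
    t = [s * 2.0 ** -i for i, s in enumerate(signs)]
    a = b = 0.0                            # real and imaginary parts of the product
    for k in range(len(t) + 1):
        e_k = sum(math.prod(c) for c in combinations(t, k))
        sign = -1.0 if (k // 2) % 2 else 1.0   # j^2 = -1 flips every second pair
        if k % 2 == 0:
            a += sign * e_k
        else:
            b += sign * e_k
    return x0 * a - y0 * b, x0 * b + y0 * a

if __name__ == "__main__":
    theta = 0.6
    x, y, signs = cordic_iterative(theta)
    fx, fy = cordic_flat(signs)
    k = cordic_gain()
    print("iterative:", k * x, k * y)
    print("flat     :", k * fx, k * fy)
    print("reference:", math.cos(theta), math.sin(theta))
```

For an input of 0.6 rad the iterative and flat results should agree to within floating-point rounding, and the gain-corrected values should match cos and sin to within about 1e-4. The subset expansion grows as 2^N terms, so it serves only as a check of the algebra, not as a model of how a hardware implementation would group or reduce those terms.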

Sine–cosine generation

The components of the basic flat CORDIC architecture, namely the pre-computation unit, the combiner unit and the adder arrays, are tailored to generate the sine and cosine values of the input angles. Implementation of the 16-bit sine–cosine generator is discussed first and this is shown in Fig. 4. Brief operational details of the pre-computation unit are as follows. The Split Decomposition Algorithm (SDA) is used for pre-computing the signed digits. More operational details of the SDA are

Pipelining the flat CORDIC architecture

Although the various modules of the flat CORDIC are implemented using combinatorial logic, they consist of full adder arrays along the predefined stages of the critical path. This makes a highly pipelined implementation feasible. In this section, the method of pipelining the 16-bit flat CORDIC architecture is presented.
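The effect of inserting registers between stages of a combinational datapath can be illustrated with a small cycle-level model. This Python sketch is generic and does not reproduce any of the paper's three pipelining schemes; the function name `simulate_pipeline` and the toy stage functions are hypothetical stand-ins for blocks of logic separated by pipeline registers.

```python
def simulate_pipeline(stages, inputs):
    """Clock a linear pipeline with one register after each stage.

    stages: list of one-argument functions (the combinational logic blocks).
    inputs: values fed in, one per clock cycle.
    Returns a list of (cycle, result) pairs in output order.
    """
    regs = [None] * len(stages)           # regs[k] holds the output of stage k
    pending = list(inputs)
    results = []
    cycle = 0
    while pending or any(r is not None for r in regs):
        cycle += 1
        new_in = pending.pop(0) if pending else None
        # Update from the last stage backwards so that each stage reads the
        # value its predecessor held before this clock edge.
        for k in reversed(range(len(stages))):
            src = new_in if k == 0 else regs[k - 1]
            regs[k] = stages[k](src) if src is not None else None
        if regs[-1] is not None:
            results.append((cycle, regs[-1]))
    return results

if __name__ == "__main__":
    # Three toy stages; in hardware each would be a slice of the adder logic.
    toy_stages = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3]
    for cycle, value in simulate_pipeline(toy_stages, [1, 2, 3]):
        print(f"cycle {cycle}: {value}")
```

With three stages, the first result appears after three clock cycles and one further result follows on every cycle after that. This is the usual pipelining trade: latency in cycles equals the number of stages, while throughput is set by the slowest single stage instead of the full combinational path.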

Performance comparisons

Six different architectures are considered along with the flat CORDIC architecture to facilitate a detailed comparison. These include both the unfolded and pipelined designs cited in the literature. A true comparison between different implementations is possible only if circuit level simulations are provided. Since these are not always available in the literature, a rough, first-order comparison based on the full adder count for hardware complexity and number of full adder

Conclusions

In this paper, the various equations of the flat CORDIC technique were defined prior to developing a generalized architecture for flat CORDIC based sine/cosine generation. The salient aspects of the flat CORDIC architecture were then elaborated with the help of a 16-bit flat CORDIC based sine/cosine generator. Three different methods to pipeline the flat CORDIC architecture have been presented to demonstrate that the flat CORDIC architecture lends itself well to pipelining due to its CSA tree based

References (7)

J.S. Walther, A unified algorithm for elementary functions, Proc. Spring Joint Comput. Conf. (1971)

N. Takagi et al., Redundant CORDIC methods with a constant scaling factor for Sine and Cosine computation, IEEE Trans. Comput. (1991)

D. Timmermann et al., Low latency time CORDIC algorithms, IEEE Trans. Comput. (1992)

There are more references available in the full text version of this article.
