# Design and Analysis of Digital Wave Generator using CORDIC Algorithm with Pipelining and Angle Recoding Technique

Navdeep Prashar<sup>1</sup>, Amandeep Singh<sup>2</sup> and Balwinder Singh<sup>3</sup>

<sup>1-2</sup>Department of VLSI Design, CDAC, Mohali, India nav.prashar@gmail.com, eramansingh@yahoo.com <sup>3</sup>ACS Division, CDAC, Mohali, India balwinder\_cdacmohali@yahoo.com

## ABSTRACT

The CORDIC algorithm is used to compute complex functions in signal processing and wireless communication applications. These functions are combinations of sine and cosine terms that are linked to complex analysis. Pipelined architectures are used in the CORDIC algorithm to shorten the critical path and thereby increase the clock speed. An angle recoding method is used to reduce the latency and reach the desired angle in the fewest iterations. In this paper, a hardware-efficient digital sine and cosine wave generator is designed and implemented using a pipelined CORDIC architecture. The generator is modeled and verified using Xilinx ISE 12.3. The results show a significant reduction in critical path and an increase in clock speed from pipelining, and the number of iterations is also reduced by the angle recoding method.

## **KEYWORDS**

CORDIC, Sine and Cosine generator, Pipelined Architecture, Original CORDIC, Micro-rotations.

## **1. INTRODUCTION**

CORDIC (Coordinate Rotation Digital Computer), proposed by Volder [1] and further extended by Walther [2], has been used to efficiently implement trigonometric, hyperbolic and exponential functions, coordinate transformations and similar operations using the same hardware.

CORDIC is a hardware-efficient algorithm. It requires no multiplication or division block; instead, it works only with adders, subtractors and shifters. Owing to its simplicity, it is easily implemented on a VLSI system. As it was designed for hardware applications, features such as low cost make CORDIC an excellent choice for small computing devices. Since CORDIC is used as a building block in various chips, the critical aspects to be considered for overall performance are high speed, low power and low area. These can be achieved by applying various techniques to the CORDIC algorithm. A pipelined CORDIC architecture is implemented in order to avoid the iterative cycle in CORDIC. The angle recoding method is used to reduce the number of iterations, latency and error of CORDIC, as it skips over certain intermediate CORDIC iterations while delivering the same precision.

DOI: 10.5121/cseij.2012.2310

Over the last 50 years, a wide variety of applications of the CORDIC algorithm have emerged. The algorithm received increased attention after a unified approach was proposed for its implementation [2]. Thereafter, CORDIC-based computing became the choice for scientific calculator applications; the HP-2152A co-processor, the HP-9100 desktop calculator and the HP-35 calculator are a few such devices based on the CORDIC algorithm. Several modifications have been proposed in the literature during the last two decades to provide high-performance, low-cost hardware solutions for real-time computation of two-dimensional vector rotation and transcendental functions.

This paper is divided into the following sections: Section 2 introduces the CORDIC algorithm, Section 3 discusses the pipelined CORDIC architecture, Section 4 presents the angle recoding method for CORDIC, Section 5 discusses the simulation and results, and Section 6 concludes.

## **2. CORDIC ALGORITHM**

Depending on the function to be evaluated, the CORDIC algorithm involves rotation of a vector $v$ on the x-y plane in a circular, linear or hyperbolic coordinate system. Figure 1 shows the trajectories of the vector $v_i$ over successive CORDIC iterations. The algorithm performs the rotation iteratively using specific incremental rotation angles, selected so that each iteration is performed by shift and add operations.



Figure 1. Iterations using rotation in various coordinate systems.

The norm of a vector in these coordinate systems is defined in terms of $x^2 + my^2$, where $m \in \{1, 0, -1\}$ denotes a circular, linear or hyperbolic coordinate system respectively. In the circular coordinate system, the rotation trajectory is the circle defined by $x^2 + y^2 = 1$. For the linear system, the rotation trajectory is defined by $x = 1$, and for the hyperbolic system it is given by $x^2 - y^2 = 1$.

CORDIC is mainly employed in two modes, namely rotation mode and vectoring mode. In vectoring mode, the coordinate components of a vector are given, and the magnitude and angular argument of that vector are computed. In rotation mode, the coordinate components of a vector and an angle of rotation are given, and the coordinate components of the vector after rotation through the given angle are computed.
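To make rotation mode concrete, the following is a minimal Python sketch of the circular (m = 1) case in its usual textbook form. It is not the VHDL used in this work: the function name `cordic_rotate`, the default of 16 iterations and the floating-point arithmetic are assumptions made for illustration, and the multiplication by 2^-i stands in for the right shift that hardware would use.

```python
import math

def cordic_rotate(theta_deg, n_iter=16):
    """Rotation-mode circular CORDIC: return (cos, sin) of theta_deg.

    Illustrative textbook form; converges for |theta| up to roughly 99.9 degrees.
    """
    # Elementary angles atan(2^-i) in degrees; hardware would store these as constants.
    angles = [math.degrees(math.atan(2.0 ** -i)) for i in range(n_iter)]
    # Accumulated gain of the micro-rotations; starting x at 1/gain compensates it.
    gain = 1.0
    for i in range(n_iter):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = 1.0 / gain, 0.0, theta_deg
    for i in range(n_iter):
        d = 1.0 if z >= 0 else -1.0   # sign of the residual angle picks the direction
        shift = 2.0 ** -i             # a right shift by i bits in hardware
        x, y = x - d * y * shift, y + d * x * shift
        z -= d * angles[i]            # residual angle still to be rotated
    return x, y
```

Calling `cordic_rotate(30.0)` returns a pair close to (cos 30°, sin 30°); the accuracy improves as `n_iter` grows, since each extra iteration roughly halves the residual angle.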

## **3. PIPELINED ARCHITECTURE**

A CORDIC processor can be implemented in a number of ways depending on the application. The simplest is the serial architecture, which gives a direct implementation of the basic CORDIC equations. The CORDIC block in the serial architecture consists of three adder/subtractors and two shifters, with a ROM containing a look-up table. The serial architecture performs one micro-rotation per clock cycle. All iterations are executed on the same hardware, so the output is available after n clock cycles. Figure 2 shows the serial architecture; because it uses n clock cycles for a single calculation, it is slow and not suitable for high-speed implementations.



Figure 2. Serial architecture

The pipelined architecture contains n cascaded CORDIC blocks. The first output of an n-stage CORDIC is obtained after n clock cycles; thereafter, an output is generated on every clock cycle. Pipelined architectures have a much higher frequency of operation and are therefore well suited for satellite communication. In the pipelined architecture, the shifters perform a fixed number of shifts at every stage, and each stage contains registers to store the angle for its particular micro-rotation. Every stage performs a single micro-rotation; hence, the i<sup>th</sup> stage performs the i<sup>th</sup> micro-rotation. Figure 3 shows the pipelined architecture. Its main advantage over the serial architecture is high throughput, owing to hard-wired shifts in place of time- and area-consuming barrel shifters and to the elimination of the ROM.



Figure 3. Pipelined architecture

This architecture is much faster than the serial architecture. The sign of z gives the direction of rotation at every stage.
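The timing behaviour described above (first result after n clock cycles, then one result per clock) can be sketched with a small cycle-level software model. This is only an illustration, not the RTL of this design: the six stages follow the text, but the 14-bit fixed-point fraction, the names `micro_rotation` and `run_pipeline`, and the use of degrees for the angle word are assumptions.

```python
import math

STAGES = 6
FRAC_BITS = 14                      # assumed fixed-point fraction width
SCALE = 1 << FRAC_BITS
# Per-stage angle constants in scaled degrees, hard-wired per stage (no ROM lookup).
ANGLES = [round(math.degrees(math.atan(2.0 ** -i)) * SCALE) for i in range(STAGES)]
# CORDIC gain of six micro-rotations; x starts at 1/K so the result has unit length.
K = math.prod(math.sqrt(1.0 + 2.0 ** (-2 * i)) for i in range(STAGES))
X0 = round(SCALE / K)

def micro_rotation(i, x, y, z):
    """Stage i: one shift-and-add micro-rotation with a fixed shift of i bits."""
    if z >= 0:
        return x - (y >> i), y + (x >> i), z - ANGLES[i]
    return x + (y >> i), y - (x >> i), z + ANGLES[i]

def run_pipeline(angles_deg):
    """Clock the pipeline; return (cycle, cos, sin) for each result that emerges."""
    regs = [None] * STAGES          # one output register per stage
    feed = list(angles_deg)
    results = []
    cycle = 0
    while feed or any(r is not None for r in regs):
        cycle += 1
        # Update from the last stage backwards so each register is read before it is overwritten.
        for i in range(STAGES - 1, 0, -1):
            prev = regs[i - 1]
            regs[i] = micro_rotation(i, *prev) if prev is not None else None
        # Stage 0 accepts a new input angle every clock while inputs remain.
        regs[0] = micro_rotation(0, X0, 0, round(feed.pop(0) * SCALE)) if feed else None
        if regs[-1] is not None:
            x, y, _ = regs[-1]
            results.append((cycle, x / SCALE, y / SCALE))
    return results
```

Feeding three angles, for example `run_pipeline([30.0, 45.0, 60.0])`, yields results on cycles 6, 7 and 8: the first output appears after six clocks and one more follows on each subsequent clock. With only six micro-rotations the values are coarse approximations of the true sine and cosine.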

In this paper, a six-stage pipelined sine and cosine wave generator is designed, performing a specific micro-rotation at every stage. It operates in rotation mode. Any angle can be given as input, and the sine and cosine outputs are generated such that their values are given by

$$X_n = \cos\theta \tag{1}$$

$$Y_n = \sin\theta \tag{2}$$

where $\theta$ is the input angle.

The RTL view of Pipelined CORDIC is shown in Figure 4.



Figure 4. RTL view of Pipelined CORDIC

For any input angle, three outputs are generated: the sine value, the cosine value, and eps, which indicates how close the final angle is to the required angle.

## **4. ANGLE RECODING METHOD**

For an iterative algorithm, latency is defined as the product of the number of iterations and the cycle time of each iteration. The original CORDIC has a high and constant latency. To reduce it, an angle recoding method is introduced, in which angle selection is done without increasing the cycle time. A range is defined for each angle constant, and a particular angle constant is selected if the input rotation angle lies in its range. The input angle is compared against the ranges of the different angle constants. To use the angle recoding method, it is therefore necessary to identify the range associated with each angle constant. These ranges can then be used as reference values.

Table 1 defines the range of residual angles around each angle constant [5].

| Angle constant | Range of residual angle Z (degrees) |
|----------------|-------------------------------------|
| 45°            | [35.78, 67.5)                       |
| 26.565°        | [20.295, 35.78)                     |
| 14.036°        | [10.5775, 20.295)                   |
| 7.125°         | [5.3505, 10.5775)                   |
| 3.576°         | [2.6825, 5.3505)                    |
| 1.79°          | [1.342, 2.6825)                     |
| 0.895°         | [0.6715, 1.32)                      |
| 0.448°         | [0.3359, 0.6715)                    |
| 0.2238°        | [0.119, 0.3359)                     |

Table 1: Defining the range of residual angles around angle constant

In the original CORDIC, rotation of a vector is done using a fixed number of angular steps executed in sequence; these angular steps are known as micro-rotations. The process is carried out until the vector arrives within a small angular distance of its final resting position.
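The fixed sequence of micro-rotations can be illustrated by tracking only the accumulated angle. The sketch below is an assumption-laden illustration rather than the simulated VHDL: it uses the first eight angle constants as rounded in Table 1, and the name `original_cordic_angle` is hypothetical.

```python
# First eight elementary angles in degrees, rounded as in Table 1 of this paper.
ANGLE_CONSTANTS = [45.0, 26.565, 14.036, 7.125, 3.576, 1.79, 0.895, 0.448]

def original_cordic_angle(target_deg, constants=ANGLE_CONSTANTS):
    """Apply every micro-rotation in sequence; return the angle reached after each step."""
    reached = 0.0
    trajectory = []
    for step in constants:
        # Rotate forward if still short of the target, backward if past it.
        reached += step if target_deg - reached >= 0 else -step
        trajectory.append(reached)
    return trajectory
```

For a target of 27.5°, the last entry of the returned list is approximately 27.58°; every one of the eight steps is executed regardless of how close an intermediate angle already is. The last digit can differ slightly from a hardware simulation because the constants here are rounded.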

In the original CORDIC, rotation through a given angle is carried out by a sequence of angular steps. For an angle of 46°, the final angle obtained at the eighth iteration is $46.384^{\circ}$, which is closest to $46^{\circ}$. Similarly, for an angle of 27.5°, the final angle obtained at the eighth iteration is $27.58^{\circ}$, which is closest to $27.5^{\circ}$. For an angle of 9°, the final angle obtained at the eighth iteration is $9.289^{\circ}$, closest to $9^{\circ}$. In the angle recoding method, rotation through a given angle is carried out using the set of available angle constants. For an angle of 46°, the rotating vector takes only two iterations to reach its target position.

$$46^{\circ} \approx (45 + 0.895)^{\circ} = 45.895^{\circ}.$$

Rotation through an angle of 27.5° is given as

$$27.5^{\circ} \approx (26.565 + 0.895)^{\circ} = 27.46^{\circ}.$$

Similarly, rotation through an angle of 9° is given as

$$9^{\circ} \approx (7.125 + 1.79)^{\circ} = 8.915^{\circ}.$$
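The selection procedure behind these decompositions can be sketched as a greedy loop over the ranges of Table 1. This is a hedged illustration, not the authors' VHDL: the name `recode_angle` is hypothetical, only the lower bound of each range is used (the upper bound is taken to be implied by the next larger constant), each constant is assumed to be used at most once, and negative residuals are handled by rotating backwards, which is an assumption not spelled out in the text.

```python
# (angle constant, lower bound of its residual-angle range) in degrees, from Table 1.
RANGES = [
    (45.0, 35.78), (26.565, 20.295), (14.036, 10.5775), (7.125, 5.3505),
    (3.576, 2.6825), (1.79, 1.342), (0.895, 0.6715), (0.448, 0.3359),
    (0.2238, 0.119),
]

def recode_angle(target_deg):
    """Greedy angle recoding: return (signed constants used, angle reached)."""
    residual = target_deg
    used = []
    remaining = list(RANGES)        # assumption: each constant is used at most once
    while remaining:
        # Largest remaining constant whose range lower bound the residual magnitude reaches.
        pick = next(((c, lo) for c, lo in remaining if abs(residual) >= lo), None)
        if pick is None:
            break                   # residual lies below every range: stop iterating
        remaining.remove(pick)
        step = pick[0] if residual >= 0 else -pick[0]
        used.append(step)
        residual -= step
    return used, target_deg - residual
```

For the three angles discussed above, the loop stops after two selections: 46° uses 45 and 0.895, 27.5° uses 26.565 and 0.895, and 9° uses 7.125 and 1.79, because the remaining residual then falls below the smallest range in Table 1.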

Taking a clock period of 100 ps, the latency is given by $n \cdot t_c$, where n is the number of iterations and $t_c$ is the cycle time of each iteration.
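As a worked instance of this expression, take the iteration counts quoted above (eight for the original CORDIC, two with angle recoding) and the stated cycle time of 100 ps. The subscripts on t are labels introduced here for illustration only:

$$t_{\text{original}} = 8 \times 100\,\text{ps} = 0.8\,\text{ns}, \qquad t_{\text{recoded}} = 2 \times 100\,\text{ps} = 0.2\,\text{ns}.$$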

## **5. EXPERIMENTAL RESULTS**

The code for the pipelined CORDIC, the original CORDIC and the angle recoding method is written in VHDL, and simulation is done using ModelSim SE 6.3f. The sine and cosine functions for each angular step are also calculated.

## **5.1 Results for Pipelined CORDIC**

This section presents the results of the pipelined sine and cosine wave generator implemented on the target device Xilinx Spartan 3E xc3s100e-4vq100 using Xilinx ISE 12.3. Area and timing reports are included for this target device, and the pipelined sine and cosine generator is simulated. Table 2 shows the device utilization summary of the sine and cosine generator using the CORDIC algorithm.

| Parameters                 | Used | Available | Utilization |
|----------------------------|------|-----------|-------------|
| Number of Slice Flip Flops | 166  | 1920      | 8%          |
| Number of Slices           | 117  | 960       | 12%         |
| Number of 4 input LUTs     | 180  | 1920      | 9%          |
| Number of bonded IOBs      | 34   | 66        | 51%         |
| Number of BUFGMUXs         | 1    | 24        | 4%          |

Table 2. Hardware Device Utilization Summary

The timing report gives the total delay for the output to appear after the input is applied. At speed grade -4, the minimum period required is 5.244 ns, which corresponds to a maximum frequency of 190.694 MHz. The minimum input arrival time before clock is 2.059 ns, and the maximum output required time after clock is 4.310 ns.
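The maximum frequency follows directly from the minimum clock period. With $f_{\max}$ and $T_{\min}$ as symbols introduced here for illustration:

$$f_{\max} = \frac{1}{T_{\min}} = \frac{1}{5.244\,\text{ns}} \approx 190.69\,\text{MHz}.$$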

## **5.2 Comparison with previous work**

The present work is implemented on a Xilinx Spartan 3E (speed grade -4). A comparison of the present pipelined CORDIC sine and cosine wave generator with previous work is shown in Table 3.

| Parameter                  | Present work | Previous work [3] |
|----------------------------|--------------|-------------------|
| No. of Slice Flip Flops    | 166          | 188               |
| No. of Slices              | 117          | 115               |
| No. of 4 input LUTs        | 180          | 183               |
| No. of bonded IOBs         | 34           | 35                |
| Maximum frequency (MHz)    | 190.694      | 148.017           |
| Delay (ns)                 | 5.244        | 6.756             |
| Input arrival time (ns)    | 2.059        | 5.404             |
| Output required time (ns)  | 4.310        | 6.198             |

Table 3. Comparison of present work with previous work

## **5.3 Simulation results for original CORDIC and angle recoding method**

The simulation result of the original CORDIC for an angle of 27.5° is shown in Figure 5.



Figure 5. Simulation of original CORDIC for an angle of 27.5°

From the above simulation result, it is clear that the final target value is reached at the eighth iteration in the original CORDIC.

The simulation result of the angle recoding method for an angle of 27.5° is shown in Figure 6.

Figure 6. Simulation of angle recoding method for an angle of 27.5°

The simulation result of the angle recoding method shows that the final target value is reached in two iterations.

Figure 7 shows the rotation through 27.5 degree using original CORDIC and angle recoding method.



Figure 7. Rotation through 27.5 degree using original CORDIC and angle recoding method

## **5.4 Comparison of Original CORDIC and Angle Recoding Method**

| Input angle (°) | Original CORDIC: no. of iterations | Original CORDIC: latency (ns) | Original CORDIC: error (°) | Angle recoding: no. of iterations | Angle recoding: latency (ns) | Angle recoding: error (°) |
|-----------------|------------------------------------|-------------------------------|----------------------------|-----------------------------------|------------------------------|---------------------------|
| 46              | 8                                  | 0.8                           | 0.384                      | 2                                 | 0.2                          | 0.105                     |
| 27.5            | 8                                  | 0.8                           | 0.08                       | 2                                 | 0.2                          | 0.04                      |
| 9               | 8                                  | 0.8                           | 0.289                      | 2                                 | 0.2                          | 0.09                      |

Table 4. Comparison of original CORDIC and angle recoding method.

From Table 4, it is observed that the number of iterations, the angle error and the latency are all reduced by using the angle recoding method.

## **6. CONCLUSION**

This paper presents two techniques for the CORDIC algorithm: pipelining and the angle recoding method. CORDIC has been implemented as a pipeline to avoid iterative cycles and to increase the throughput. The pipelined digital sine and cosine generator is targeted at a Spartan 3E device and requires 12% of its total number of slices, with a delay of 5.244 ns and a maximum frequency of 190.694 MHz. Hence, pipelining results in a significant increase in the speed of the system. The angle recoding method is implemented to reduce the number of iterations, the latency and the angle error. Using the angle recoding method, the number of iterations is reduced by 75% compared with the original CORDIC, from eight to two.

## REFERENCES

- [1] J. E. Volder, (1959) "The CORDIC Trigonometric Computing Technique," Proc. IRE Transactions on Electronic Computers, Vol. EC-8, No.3, pp.330-334.
- [2] J. S. Walther, (1971) "A Unified Algorithm for Elementary Functions," Proc. AFIPS Spring Joint Computer Conference, pp.379-385.
- [3] R. Ranga Teja, P. Sudhakara Reddy, (2011) "SINE/COSINE Generator Using Pipelined CORDIC Processor," Proc. IACSIT International Journal of Engineering and Technology, Vol.3, No.4, pp.431-434.
- [4] Esteban O. Gracia, Rene Cumplido, Miguel Arias, (2006) "Pipelined CORDIC Design on FPGA for a Digital Sine and Cosine Waves Generator," Proc. 3rd International Conference on Electrical and Electronics Engineering, pp.1-4.
- [5] Terence K. Rodrigues and Earl E. Swartzlander, (2010) "Adaptive CORDIC: Using Parallel Angle Recoding to Accelerate Rotations," Proc. IEEE Transactions on Computers, Vol.59, No.4, pp.522-531.
- [6] W. Chang and J. W. Chein, (2003) "High-Speed and Low-Power Split-Radix FFT," Proc. IEEE Transactions on Signal Processing, Vol.51, No.3, pp.864-874.
- [7] V. K. Anuradha and V. Visvanathan, (1994) "A CORDIC Based Programmable DXT Processor Array," Proc. 7th IEEE International Conference on VLSI Design, pp.343-348.
- [8] Jean Duprat, Jean Michel Muller, (1993) "The CORDIC Algorithm: New Results for Fast VLSI Implementation," Proc. IEEE Transactions on Computers, Vol.42, No.2, pp.168-178.
- [9] Ray Andraka, (1998) "A Survey of CORDIC Algorithms for FPGA Based Computers," Proc. ACM/SIGDA Sixth International Symposium on FPGA, pp.191-200.
- [10] Herbert Dawid, Heinrich Meyr, (1992) "VLSI Implementation of the CORDIC Algorithm Using Redundant Arithmetic," Proc. IEEE International Symposium on Circuits and Systems, Vol.3, pp.1089-1092.
- [11] B. Das, S. Banerjee, (2002) "Unified CORDIC Based Chip to Realize DFT/DHT/DCT/DST," IEE Proc. Computer and Digital Techniques, Vol.149, No.4, pp.121-127.
- [12] J. G. Proakis and D. G. Manolakis, (2008) "Digital Signal Processing: Principles, Algorithms and Applications," Delhi: Prentice Hall.
- [13] Y. H. Hu and S. Naganathan, (1993) "An Angle Recoding Method for CORDIC Algorithm Implementation," Proc. IEEE Transactions on Computers, Vol.42, No.1, pp.99-102.

#### **Authors Biography**

**Balwinder Singh** obtained his Bachelor of Technology degree from National Institute of Technology, Jalandhar and his Master of Technology degree from University Centre for Inst. & Microelectronics (UCIM), Punjab University, Chandigarh in 2002 and 2004 respectively. He is currently serving as Senior Engineer at the Centre for Development of Advanced Computing (CDAC), Mohali, is part of the teaching faculty, and is also pursuing a PhD from GNDU Amritsar. He has 6+ years of experience teaching both undergraduate and postgraduate students. He has published three books and many papers in international and national journals and conferences. His current interests include genetic algorithms, low-power techniques, VLSI design and testing, and system on chip.

**Navdeep Prashar** received his B.Tech. (Electronics and Communication Engineering) degree from the CT Institute of Engineering, Management and Technology, Jalandhar, affiliated to Punjab Technical University, Jalandhar, in 2010. He is presently pursuing an M.Tech. (VLSI Design) degree at the Centre for Development of Advanced Computing (CDAC), Mohali, and working on his thesis. His areas of interest include embedded systems, VLSI design and digital signal processing.

**Amandeep Singh** received his B.Tech. (Electronics and Communication Engineering) degree from the Beant College of Engineering and Technology, Gurdaspur, affiliated to Punjab Technical University, Jalandhar, in 2010. He is presently pursuing an M.Tech. (VLSI Design) degree at the Centre for Development of Advanced Computing (CDAC), Mohali, and working on his thesis. His areas of interest include embedded systems, VLSI design and testing, system on chip and MEMS.



