# **A NEW N-FOLD FLIP-FLOP WITH OUTPUT ENABLE**

Mounir Zid<sup>1</sup>, Carlo Pistritto<sup>2</sup>, Rached Tourki<sup>1</sup> and Alberto Scandurra<sup>2</sup>

<sup>1</sup> Electronics and Micro-Electronics Laboratory, Faculty of Sciences of Monastir, University of Monastir, Tunisia. mounir\_zid@yahoo.fr

<sup>2</sup> On Chip Communication Systems (OCCS) - STMicroelectronics, Catania, Italy.

### ABSTRACT

With the evolution of the semiconductor industry and the continuously growing demand for high-performance VLSI circuits, aggressive feature-size scaling, high integration density and high operating frequencies have made power consumption and digital noise top concerns of Very Large Scale Integration (VLSI) circuit design, in both analog and digital devices. In this paper we study the design of n-fold flip-flops with output enable. A new n-fold flip-flop that exploits the clock gating technique for both output enabling and power saving is presented. To evaluate its performance, an octal flip-flop was built according to the proposed structure and compared to the main octal flip-flops used today. The flip-flops were implemented in the STMicroelectronics 65 nm process technology and simulated under the worst-case condition, where the switching activity is maximal. Post-layout simulation showed that the new circuit provides the same functional performance as conventional solutions with significantly lower power consumption, area and digital noise.

### **KEYWORDS**

Flip-flops; Output enabling; Low power design; Clock gating

# **1. INTRODUCTION**

Advances in CMOS technology have driven a continuous increase in both the integration density and the operating frequency of VLSI ICs. As technology pushes for smaller devices and faster operation, power consumption and noise become severe problems in the design of high-speed ICs. These concerns are mainly due to the excessive switching activity in the chip, which keeps growing with the clock frequency and the number of transistors. According to the International Technology Roadmap for Semiconductors [1], dynamic power is expected to remain a significant portion of the total power consumption in coming CMOS systems, which represents a major challenge. It has been shown that a significant portion (30-60%) of the total power is dissipated in the clock distribution network and in the flip-flops, which are ubiquitous elements of digital CMOS ICs [2]. This power is dissipated in the form of heat and radiation, which further complicates the design of the circuit. Since leakage current increases with temperature, the inefficient package cooling technologies used today drastically increase power dissipation. Clock gating is a simple and effective method for decreasing dynamic power consumption. As dynamic power is proportional to the frequency and to the number of transistors, it can be lowered by clock gating, which reduces the switching activity of the power-consuming signal transitions. Clock gating also mitigates simultaneous switching, thereby lowering switching noise levels and extending the circuit's life.

Many n-fold flip-flops with controllable outputs have been reported in the literature. Some of these flip-flops have a small data-to-output delay and achieve high performance. However, as higher frequencies are required, their design and optimization become critical from both the noise and the power consumption standpoints. In this paper, a novel n-fold flip-flop based on the concept of clock gating is proposed. The circuit consumes less power, produces less digital noise and takes a small layout area. The proposed circuit is compared to the conventional and widely used D flip-flop (DFF) circuits manufactured and sold today. The remainder of the paper is organized as follows. Section 2 gives an overview of the sources of power consumption in CMOS devices. The concept of clock gating for power saving is explained in section 3. In section 4, the two well-known n-fold flip-flops with output enable are reviewed. Section 5 presents the new n-fold flip-flop and explains its operation and how it mitigates power consumption and alleviates noise. In section 6, the implementation and the extensive simulation of the individual flip-flops, as well as their comparison, are presented. Finally, section 7 concludes the paper.

### 2. SOURCES OF POWER CONSUMPTION IN CMOS CIRCUITS

CMOS technology has rapidly come to dominate the field of analog and digital integrated circuits since its invention. In recent years there has been an increasing demand for high-speed, low-power digital circuits. With technology shrinking and the use of high clock frequencies, power consumption becomes a crucial design problem. In embedded systems, this problem is exacerbated by current battery technology, which limits the circuit's autonomy to only a few hours. Therefore, it is very important to reduce the power consumption of a digital circuit as much as possible while maintaining its performance. The power consumed by a digital CMOS circuit is given by relation (1):

$$P_{avg} = I_{leakage} V_{dd} + I_{sc} V_{dd} + (C_L V V_{dd} f_{clk} \alpha)$$
(1)

The relation shows that the overall power consumption is partitioned into three main components. The first is the static power caused by the leakage current $I_{leakage}$. The second is the short-circuit power caused by the short-circuit current $I_{sc}$ flowing from the power supply $V_{dd}$ to ground while the CMOS transistors conduct simultaneously. The last component is the switching power, which is due in essence to capacitive charge/discharge and is regarded as a non-trivial source of power dissipation in digital circuits. In the switching power term, $C_L$ represents the effective switched load capacitance, $f_{clk}$ is the clock frequency and $V$ is the voltage swing of the signals. The coefficient $\alpha$ denotes the switching activity, that is, the probability of power-consuming transitions. In CMOS devices the dynamic power consumption, due to the short-circuit current and to the load capacitance charge/discharge, dominates the total power consumption. Though the static power keeps increasing with each new technology node, the dynamic power still constitutes the main cause of power loss in VLSI CMOS devices. With the STMicroelectronics 65 nm CMOS technology, the dynamic power represents 60% of the total power losses (Fig. 1). This obliges designers to find solutions capable of mitigating its effects on their design performance in order to provide low-power devices with longer battery life and high reliability that avoid the use of cooling devices.
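As a rough illustration of relation (1), the following Python sketch evaluates the three power components separately. The 1.1 V supply and 500 MHz clock match the simulation conditions used later in the paper; the leakage current, short-circuit current and load capacitance are hypothetical values chosen only to make the arithmetic concrete, and the function name is our own.

```python
def average_power(i_leakage, i_sc, c_load, v_swing, v_dd, f_clk, alpha):
    """Evaluate relation (1) and return its three terms in watts."""
    static = i_leakage * v_dd                            # leakage (static) power
    short_circuit = i_sc * v_dd                          # short-circuit power
    switching = c_load * v_swing * v_dd * f_clk * alpha  # capacitive switching power
    return static, short_circuit, switching


# Hypothetical operating point: currents and capacitance are illustrative only.
params = dict(i_leakage=1e-6, i_sc=2e-6, c_load=20e-15,
              v_swing=1.1, v_dd=1.1, f_clk=500e6)

static, short_circuit, switching = average_power(alpha=1.0, **params)
total = static + short_circuit + switching

# Halving the switching activity halves only the switching term.
_, _, switching_half = average_power(alpha=0.5, **params)

print(f"static={static:.3e} W, short-circuit={short_circuit:.3e} W, "
      f"switching={switching:.3e} W, total={total:.3e} W")
print(f"switching at alpha=0.5: {switching_half:.3e} W")
```

Only the last term depends on $\alpha$, which is why techniques that reduce switching activity act on the switching component and leave the leakage term unchanged.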



Figure 1. Increasing Contribution of Leakage Power in STMicroelectronics CMOS technology.

### 3. CLOCK GATING FOR LOW-POWER CIRCUIT DESIGN

Looking at relation (1), power consumption can be reduced by using different low-power techniques at various design levels [3]. Over the past two decades, much research has focused on this issue and provided efficient solutions such as energy recovery [4][5][6] and the well-known low-power techniques [3][7]. Among these techniques, clock gating [8][9] is a popular one that reduces the dynamic power of a circuit at the logic level by decreasing the switching activity in the clocked elements and in the clock trees that feed them. This is done by deactivating the clock signal fed to the concerned circuits when they are not needed. Fig. 2 shows the commonly used implementation of this technique at the logic level. The basic idea is simple and consists in letting the clock signal reach the DFF, or not, according to the state of a control signal CE. The gated clock signal is obtained by ANDing the clock signal with the signal SCE, the synchronized version of the control signal CE. Synchronizing CE in this way yields a glitch-free gated clock. Hence, when CE goes high, SCE goes high at the next rising edge of the clock and the gated clock is passed to the clocked elements. Conversely, when CE goes back low, SCE follows it at the next rising clock edge, disabling the gated clock and forcing the clocked elements into standby mode. Applied to an entire system, this technique saves a considerable amount of power while the clock is deactivated.



Figure 2. Clock gating circuit.
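The behaviour described for the circuit of Fig. 2 can be sketched with a small Python model. It is an idealized, zero-delay abstraction that samples the waveforms twice per clock period; the function names and the example waveforms are our own assumptions, and gate delays (which matter for glitches in a real circuit) are not modelled.

```python
def gate_clock(clk, ce):
    """Return (sce, gated) traces for sampled clk and ce waveforms.

    SCE copies CE at each rising clock edge; the gated clock is clk AND SCE.
    """
    sce_trace, gated_trace = [], []
    sce, prev_clk = 0, 0
    for c, e in zip(clk, ce):
        if prev_clk == 0 and c == 1:  # rising edge: synchronize the enable
            sce = e
        sce_trace.append(sce)
        gated_trace.append(c & sce)
        prev_clk = c
    return sce_trace, gated_trace


def count_rising_edges(wave):
    """Count 0 -> 1 transitions, assuming the signal starts low."""
    return sum(1 for a, b in zip([0] + wave[:-1], wave) if a == 0 and b == 1)


clk = [0, 1] * 8                                           # eight clock periods
ce = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]      # enable high mid-way
sce, gated = gate_clock(clk, ce)

print("clock edges:", count_rising_edges(clk))
print("gated edges:", count_rising_edges(gated))
```

In this example only the clock edges that occur while SCE is high reach the clocked element, so the number of clocked transitions drops with the fraction of time CE is inactive.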

### 4. CONVENTIONAL FLIP-FLOPS WITH OUTPUT ENABLE

The n-fold flip-flop with output enable is an important building block in digital systems. It consists of a set of flip-flops whose outputs can be either opaque or transparent according to the state of a single control signal. On-chip and off-chip applications use such circuits as register banks to store data for a defined period of time. A wide selection of n-fold flip-flops with output enable can be found in the literature [10][11]. One widely used solution is the Mux-based n-fold flip-flop. As an example, the 74F378 circuit [12] commercialized by Philips Semiconductors is built on this principle for data retention. The internal schematic diagram of this hex D flip-flop with output enable is shown in Fig. 3. If the register must hold its state and ignore the new data present at its input, the output of each flip-flop is fed back to the flip-flop's input through a multiplexer (Mux). Conversely, if new data has to be copied to the output, it is routed to the flip-flop's input by the same multiplexer, whose select signal decides whether the register captures the new data or recycles the existing data.
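The hold/load behaviour of the Mux-based register can be summarized in a few lines of Python. This is a behavioural sketch only: the name `load` for the multiplexer select signal and its active-high polarity are assumptions made for illustration, not taken from the 74F378 datasheet.

```python
def mux_register_edge(q, d, load):
    """One clock edge of a Mux-based register.

    Each flip-flop captures its data input when `load` is true and
    recirculates its own output otherwise. The flip-flops are clocked
    in both cases.
    """
    return [d_i if load else q_i for q_i, d_i in zip(q, d)]


q = [0] * 6                                               # hex register, cleared
q = mux_register_edge(q, [1, 0, 1, 1, 0, 1], load=True)   # capture new data
held = mux_register_edge(q, [0, 0, 0, 0, 0, 0], load=False)  # hold old data
print(q, held)
```

Note that the register holds its data by re-capturing its own output, so every flip-flop still receives a clock edge in hold mode.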



The limits of this circuit stem from the use of one multiplexer per flip-flop to hold the data at the output. When the number of flip-flops is increased to widen the register, the circuit area and its power consumption increase considerably. Furthermore, since the input and output data are steered by the multiplexer, the delay that the data path adds to the input signal restricts the operation of this device to low and moderate frequencies. Other solutions use tri-state buffers to control the transparency of the register's outputs. This solution is used by STMicroelectronics to implement an octal flip-flop [13]. The register is built with eight flip-flops whose outputs are controlled by dedicated non-inverting or inverting tri-state buffers (Fig. 4). All these buffers share the same control signal, whose high/low level selects the opaque or transparent mode of the buffers. The circuit is more useful when it employs non-inverting buffers because of their ability to maintain information once the buffers are deactivated. A non-inverting buffer consists of a tri-state inverter in series with a conventional inverter. Hence, when the control signal is active, the buffers are in transparent mode and the register outputs the data present at the outputs of the flip-flops. If the control signal is disabled, the outputs of the first inverters of the buffers are at high impedance, forcing the second inverters of the buffers to hold the old data at their outputs.



Figure 4. Octal D flip-flop with output enable.
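The tri-state buffer approach can be modelled behaviourally as follows. The sketch assumes an active-high output-enable signal named `oe` and evaluates the buffers only at clock edges; both are simplifications of our own, and the class and attribute names are hypothetical.

```python
class TriStateOctalRegister:
    """Behavioural model of a register with tri-state output buffers."""

    def __init__(self, width=8):
        self.q = [0] * width        # internal flip-flop states
        self.out = [0] * width      # values held at the buffer outputs
        self.ff_clock_events = 0    # clock edges seen by the flip-flops

    def clock(self, d, oe):
        # The flip-flops are clocked on every cycle, whatever the state of oe.
        self.q = list(d)
        self.ff_clock_events += len(self.q)
        if oe:
            # Transparent mode: the buffers pass the flip-flop outputs.
            self.out = list(self.q)
        # Opaque mode: the buffers keep holding the previous data.
        return list(self.out)


reg = TriStateOctalRegister()
out_enabled = reg.clock([1] * 8, oe=True)
out_disabled = reg.clock([0] * 8, oe=False)
print(out_enabled, out_disabled, reg.ff_clock_events)
```

The outputs stay frozen while `oe` is low, yet the internal flip-flops keep capturing data and keep being clocked.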

Compared to the previous solution, this circuit has a lower transistor count, consumes less power and can operate at high frequencies since it has a small input delay. However, in both solutions the flip-flops of the register are permanently clocked regardless of the transparency of the output. A large amount of power is dissipated dynamically because of the continual switching of the transistors. Even if no new data is captured, the flip-flops are still clocked and the circuit keeps consuming power. This power is essentially dynamic and is linearly proportional to the number of flip-flops. To lessen the dynamic power, conditional capture and conditional precharge flip-flops [14][15] can be used instead of normal flip-flops. It has been shown that conditional techniques can reduce the switching activity of the transistors and save more than 50% of the consumed power [16]. This may also help reduce the effect of switching on the circuit performance. Though conditional flip-flops are very efficient for dynamic power saving, their static power consumption remains high since a considerable number of transistors is required to design them. In addition, they take a large silicon area and their design is complex compared to other low-power and high-speed flip-flops such as True Single-Phase Clock (TSPC) flip-flops.

# 5. THE PROPOSED LOW-POWER AND LOW-NOISE FLIP-FLOPS WITH OUTPUT ENABLE

We propose an n-fold flip-flop that takes advantage of clock gating to control the transparency of its outputs and to reduce both its power consumption and its digital noise. The circuit is constructed with a number of flip-flops and a single clock gating module whose main function is to control the clock fed to the flip-flops. According to the state of a control signal, the clock is passed or not to the clocked elements, which allows data retention at the output when the clock signal is disabled. Fig. 5 shows the controller's circuit. This controller is similar to that shown in Fig. 2 but has the advantage of introducing less latency to the gated clock, hence minimizing skew when a high-speed clock is used to synchronize the circuit. The schematic diagram and the timing diagram of an octal flip-flop implemented according to the proposed principle are shown in Fig. 6.



Figure 5. Clock gating module.

The circuit consists of eight flip-flops and a clock controlling module and has two modes of operation. To update the outputs of the register, a logic '1' must be present on the control signal EN. As long as this signal remains high, at each clock cycle the controller outputs a clock pulse that is carried by the Load signal to all the flip-flops forming the register. The outputs of the register are therefore transparent, since the flip-flops load the inputs at each rising edge of the gated clock Load. If the EN signal goes low, the gated clock is deactivated and the flip-flops are no longer clocked, which prevents them from updating their outputs. The outputs keep holding the last data stored in the flip-flops before the gated clock was deactivated, until this clock is activated again. During this mode the switching activity is reduced to a minimum in the clock path as well as in the flip-flops and hence the unnecessary dynamic power consumption is avoided. Another advantage of this solution is that it produces little digital noise and allows a long life for the transistors, since the switching activity is reduced. This makes our approach suitable for high-performance VLSI systems.

Figure 6. (a) Example of an octal flip-flop. (b) Timing diagram.
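The two operating modes described above can be illustrated with a short behavioural model in Python. It abstracts the clock gating module to a single decision per clock cycle (a Load pulse is issued only when EN is high) and omits the internal synchronization of EN; the class name, the counter and the test stimulus are hypothetical.

```python
class GatedOctalRegister:
    """Behavioural model of a register whose clock is gated by EN."""

    def __init__(self, width=8):
        self.q = [0] * width    # flip-flop outputs, which are the register outputs
        self.load_pulses = 0    # gated-clock (Load) pulses actually issued

    def clock(self, d, en):
        if en:
            # Transparent mode: a Load pulse reaches every flip-flop.
            self.load_pulses += 1
            self.q = list(d)
        # Hold mode: no Load pulse, the flip-flops are not clocked at all.
        return list(self.q)


reg = GatedOctalRegister()
pattern = [1, 0, 1, 0, 1, 0, 1, 0]
en_sequence = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]   # enabled for 4 of 10 cycles

for cycle, en in enumerate(en_sequence):
    # Toggle every input each cycle to mimic maximal switching activity.
    d = pattern if cycle % 2 == 0 else [1 - bit for bit in pattern]
    outputs = reg.clock(d, en)

ungated_events = len(en_sequence) * len(pattern)   # every flip-flop, every cycle
gated_events = reg.load_pulses * len(pattern)      # only while EN is high
print(outputs, gated_events, ungated_events)
```

With this stimulus the flip-flops receive clock edges only during the cycles where EN is high, whereas a permanently clocked register would see an edge per flip-flop on every cycle.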

### 6. IMPLEMENTATION, SIMULATION AND COMPARISON

To demonstrate the effectiveness of our approach for the design of high-performance n-fold flip-flops with controlled outputs, we chose to compare the octal flip-flop of Fig. 6 to two other octal flip-flops, one multiplexer based and the other tri-state buffer based. This gives a fair comparison since the same flip-flop can be used to build the different circuits. For this reason, the third approach, which relies on conditional capture/precharge flip-flops, is excluded from our comparison since it uses another kind of flip-flop. The three circuits were implemented at transistor level using the STMicroelectronics 65 nm lpsvt CMOS process technology. To do so, we first designed and tested a standard cell library containing a variety of logic gates and flip-flops. The library was designed according to a full custom design flow, and the final circuits were constructed in the same manner by instantiating the required cells from the library in our design. The circuits were built using the TSPC flip-flop disclosed in [18], whose modified schematic is depicted in Fig. 7.

Figure 7. The modified version of the TSPC DFF proposed in [18]. (a) The flip-flop circuit. (b) Flip-flop power consumption vs frequency.

The modification that we introduced endows the circuit with an asynchronous reset. We used the minimum transistor size for the implementation in order to minimize power consumption and achieve high-speed operation. Using a 1.1 V power supply and sweeping the clock frequency and temperature, post-layout simulation (PLS) showed that the DFF operates correctly at frequencies below 9 GHz for temperatures ranging from -70°C to 130°C. Fig. 7(b) presents the simulated PLS dissipation at 27°C in the DFF implemented with 65 nm lpsvt CMOS transistors.

Our three final circuits were simulated using SpectreRF, a component of the Cadence Design Suite [17], with foundry-supplied BSIM4 models. Using a 500 MHz clock signal, the simulation was performed at typical process corners and typical environment conditions, with a 1.1 V power supply and taking into account the parasitic components in order to analyze the circuits accurately. For the critical power consumption measurement, the circuits were simulated in the worst-case condition by using a switching activity $\alpha$ = 1. Fig. 8 illustrates the post-layout simulation result of the proposed octal flip-flop. In the figure, D represents the signal fed to all the inputs of the flip-flop, CLK is the clock signal, C is the gating control signal, S is one output of the flip-flop and I is the current flowing through the power supply pin. The post-layout simulation showed that the new octal flip-flop consumes much less power and produces much smaller current spikes during both the active and the sleep mode when compared to the other circuits (Fig. 8). Furthermore, the simulation showed that the circuit achieves a 71% power reduction when operating in the sleep mode. The circuit takes the smallest area and its layout is very simple, since the circuit has fewer transistors and requires fewer metal wires. Fig. 9 shows the layout of the three circuits, while Table 1 summarizes the comparison between these circuits when the outputs are enabled (OE) or disabled (OD).

Figure 8. Post layout simulation Waveforms.

Figure 9. Layout views of the implemented DFFs: (a) Tri-states buffer Octal DFF. (b) Octal Mux based DFF and (c) Proposed low-power Octal DFF.

Power consumption and noise reduction depend on many implementation issues. In fact, they depend on the number of flip-flops used, the voltage swing, the power supply and the testbench stimulus, as well as on the operating frequency of the circuits. In addition, they depend on the transistor sizes, which may need to be enlarged when equal rise/fall times and a particular logical effort are required. The results obtained are for a one-byte (eight-bit) flip-flop, and a larger gain in terms of power consumption, noise reduction and area can be expected when the proposed architecture is used to design wider flip-flops spanning several bytes. Moreover, the power consumed by the proposed circuit can be lower than what was measured. It can be further reduced by eliminating the synchronizing flip-flop of the clock controller, which was used only to avoid glitches in the gated clock. The control signal can then directly feed the input of the multiplexer, and additional power and area can be saved. On the other hand, although the conditional capture/precharge based solutions use another kind of flip-flop, the proposed circuit is expected to be better because the flip-flop it uses has a lower transistor count and should consequently consume less power and occupy less area. All these advantages make our solution a suitable design option for low-power and high-performance applications.

Table 1. Comparison of the three octal flip-flops with outputs enabled (OE) or disabled (OD).

| Feature | Octal Mux based DFF | Tri-state buffer Octal DFF with enable | Proposed low-power Octal DFF |
|-------------------------|-----------------------------|--------------------------------------------|----------------------------------|
| Max current peaks (µA) | 390 (OE)<br>230 (OD) | 350 (OE, OD) | 300 (OE)<br>100 (OD) |
| Total power (W) | 2.76 × 10<sup>-5</sup> (OE)<br>1.25 × 10<sup>-5</sup> (OD) | 2.08 × 10<sup>-5</sup> (OE)<br>1.97 × 10<sup>-5</sup> (OD) | 2.07 × 10<sup>-5</sup> (OE)<br>5.96 × 10<sup>-6</sup> (OD) |
| Area (µm<sup>2</sup>) | 244.42585 | 179.52 | 168.4087 |
| Metal layers | 3 | 3 | 2 |
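The relative savings implied by the comparison table can be checked with a few lines of Python. The figures are copied from the table, except that the disabled-output power of the proposed circuit is taken as 5.96 × 10⁻⁶ W, the value consistent with the 71% sleep-mode reduction stated in the text; the dictionary layout and the helper name are our own.

```python
# Total power in watts and area in square micrometres, per circuit.
circuits = {
    "mux":       {"oe": 2.76e-5, "od": 1.25e-5, "area": 244.42585},
    "tri_state": {"oe": 2.08e-5, "od": 1.97e-5, "area": 179.52},
    # OD power assumed to be 5.96e-6 W, consistent with the stated 71% reduction.
    "proposed":  {"oe": 2.07e-5, "od": 5.96e-6, "area": 168.4087},
}


def reduction(reference, value):
    """Relative saving of `value` with respect to `reference`, in percent."""
    return 100.0 * (reference - value) / reference


proposed = circuits["proposed"]

# Saving of the proposed circuit between enabled and disabled outputs.
sleep_saving = reduction(proposed["oe"], proposed["od"])

# Savings of the proposed circuit relative to each conventional circuit.
for name in ("mux", "tri_state"):
    other = circuits[name]
    print(f"vs {name}: "
          f"power OE {reduction(other['oe'], proposed['oe']):.1f}%, "
          f"power OD {reduction(other['od'], proposed['od']):.1f}%, "
          f"area {reduction(other['area'], proposed['area']):.1f}%")

print(f"sleep-mode saving of the proposed circuit: {sleep_saving:.1f}%")
```

With outputs enabled the proposed circuit is close to the tri-state solution, so most of its advantage comes from the disabled-output mode, where the gated clock stops the flip-flops.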

# **7. CONCLUSION**

A new n-fold flip-flop circuit with controllable outputs is presented in this paper. By exploiting the clock gating technique for both output control and switching activity reduction, the circuit introduces low noise and reduces power consumption when its outputs are opaque. An octal flip-flop built according to the new architecture was implemented in the 65 nm STMicroelectronics process technology using 2 metal layers. Post-layout simulation showed that the circuit has better power and digital noise performance than similar conventional industrial solutions. The circuit consumes about 20 $\mu$W under a 1.1 V power supply and occupies an area of about 168 $\mu$m<sup>2</sup>.

### **ACKNOWLEDGEMENTS**

This work was financially supported by the On-Chip Communication Systems (OCCS) team, STMicroelectronics, Tunis.

### References

- [1] International Technology Roadmap for Semiconductors, 2009. Available from: http://public.itrs.net/
- [2] Kawaguchi, H., Sakurai, T. "A reduced clock-swing flip-flop (RCSFF) for 63% power reduction", *IEEE J. Solid-State Circuits* 1998; 33:807–811.
- [3] Keating, M., Flynn, D., Aitken, R., Gibbons, A., Shi, K. Low Power Methodology Manual: For System-on-Chip Design, Springer, 2007; 13:19

- [4] Ziesler, C. H., Kim, J., Papaefthymiou, C., Kim, S. Energy Recovery Design for Low-Power ASICs, *In Proceedings of IEEE International SOC Conference* 2003; 424-427.
- [5] Alimadadi, M., Sheikhaei, S., Lemieux, G., Mirabbasi, S., Dunford, W., Palmer, P. Energy Recovery from High-frequency Clocks using DC-DC Converters, *IEEE Computer Society Annual Symposium on VLSI* 2008; 162 - 167.
- [6] Mahmoodi, H., Tirumalashetty, V., Cooke, M., Roy, K. Ultra Low Power Clocking Scheme Using Energy Recovery and Clock Gating, *IEEE Transactions on Very Large Scale Integration Systems* 2009; 17:33-44.
- [7] Srivastava, R., Imam, S.A., Pandey, S. Low Power Design Techniques for high performance Digital Integrated Circuits, *MASAUM Journal of Reviews and Surveys* 2009; 1:81-90.
- [8] Sulaiman, D.R. Using Clock gating Technique for Energy Reduction in Portable Computers, in Proceedings of the 2008 International Conference on Computer and Communication Engineering (ICCCE 2008); 839-842.
- [9] Strollo, A.G.M., Napoli, E., Caro, D.D. New clock-gating techniques for low-power flip-flops. Proceedings of the 2000 International Symposium on Low Power Electronics and Design (ISLPED 2000), Rapallo, Italy; 114–119.
- [10] LOW VOLTAGE OCTAL D-TYPE FLIP-FLOP WITH 5V TOLERANT INPUTS AND OUTPUTS, TC7MZ574FK Datasheet 1999, TOSHIBA.
- [11] Octal D-Type Flip-Flop with 3-STATE Outputs, 74AC574-74ACT574 Datasheet 1999 Fairchild Semiconductor.
- [12] 74F378 Hex D flip-flop with enable, Product specification IC15 Data Handbook Oct. 1989, Philips Semiconductors.
- [13] OCTAL D-TYPE FLIP FLOP NON-INVERTING (3-STATE) WITH 5V TOLERANT INPUTS AND OUTPUTS, 74LCX374 Datasheet September 2004, STMicroelectronics.
- [14] Zhao, P., Darwish, T. K., Bayoumi, M. A. High-Performance and Low-Power Conditional Discharge Flip-Flop, *IEEE Transactions on Very Large Scale Integration (VLSI) Systems* 2004; 12:477-484.
- [15] Seyedi, A. S., Rasouli, S. H., Amirabadi, A., Afzali-Kusha, A. Clock Gated Static Pulsed Flip-Flop (CGSPFF) in Sub 100 nm Technology, in Proceedings of the 2006 *Emerging VLSI Technologies and Architectures (ISVLSI'06)* 2006; 373-377.
- [16] Nedovic, N., Aleksic, M., Oklobdzija, V. G. Conditional Techniques for Low Power Consumption Flip-Flops, *IEEE International Conference on Electronics, Circuits and Systems* 2001; 2: 803 – 806.
- [17] Virtuoso Spectre Circuit Simulator RF Analysis Theory, Cadence, Product Version 6.2, June 2007.
- [18] Yang, S-H., You, Y., Cho, K-R. A New Dynamic D-Flip-Flop aiming at glitch and Charging Sharing Free, *IEICE Trans. Electron.* 2003; 3:496-505.

#### Authors

Mounir Zid received his Engineering degree in Electricity in 2004 from the National Engineering School of Monastir (ENIM). He received his M.S. and Ph.D. degrees in Physics from the Faculty of Sciences of Monastir in 2006 and 2012 respectively. His research interests are analog and digital design. He is currently a postdoctoral researcher working on advanced high-speed serial interconnects and photonic NoCs.

Rached Tourki was born in Tunis on May 13, 1948. He received the B.S. degree in Physics (Electronics option) from Tunis University in 1970, and the M.S. and Ph.D. degrees in Electronics from Orsay Electronic Institute, Paris-South University, in 1971 and 1973 respectively. From 1973 to 1974 he served as a Microelectronics Engineer at Thomson-CSF. He received the Doctorat d'état in Physics from Nice University in 1979. Since then he has been Professor of Microelectronics and Microprocessors in the Physics department of the Faculty of Sciences of Monastir. His research interests are digital signal processing and hardware-software codesign for rapid prototyping in telecommunications.

Alberto Scandurra was born in Messina (Sicily, Italy) on April 13, 1972. He received his degree in Electronic Engineering (110/110 cum laude) from the Faculty of Engineering of Messina University in 1997. He has worked for STMicroelectronics since 1998, always in the field of on-chip communication systems, covering different aspects of the VLSI design flow, i.e. architecture, modeling, digital design and physical design. He has taught "VLSI design" at the Mediterranea University of Reggio Calabria (Italy) since 2007/2008. He is currently working on novel on-chip interconnect solutions, such as Network on Chip, high-speed on-chip and off-chip links, and optical interconnect.

Carlo Pistritto, On Chip Communication Systems (OCCS) Manager, is in charge of a central group in ST whose mission is to provide ST divisions and key customers with state-of-the-art, competitive technology and infrastructure for on-chip communication systems and off-chip memory access. Pistritto holds a degree in Electronic Engineering from Catania University. He joined ST in 1993, mainly as a digital designer for complex SoCs. Since then he has been involved in many activities, from design to architecture and project management, in various application fields such as GPS, DVD, digital cameras and smart cards. He spent some time abroad, in particular in the UK and in California, where he was the ST representative for a joint project with HP in Palo Alto on VLIW microprocessor architecture. Since 2003 he has been in charge of the On Chip Communication System Group, which is now a well-recognized reference for some of the technology it has developed, such as the Bus Interconnect (STBus) and, more recently, the Network On Chip (STNoC).