### OPTIMAL DESIGN OF NANOSCALE STANDARD CELLS CONSIDERING PARASITICS

### **A DISSERTATION**

Submitted in partial fulfillment of the requirements for the award of the degree

of MASTER OF TECHNOLOGY

in

### **ELECTRONICS AND COMMUNICATION ENGINEERING**

(With Specialization in Semiconductor Devices & VLSI Technology (SDVT))

By ANKIT PIPERSANIYA



DEPARTMENT OF ELECTRONICS AND COMPUTER ENGINEERING
INDIAN INSTITUTE OF TECHNOLOGY ROORKEE
ROORKEE - 247 667 (INDIA)
JUNE, 2010

I hereby declare that the work presented in this dissertation report entitled, "Optimal Design of Nanoscale Standard Cells Considering Parasitics" towards the partial fulfillment of the requirements for the award of degree of Master of Technology in Electronics and Communication Engineering with specialization in Semiconductor Devices and VLSI Technology (SDVT), Indian Institute of Technology Roorkee, is an authentic record of my own work carried out during the period from July 2009 to June 2010, under the guidance of Dr. Anand Bulusu, Assistant Professor, Department of Electronics and Computer Engineering, Indian Institute of Technology Roorkee.

The results embodied in this dissertation have not been submitted for the award of any other Degree or Diploma.

Date: 28/06/2010

Place: Roorkee

ANKIT PIPERSANIYA

#### CERTIFICATE

This is to certify that the above statement made by the candidate is correct to the best of my knowledge and belief.

Date: 24 - 06 - 2010

Place: Roorkee

B. Anand

Dr. Anand Bulusu Assistant Professor

With a great sense of pleasure and privilege, I take this opportunity to express my deepest gratitude to my supervisor and guide, **Dr. Anand Bulusu**, for his valuable suggestions, sagacious guidance, scholarly advice and insightful comments, which improved the quality of the present work. His professionalism, accurate advice and ways of thinking inspired me, and this inspiration guides me in every moment of my life.

I would like to thank Harsh sir, without whose encouragement and support I would never have made it to this stage of my academic career. I thank my friends here at IIT, Aryan, Atul, Parag, Satish and Sourabh, for the time we enjoyed together in the lab and during our outdoor trips, and for the late-night stress-relieving parties. You guys made my time at Roorkee really memorable!

I thank Swati for her support throughout. You always get the best out of me.

I also want to thank the Lab staff of VLSI Lab for their valuable support in completing my work.

Most of all, I would like to thank my parents and my brother, Arpit, for their support throughout the numerous ups and downs that I have experienced. Finally, I would like to extend my gratitude to all those who directly or indirectly contributed towards this work.

### LIST OF SYMBOLS/ABBREVIATIONS

| Symbol / Abbreviation   | Meaning                              |
|-------------------------|--------------------------------------|
| w                       | Width of NMOS in Equivalent Inverter |
| w<sub>p</sub>           | Width of PMOS                        |
| w<sub>n</sub>           | Width of NMOS                        |
| p/n ratio               | Width Ratio of PMOS and NMOS         |
| Tp_HL                   | High to Low Propagation Delay        |
| Tp_LH                   | Low to High Propagation Delay        |
| Tp                      | Average Propagation Delay            |
| Tp_r                    | High to High Propagation Delay       |
| Tp_f                    | Low to Low Propagation Delay         |
| P<sub>inv</sub>         | Parasitic Delay of Inverter          |
| ρ                       | Setup ratio for two stage buffer     |
| d                       | Delay of logic gate                  |
| clk                     | Clock                                |
| $\overline{\text{clk}}$ | Complementary of Clock               |
| PDN                     | Pull Down Network                    |
| PUN                     | Pull Up Network                      |
| FSM                     | Finite State Machine                 |
| CAD                     | Computer Aided Design                |
| LVS                     | Layout Vs Schematic                  |
| FF                      | Flip-Flop                            |
| MUX                     | Multiplexer                          |
| INV                     | Inverter                             |
| DRC                     | Design Rule Check                    |

### ABSTRACT

Speed and area are the two main concerns in the design of modern integrated circuits. With technology scaling, device sizes shrink and most of the chip area is covered by interconnects, so the effects of parasitics on circuit performance can no longer be neglected. Designers in industry today use semi-custom design because it takes less time to lay out a large circuit. However, developing a complete standard cell library with different drive strengths requires many man-hours.

In this work, we have developed a standard cell library containing 44 cells of basic logic gates: NAND, NOR, NOT, BUFFER, D-latch/FF, Half-Adder and MUX. We have fully characterized the library and, from this, determined how important the role of interconnects becomes with technology scaling. All library cells are laid out using the Virtuoso Layout Editor, and all cells are characterized with the Spectre circuit simulator (Cadence EDA tools). We observe that the impact of local interconnect is critical in determining the timing parameters of sequential circuits.

In a sequential system, the most basic storage circuit is the D-latch. We have devised a new methodology for designing D-latches. We have evaluated our method against the earlier D-latch design methodology and observed an improvement in speed and a reduction in the area of the D-latch cell. We have also shown that our methodology produces D-latches with greater robustness to charge injection on dynamic nodes.

### **TABLE OF CONTENTS**

- ACKNOWLEDGEMENTS
- LIST OF SYMBOLS/ABBREVIATIONS
- ABSTRACT
- LIST OF FIGURES
- LIST OF TABLES
- CHAPTER 1: INTRODUCTION
  - 1.1 Motivation
  - 1.2 Objectives
  - 1.3 Organization of the Report
- CHAPTER 2: Standard Cell Library
  - 2.1 What is a Standard Cell?
  - 2.2 Approaches in Integrated Circuit Design
  - 2.3 Library Specifications
  - 2.4 Drawing Layout Strategies
  - 2.5 Routing Grids and Pins
  - 2.6 Power and Ground Rails
- CHAPTER 3: Characterization of Standard Cells
  - 3.1 Inverter Standard Cells
  - 3.2 NAND Standard Cells
  - 3.3 NOR Standard Cells
  - 3.4 Two Stage Buffer Standard Cells
  - 3.5 Multiplexer Standard Cell
  - 3.6 Half Adder Standard Cells
  - 3.7 D-Latch Standard Cells
  - 3.8 Importance of Local Interconnect
- CHAPTER 4: Sequential Circuits
  - 4.1 Timing Metrics for Sequential Circuits
  - 4.2 Static Latch and Flip-Flop
    - 4.2.1 The Bistability Principle
    - 4.2.2 Static Latch Robustness
  - 4.3 Multiplexer Based Latches
    - 4.3.1 Timing Properties
  - 4.4 Timing Analysis using Cadence
- CHAPTER 5: Efficient and Robust D-latch Design Methodology
  - 5.1 Traditional Design Methodology of D-latch Cell
  - 5.2 New Proposed Design Methodology of D-latch Cell
  - 5.3 Simulation Results
    - 5.3.1 Setup Time calculation of D-latch Cell
    - 5.3.2 Discussion on New Design Methodology
  - 5.4 Performance of New Design D-latch
- CHAPTER 6: CONCLUSION
  - 6.1 Future Scope
- REFERENCES
- APPENDIX - A
- APPENDIX - B

### LIST OF FIGURES

| Figure No. | Title                                                                                              |
|------------|----------------------------------------------------------------------------------------------------|
| 2.1        | The template of the designed cells with power and ground rails                                     |
| 3.1        | CMOS inverter circuit diagram                                                                      |
| 3.2        | CMOS NAND gate circuit diagram                                                                     |
| 3.3        | CMOS NOR gate circuit diagram                                                                      |
| 3.4        | Two stage Buffer                                                                                   |
| 3.5 (a)    | Graph to calculate the value of $\rho$ for buffer_150                                              |
| 3.5 (b)    | Graph to calculate the value of $\rho$ for buffer_300                                              |
| 3.6        | Circuit diagram of 2×1 Multiplexer                                                                 |
| 3.7        | Circuit diagram of half adder                                                                      |
| 3.8        | Static D-latch circuit diagram                                                                     |
| 4.1        | Sequential system                                                                                  |
| 4.2        | Timing constraint of D-latch                                                                       |
| 4.3        | Timing of positive and negative latches                                                            |
| 4.4        | Voltage transfer characteristics of inverter connected back to back                                |
| 4.5        | Voltage transfer characteristics after a small deviation in input                                  |
| 4.6        | Multiplexer type static D-latch                                                                    |
| 4.7 (a)    | Setup time simulation at $T_{D-C} = 55$ psec                                                       |
| 4.7 (b)    | Setup time simulation at $T_{D-C} = 56$ psec                                                       |
| 5.1        | Voltage transfer characteristic with increasing the width of transistor keeping p/n ratio constant |
| 5.2        | D-latch in transparent mode                                                                        |
| 5.3        | D-latch in latch mode                                                                              |
| 5.4        | D-latch in latch mode                                                                              |
| 5.5        | Voltage transfer characteristics of two inverters connected back to back                           |
| 5.6        | Graph to calculate the setup time of D-latch                                                       |
| 5.7        | Graph between no. of fingers and setup time for D-latch                                            |
| 5.8        | Schematic of D-latch                                                                               |
| 5.9        | D-latch in latch mode with noise at Yb node                                                        |

### LIST OF TABLES

| Table No. | Title                                                                                                            |
|-----------|------------------------------------------------------------------------------------------------------------------|
| 3.1       | Delay of inverter extracted from schematic and layout for different sizing                                       |
| 3.2       | Delay of NAND gate extracted from schematic and layout for different sizing                                      |
| 3.3       | Delay of NOR extracted from schematic and layout for different sizing                                            |
| 3.4       | Delay of Buffer extracted from schematic and layout with variation of different load capacitances                |
| 3.5       | Delay of Buffers extracted from schematic and layout for different sizing                                        |
| 3.6       | Delay of Multiplexer extracted from schematic and layout for all combination of input to output                  |
| 3.7       | Delay of Half-Adder extracted from schematic and layout for different sizing                                     |
| 3.8       | Setup time and clock to output (C-Y) delay for D-latch with different sizing extracted from layout and schematic |
| 4.1       | Variation of delay with T<sub>D-C</sub> in D-latch                                                               |
| 5.1       | Percentage reduction in area of latch by new design methodology                                                  |
| 5.2       | Setup time of D-latches with traditional and proposed design methodology                                         |
| 5.3       | Values of voltage at which D-latch flips to another state                                                        |

### Chapter 1

### Introduction

With technology scaling, the impact of local interconnects on circuit performance starts to dominate [1]. Local interconnects introduce a significant delay, which must be taken into account in circuit design. Transistors operate faster as their dimensions are scaled down. The wires on the chip that connect these transistors to form a circuit, however, do not benefit from scaling in the same way. The drive for faster chips with lower cost and greater functionality has made these wires (interconnects) the factor that determines the performance and reliability of a nanometer-scale integrated circuit (IC).

The layout of random logic circuits can be designed using methods such as full-custom design, standard cells or automatic layout generation. Full-custom layouts are extremely dense; however, because of tight time-to-market constraints, this approach is used selectively for the most critical parts of a circuit. The standard-cell approach is currently the most widely used solution in IC design: cells from a library of full-custom cells are automatically placed and routed. However, the effects of interconnects and other parasitics are more significant in standard-cell-based design.

The impact of these local interconnects and other parasitics on sequential circuits is even more severe. One of the basic building blocks of sequential circuits is the D-latch. The effects of local interconnects and parasitics on D-latches are amplified across the entire circuit design, to the extent of circuit malfunction, if they are not controlled or accounted for within the D-latch itself. Hence, a robust and parasitic-tolerant D-latch design is essential.

### 1.1 Motivation

The two most important concerns in VLSI technology are speed and area. With technology scaling, the effect of local interconnects increases significantly [2], especially in sequential circuits. In a standard cell library, the basic memory cell is the D-latch. The delay due to local interconnects has a strong impact on the timing constraints of the D-latch. Also, clock frequency increases with technology scaling [3], which requires reducing D-latch timing constraints such as setup and hold time.

### 1.2 Objectives

The following objectives have been achieved in this thesis:

- ✓ Developed a standard cell library at 90nm
- ✓ Characterized all standard cells of the designed library
- ✓ Studied the impact of local interconnects on standard cells, especially on the D-latch cell
- ✓ Proposed a new methodology to design the D-latch

### 1.3 Organization of the Report

Chapter 2 gives a brief overview of the standard cell library. Chapter 3 shows the importance of local interconnects in circuits at 90nm technology and their effect on standard cells by comparing schematic and layout results. Chapter 4 explains the working of the D-latch and its timing constraints. Chapter 5 presents a new methodology for designing D-latches and compares the performance and robustness of the new design with the conventional D-latch.

### Chapter 2



### **Standard Cell Library**

An extremely powerful concept in VLSI is the standard cell library. Standard cells help create efficient & dense layouts because they are easily abutted during the layout process. Standard cell layout simply means that all standard cells - NAND, NOR, NOT etc. in the design are designed with standard dimensions for heights, widths, actives and wells, and have standard power (vdd!) and ground (gnd!) buses.

Standard-cell-based design has become a mainstream design style for recent VLSIs. Standard cell libraries are getting bigger, containing more than 500 cells, and better performance is expected from the resulting VLSIs. Generating, verifying and maintaining these big libraries, however, takes a lot of time and manpower, and errors may creep into the cell design and cell characterization processes. Moreover, technologies are becoming more diverse and changing rapidly, so a cell library must be generated from scratch more frequently.

### 2.1 What is a Standard Cell?

A standard cell consists of a set of transistors and their connections which implement a Boolean logic or storage function. Although it is possible to generate any Boolean function using only NAND (or NOR) gates, the design will be more area efficient if other logic gates are also included in the library. Elementary gates such as Buffer, Inverter, NAND, NOR and memory cells are found in almost any standard library, while richer libraries contain additional gates of higher complexity such as adders and multipliers.

The initial design of a standard cell begins with implementing the functionality of the cell at the transistor level. The schematic view of a cell is used for this purpose. In addition, schematic views are widely used for simulating and debugging the circuits. The schematic of a cell can be represented by symbol view which consists of the input and output ports of the cell as well as some text information.

Standard cell libraries contain another view which is called layout. Designing the layout view of a cell is compulsory since the netlist is useful for simulation purposes and not for fabrication. The layout of a cell represents what will be physically placed on a chip. Each layout consists of several base layers which form the structures of the transistors and interconnect lines. Designing area efficient layouts which could meet the required power and timing constraints is still a challenging task despite the existence of different CAD tools to aid the process of design.

The designed cell layouts must be checked to ensure that no design rules are violated (Design Rule Check, DRC). The layout must then pass a Layout Versus Schematic (LVS) check, which verifies that the layout matches the corresponding schematic. After the LVS check passes, the parasitics can be extracted and post-layout simulation performed.

### 2.2 Approaches in Integrated Circuit Design

The way that an integrated circuit is constructed depends on the constraints to fulfill. There are three approaches to create a digital integrated circuit. The first approach, *Full-Custom Design*, is when the designer plans the layout manually. In this approach, each transistor in design is sized and optimized manually to meet the desired constraints. The advantages of this approach include a compact area, performance improvements as well as the ability to include various components such as microprocessors or analog components. Obviously, manufacturing time and cost will increase and a higher skill will be required on the part of the design team as well. As a result, this method is suitable for the designs with strict requirements to fulfill.

The second approach, *Semi-Custom Design*, is when the designer uses already designed logic blocks from a cell library to construct the circuit. In this approach, the desired functionality is realized by placing a set of simple or even complex logic blocks (instead of transistors and interconnects, as in the full-custom approach) over and over again in the layout. The main advantage of semi-custom over full-custom design is that the time required to develop a circuit is decreased dramatically. The drawback of the semi-custom approach is the loss of control over the layout and over the characteristics of the gates, since pre-developed logic blocks are used in the design.

In contrast with full-custom method, the chips designed using the semi-custom method are cheaper in small production volume, but more expensive in high production volume. This makes semi-custom design very appropriate for debugging and prototyping new designs.

A good approach in chip design is using a combination of full and semi-custom design methods, where the designer creates and optimizes logical blocks manually and then uses them in the layout instead of using pre-designed blocks from a library.

The third approach, *Automatic Design*, is when a CAD tool creates the layouts automatically and uses standard library cells to realize the circuit. The design is described in a high-level hardware description language such as Verilog or VHDL. The high-level description is then fed to the tool to create the corresponding layout. The CAD tools are able to optimize the generated layouts to meet the desired constraints. Although this method is the fastest way of realizing a circuit layout, it suffers from less optimized layouts and a loss of control over the way the layout is generated.

### 2.3 Library Specifications

The library has been implemented in a standard 90nm CMOS technology. A standard cell library usually contains at least NOT, NAND, NOR and D-FF cells so that different logic functions can be implemented without difficulty. The designed cell library contains 44 cells of the following elementary gate types: BUFFER, INVERTER, D-FF, D-LATCH, MUX, NAND, NOR and HALF-ADDER. The cells in the library (except the HALF-ADDER) come with different driving strengths.

The height of the layout of each cell in the library is fixed and equal to $4.05\,\mu\text{m}$. The width of the cell layouts varies between $1.8\,\mu\text{m}$ and $36.905\,\mu\text{m}$, for the INV and DFF gates respectively. The width of each cell must be an integer multiple of the horizontal grid spacing. Since any increase in width must be a multiple of the vertical grid ($0.15\,\mu\text{m}$), a large area may remain unoccupied. The differential pins also occupy a relatively large area, since they must be placed on the intersections of adjacent grids.
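The grid constraint on cell width can be illustrated with a short sketch. It assumes that a cell's drawn content is rounded up to the next multiple of the $0.15\,\mu\text{m}$ grid quoted above; the function names and the example content widths are hypothetical and are not taken from the library itself. Dimensions are kept in integer nanometres to avoid floating-point rounding.

```python
GRID_NM = 150          # 0.15 um grid spacing from the text, expressed in nm
CELL_HEIGHT_NM = 4050  # fixed cell height of 4.05 um from the text, in nm


def snap_width_to_grid(content_width_nm: int, grid_nm: int = GRID_NM) -> int:
    """Round a required width up to the next integer multiple of the grid."""
    # Integer ceiling division: -(-a // b) equals ceil(a / b) for positive ints.
    return -(-content_width_nm // grid_nm) * grid_nm


def unoccupied_area_nm2(content_width_nm: int) -> int:
    """Area left empty because the cell width had to be snapped to the grid."""
    snapped = snap_width_to_grid(content_width_nm)
    return (snapped - content_width_nm) * CELL_HEIGHT_NM


# Hypothetical content widths (nm), chosen only to illustrate the rounding.
for width in (1800, 1900, 2260):
    print(width, snap_width_to_grid(width), unoccupied_area_nm2(width))
```

A width that already sits on the grid (1800 nm here) is left unchanged, while any other width grows to the next grid line and the difference becomes empty cell area.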


### 2.4 Drawing Layout Strategies

All of the cells in the library have power and ground pins in common and hence the corresponding rails must be placed in the same place in the layout of the cells.

In addition, the connections near the borders of the cells are spaced at least $0.15\,\mu\text{m}$ from the boundary to prevent DRC errors when the cells are placed side by side. The body contacts of the PMOS devices should be spaced at least $0.24\,\mu\text{m}$ from the borders. The pins must be placed on the intersections of the grids. Placing the pins of a cell on different horizontal or vertical routing grids eases routing of the pins. It is also beneficial to make the connections using only the METAL1 and poly layers, which generates fewer blockages. Sometimes it is not possible to draw all of the connections using these layers because of the high complexity of the connections. In this case, making the connections on the preferred grids using the METAL2 layer will generate fewer blockages.

### 2.5 Routing Grids and Pins

Routing grids are used to route the pins over the cells. In general, it is important to choose the grid spacing for the different routing layers properly, to simplify routing and to avoid errors [4]. The grid spacing is chosen as $0.15\,\mu\text{m}$ in the layout view of the cells.

### 2.6 Power and Ground Rails

The differential pins need to be placed on the intersection of grids but power and ground pins are abutment pins and do not need to be placed on the intersections. This means that the power and ground rails are drawn in the layout such that they are automatically connected by placing the cells side-by-side. It's important that the metal contacts are placed properly on these shared rails to prevent DRC errors after placement. To do this, the metal contacts are placed symmetrically on the intersection of the grids. The remaining space between the contact and the adjacent grids should be filled with metal layers. This is to prevent DRC errors if the contact of another cell is placed on the same or adjacent grid.



Figure 2.1: The template of the designed cell with power and ground rails

In Figure 2.1, a layer called prBoundary is shown, which determines the effective boundary of the cell. The boundary of each cell is smaller than the overall cell layout since the power and ground rails are shared amongst adjacent cells. In the next chapter, the design of a CMOS buffer/inverter and other standard cells is discussed in detail.


### Chapter 3



### Characterization of Standard Cells

We have developed a standard cell library containing 44 cells. This library covers almost all the basic gates (NOT, NAND, NOR, Buffer, MUX, Half Adder, D-latch) which should be present in a standard cell library. Every standard cell has been sized for minimum delay, equal rise/fall transition times and minimum area. Using this library we can design any circuit. All designed standard cells are characterized for the propagation delay $T_P$. All outputs are measured with a load capacitance of 15 fF, and all input signals have a rise/fall time of 5 ps.
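As an illustration of how a propagation delay can be read off simulated waveforms, the following is a minimal sketch. It assumes the common convention that delay is measured between the 50% supply crossings of input and output; the text does not state the measurement thresholds used in this work, so this convention, the function names and the toy waveform are all assumptions made for illustration.

```python
from typing import Sequence


def crossing_time(t: Sequence[float], v: Sequence[float],
                  level: float, rising: bool) -> float:
    """First time the sampled waveform crosses `level` in the given direction,
    found by linear interpolation between neighbouring samples."""
    for i in range(1, len(t)):
        v0, v1 = v[i - 1], v[i]
        crosses_up = rising and (v0 < level <= v1)
        crosses_down = (not rising) and (v0 > level >= v1)
        if crosses_up or crosses_down:
            frac = (level - v0) / (v1 - v0)
            return t[i - 1] + frac * (t[i] - t[i - 1])
    raise ValueError("waveform never crosses the requested level")


def propagation_delay(t, vin, vout, vdd, input_rising, inverting=True):
    """Delay between the 50% crossing of the input and that of the output."""
    half = vdd / 2
    output_rising = (not input_rising) if inverting else input_rising
    t_in = crossing_time(t, vin, half, input_rising)
    t_out = crossing_time(t, vout, half, output_rising)
    return t_out - t_in


# Toy inverter waveform (times in ps, voltages normalised to vdd = 1.0):
# the input rises in 5 ps, the output falls linearly between 50 ps and 150 ps.
time_ps = [0.0, 5.0, 50.0, 150.0]
v_in = [0.0, 1.0, 1.0, 1.0]
v_out = [1.0, 1.0, 1.0, 0.0]
tp_hl = propagation_delay(time_ps, v_in, v_out, vdd=1.0, input_rising=True)
print(f"Tp_HL = {tp_hl} ps")
```

For a non-inverting cell such as the buffer, `inverting=False` gives the rising-to-rising and falling-to-falling delays listed as Tp_r and Tp_f in the list of symbols.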

#### 3.1 Inverter Standard Cells

We have designed standard cells to keep rise and fall transition times equal. For this, we keep the ratio of PMOS device width  $W_p$  and NMOS device width  $W_n$  constant at  $W_p/W_n=2$  in all standard cells.



Figure 3.1: CMOS inverter circuit diagram

As Table 3.1 shows, the delays extracted from the schematic and from the layout differ because of local interconnects.

|       | Schematic  |            |         | Layout     |            |         |  |
|-------|------------|------------|---------|------------|------------|---------|--|
| Logic | Tp_HL (ps) | Tp_LH (ps) | Tp (ps) | Tp_HL (ps) | Tp_LH (ps) | Tp (ps) |  |
| 150   | 88.36      | 99.5       | 93.94   | 109.6      | 120.6      | 115.1   |  |
| 200   | 67.19      | 75.9       | 71.53   | 83.19      | 91.45      | 87.32   |  |
| 250   | 54.25      | 61.6       | 57.92   | 67.14      | 73.93      | 70.54   |  |
| 300   | 46.35      | 52.33      | 49.34   | 56.87      | 63.37      | 60.12   |  |
| 350   | 40.56      | 45.7       | 43.14   | 49.39      | 54.43      | 51.9    |  |
| 400   | 36.16      | 40.5       | 38.33   | 43.79      | 48.98      | 46.38   |  |
| 500   | 30.03      | 34         | 32.01   | 36.06      | 39.97      | 38.02   |  |
| 600   | 25.97      | 29.1       | 27.53   | 31.33      | 34.29      | 32.81   |  |
| 800   | 21.32      | 23.8       | 22.54   | 24.85      | 26.66      | 25.76   |  |
| 1000  | 18.35      | 20.6       | 19.47   | 20.61      | 22.46      | 21.54   |  |

Table 3.1: Delay of inverter extracted from schematic and layout for different sizing
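
The gap between schematic and layout can be expressed as a relative overhead. The following is a minimal sketch that uses only the Tp columns of Table 3.1; the function and variable names are illustrative and not part of the original flow.

```python
# Average propagation delays Tp (ps) copied from Table 3.1,
# keyed by the size given in the first column of the table.
schematic_tp = {150: 93.94, 200: 71.53, 250: 57.92, 300: 49.34, 350: 43.14,
                400: 38.33, 500: 32.01, 600: 27.53, 800: 22.54, 1000: 19.47}
layout_tp = {150: 115.1, 200: 87.32, 250: 70.54, 300: 60.12, 350: 51.9,
             400: 46.38, 500: 38.02, 600: 32.81, 800: 25.76, 1000: 21.54}


def layout_overhead_percent(schematic, layout):
    """Return the delay increase of layout over schematic, in percent, per size."""
    return {size: 100.0 * (layout[size] - schematic[size]) / schematic[size]
            for size in schematic}


overhead = layout_overhead_percent(schematic_tp, layout_tp)
for size, pct in overhead.items():
    print(f"size {size}: layout is {pct:.1f}% slower than schematic")
```

For these values the overhead is largest for the smallest inverter and shrinks as the inverter is made wider.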

#### 3.2 NAND Standard cells

We have designed the two-input NAND gate with appropriate sizing for equal output transition (rise/fall) times.



Figure 3.2: CMOS NAND gate circuit diagram

Let the equivalent inverter of this NAND gate have an NMOS width equal to w; the PMOS width should then be 2w to keep the $W_p/W_n$ ratio equal to 2 [5]. Because transistors M1 and M2 (as shown in Figure 3.2) are in series in the NAND gate, we have to take a width of 2w for M1 and M2 to keep the same resistance as the equivalent inverter, whereas the width of transistors M3 and M4 remains the same as for the equivalent inverter PMOS (2w). The width of every transistor in the NAND gate is therefore double that of the equivalent inverter NMOS (w).

|       | Layout     |            |         | Schematic  |            |         |  |
|-------|------------|------------|---------|------------|------------|---------|--|
| Logic | Tp_HL (ps) | Tp_LH (ps) | Tp (ps) | Tp_HL (ps) | Tp_LH (ps) | Tp (ps) |  |
| 150   | 108.1      | 117.2      | 112.6   | 80.45      | 106.7      | 93.575  |  |
| 200   | 83.32      | 90.57      | 86.95   | 62.49      | 82.92      | 72.705  |  |
| 250   | 68.84      | 74.7       | 71.77   | 51.86      | 69.24      | 60.55   |  |
| 300   | 58.42      | 64.26      | 61.34   | 44.72      | 59.39      | 52.055  |  |
| 400   | 46         | 51.09      | 48.54   | 36.1       | 47.82      | 41.96   |  |
| 500   | 39.61      | 45.26      | 42.44   | 36.41      | 40.67      | 38.54   |  |
| 600   | 34.53      | 39.9       | 37.22   | 27.03      | 36.27      | 31.65   |  |

Table 3.2: Delay of NAND gate extracted from schematic and layout for different sizing

We can see from Table 3.2 that there are significant differences between the schematic and layout delays. This shows that the impact of local interconnect on the circuit is significant and should be taken into account at design time. Also, as the size of the gate increases, the delay decreases in roughly the same proportion.

### 3.3 NOR Standard Cells

In the NOR gate the two transistors in the pull-up network (PUN) are in series, so for equal transition times we have to double the width of M3 and M4 (4w) compared with the equivalent inverter PMOS transistor (2w), whereas M1 and M2 in the pull-down network (PDN) are in parallel and keep the same width as the equivalent inverter NMOS (w).
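
The sizing rules stated for the NAND and NOR gates can be written as a small helper. This is a first-order sketch that assumes the on-resistance of a transistor is inversely proportional to its width, so a series stack of n devices is widened n times; the function names are illustrative only.

```python
def nand_widths(n_inputs, w):
    """Widths for an n-input NAND matched to an inverter with NMOS w and PMOS 2w.

    The NMOS devices are in series, so each is widened by the stack depth.
    The PMOS devices are in parallel and keep the inverter PMOS width.
    """
    return {"nmos": n_inputs * w, "pmos": 2 * w}


def nor_widths(n_inputs, w):
    """Widths for an n-input NOR matched to an inverter with NMOS w and PMOS 2w.

    The PMOS devices are in series, so each is widened by the stack depth.
    The NMOS devices are in parallel and keep the inverter NMOS width.
    """
    return {"nmos": w, "pmos": 2 * w * n_inputs}


# Two-input gates matched to a 150 nm equivalent inverter.
print(nand_widths(2, 150))  # NMOS and PMOS both 2w
print(nor_widths(2, 150))   # NMOS w, PMOS 4w
```

The NOR gate needs a total width of 2(w) + 2(4w) = 10w against 2(2w) + 2(2w) = 8w for the NAND gate, which is consistent with the larger NOR area.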



Figure 3.3: CMOS NOR gate circuit diagram

|       | Layout     |            |           | Schematic  |            |         |
|-------|------------|------------|-----------|------------|------------|---------|
| Logic | Tp_HL (ps) | Tp_LH (ps) | Tp (ps)   | Tp_HL (ps) | Tp_LH (ps) | Tp (ps) |
| 150   | 131.7      | 103.9      | 117.8     | 92.48      | 92.7       | 92.59   |
| 200   | 102.5      | 80.55      | 91.525    | 72.25      | 77.41      | 74.83   |
| 250   | 86.99      | 68.42      | 77.705    | 58.97      | 58.98      | 58.97   |
| 300   | 75.52      | 59.05      | 67.285    | 51.08      | 50.1       | 50.59   |
| 400   | 60.54      | 47.11      | 53.825    | 40.45      | 40.22      | 40.33   |
| 500   | 51.92      | 41.39      | 46.655    | 34.46      | 33.55      | 34.00   |
| 600   | 46.97      | 37.31      | 42.14     | 30.25      | 30.04      | 30.14   |

Table 3.3: Delay of NOR gate extracted from schematic and layout for different sizing

The output transition times in the schematic are almost equal because of the appropriate sizing of the PUN and PDN. Interconnect and fingering effects are clearly visible in the layout, where the output transition times are unequal. Comparing the NAND and NOR gates with the same equivalent inverter sizing, the layout delays are about the same, but the NOR gate occupies a larger area than the NAND gate.

Since PMOS devices have a lower mobility than NMOS devices, stacking PMOS devices in series should be avoided as much as possible. A NAND implementation is clearly preferred over a NOR implementation for generic logic [5].

#### 3.4 Buffer Standard Cells

We have designed two-stage buffers for inverter sizes of 150 nm and 300 nm width using the method of logical effort [6]. The first inverter is selected with minimum size (150 nm), and we calculate the width of the second inverter that gives minimum delay for the buffer. The width of the second inverter is $\rho$ times that of the first inverter, where $\rho$ is the best step-up ratio, that is, the ratio of the sizes of successive inverters in a string of inverters designed to drive a large capacitive load.



Figure 3.4: Two stage Buffer

To calculate the value of $\rho$ we plot the delay of the first inverter against $C_L/C_{in}$, which gives a straight line following the equation below [6].

$$d = \tau (gh + P_{inv})$$
(3.1)

Equation 3.1 gives the delay of a logic gate in terms of the logical effort g, the electrical effort h, and the parasitic delay $P_{inv}$. The process parameter $\tau$ represents the speed of the basic transistor. For the inverter, the logical effort g is 1. By comparing the straight-line equation from the graph with Equation 3.1, we get the value of $P_{inv}$. For minimum delay, the value of $\rho$ is then calculated from the equation [6]

$$\rho = 0.71 P_{inv} + 2.82$$
 (3.2)
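
Equations 3.1 and 3.2 can be combined into a short calculation: fit a straight line to delay versus $h = C_L/C_{in}$, read $\tau$ from the slope and $P_{inv}$ from the intercept, then evaluate $\rho$. The sketch below assumes an ordinary least-squares fit and uses made-up data generated from $d = 10(h + 0.5)$ purely for illustration; it does not reproduce the measured points of this work.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def best_stage_ratio(h_values, delays_ps, g=1.0):
    """Extract tau and P_inv from d = tau * (g*h + P_inv), then apply Equation 3.2."""
    slope, intercept = fit_line(h_values, delays_ps)
    tau = slope / g            # slope of the line is tau * g
    p_inv = intercept / tau    # intercept of the line is tau * P_inv
    rho = 0.71 * p_inv + 2.82  # Equation 3.2
    return tau, p_inv, rho


# Illustrative data only, generated from d = 10 * (h + 0.5) ps.
h_values = [2, 3, 5, 9]
delays_ps = [25.0, 35.0, 55.0, 95.0]

tau, p_inv, rho = best_stage_ratio(h_values, delays_ps)
print(f"tau = {tau:.2f} ps, P_inv = {p_inv:.3f}, rho = {rho:.3f}")
```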

Table 3.4: Delay of Buffer extracted from schematic and layout with variation of different load capacitances

|              |            | Buffer_150 |         | Buffer_300 |            |         |  |
|--------------|------------|------------|---------|------------|------------|---------|--|
| $C_L/C_{in}$ | Tp_HL (ps) | Tp_LH (ps) | Tp (ps) | Tp_HL (ps) | Tp_LH (ps) | Tp (ps) |  |
| 2            | 20.13      | 23.5       | 21.81   | 19.72      | 23.69      | 21.70   |  |
| 3            | 27.98      | 33.27      | 30.62   | 27.1       | 33.26      | 30.18   |  |
| 5            | 44.08      | 53.64      | 48.86   | 42.3       | 53.28      | 47.79   |  |
| 9            | 76.61      | 95.51      | 86.06   | 73.04      | 94.43      | 83.73   |  |

Now, we calculate values of  $P_{inv}$  for both the buffers from graph and substitute value of  $P_{inv}$  in equation 3.2 to evaluate the value of  $\rho$ .



Figure 3.5: Graph to calculate the value of  $\rho$ 

Values obtained after calculation -

For Buffer\_150  $\rho$=3.22, so the sizes of the NMOS and PMOS in the second inverter are 500nm and 1000nm respectively.

For Buffer\_300  $\rho$=3.08, so the sizes of the NMOS and PMOS in the second inverter are 900nm and 1800nm respectively.
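
As a quick check of these sizes, the second-inverter NMOS width is the first-inverter NMOS width multiplied by $\rho$, and the PMOS width is twice the NMOS width. The symbols $W_{n1}$, $W_{n2}$ and $W_{p2}$ are introduced here only for this check, and reading the final values as a rounding to convenient drawn widths is an assumption.

$$W_{n2} = \rho\, W_{n1} = 3.22 \times 150\,\text{nm} \approx 483\,\text{nm} \rightarrow 500\,\text{nm}, \qquad W_{p2} = 2 W_{n2} = 1000\,\text{nm}$$

$$W_{n2} = \rho\, W_{n1} = 3.08 \times 300\,\text{nm} \approx 924\,\text{nm} \rightarrow 900\,\text{nm}, \qquad W_{p2} = 2 W_{n2} = 1800\,\text{nm}$$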

Table 3.5: Delay of Buffer extracted from schematic and layout for different sizing

|       | Schematic |           |         | Layout    |           |         |  |
|-------|-----------|-----------|---------|-----------|-----------|---------|--|
| Logic | Tp_r (ps) | Tp_f (ps) | Tp (ps) | Tp_r (ps) | Tp_f (ps) | Tp (ps) |  |
| 150   | 63.99     | 63.57     | 63.78   | 74.2      | 80.77     | 77.48   |  |
| 300   | 48.45     | 48.66     | 48.55   | 54.64     | 58.37     | 56.50   |  |

Buffer\_150 is the minimum-size buffer designed for minimum delay. We observe that, even though PMOS is weaker than NMOS, Tp\_r (the delay from a rising input to a rising output) is smaller than Tp\_f (the delay from a falling input to a falling output) in the layout. For Tp\_r, the first-inverter NMOS and the second-inverter PMOS conduct, whereas for Tp\_f, the first-inverter PMOS and the second-inverter NMOS conduct. The PMOS is the weaker device in our inverters, and it is weakest in the first inverter because of its smaller width, so the output of the first inverter rises slowly when the input is falling. For this reason, Tp\_f is larger than Tp\_r in the buffer layout. Table 3.5 also shows the effect of interconnect when the Tp values of layout and schematic are compared.

#### 3.5 Multiplexer Standard Cell

This library includes a $2 \times 1$ multiplexer cell using transmission gates. The most widely used solution to the voltage-drop problem is the use of transmission gates. It builds on the complementary properties of NMOS and PMOS transistors: NMOS devices pass a strong 0 but a weak 1, while PMOS transistors pass a strong 1 but a weak 0. The ideal approach is to use an NMOS to pull down and a PMOS to pull up. The transmission gate combines the best of both device flavors by placing an NMOS device in parallel with a PMOS device. Transmission gates can be used to build some complex gates very efficiently.

The PMOS and NMOS transistors in the transmission gate can be equal in width because both transistors operate in parallel while driving the output. To drive the multiplexer, one inverter is added at the input and another at the output. The input inverter is of minimum size, and the output inverter takes the size determined for the buffer, which gives minimum delay for the multiplexer.



Figure 3.6: Circuit diagram of 2×1 Multiplexer

Table 3.6: Delay of Multiplexer extracted from schematic and layout for all combination of input to output

|       | Layout    |           |         | Schematic |           |         |  |
|-------|-----------|-----------|---------|-----------|-----------|---------|--|
| Logic | Tp_f (ps) | Tp_r (ps) | Tp (ps) | Tp_r (ps) | Tp_f (ps) | Tp (ps) |  |
| Mux_a | 106.8     | 117.5     | 112.15  | 89.76     | 91.91     | 90.835  |  |
| Mux_b | 106.6     | 117       | 111.8   | 89.69     | 91.91     | 90.8    |  |

Table 3.6 shows the effect of interconnects on propagation delay in schematic and layout. We also observe that, in the layout, the delays from input a to output and from input b to output are the same because of the symmetric layout.

### 3.6 Half Adder Standard Cell

This library includes a half adder cell, which takes two inputs a and b and gives the outputs sum and carry.

$$SUM = a \oplus b$$
$$CARRY = a \cdot b = \overline{\overline{a} + \overline{b}}$$
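
The two output equations can be checked exhaustively over the four input combinations. The sketch below is a behavioral check only, with illustrative function names; it confirms that the carry computed as an AND equals the NOR of the complemented inputs.

```python
def half_adder(a, b):
    """Return (sum, carry) for one-bit inputs a and b."""
    return a ^ b, a & b


def carry_via_nor(a, b):
    """Carry computed as the NOR of the complemented inputs (De Morgan form)."""
    not_a, not_b = 1 - a, 1 - b
    return 1 - (not_a | not_b)


for a in (0, 1):
    for b in (0, 1):
        s, c = half_adder(a, b)
        # Both forms of the carry must agree for every input combination.
        assert c == carry_via_nor(a, b)
        print(f"a={a} b={b} sum={s} carry={c}")
```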



Figure 3.7: Circuit diagram of half adder

Table 3.7: Delay of Half-Adder extracted for layout and schematic for all combination of input to output

|         | Layout    |           |         | Schematic |           |         |  |
|---------|-----------|-----------|---------|-----------|-----------|---------|--|
| Logic   | Tp_f (ps) | Tp_r (ps) | Tp (ps) | Tp_r (ps) | Tp_f (ps) | Tp (ps) |  |
| Sum_a   | 130.1     | 138.5     | 134.3   | 113.6     | 102.2     | 107.9   |  |
| Sum_b   | 128.4     | 136       | 132.2   | 113.3     | 102.2     | 107.7   |  |
| Carry_a | 126.8     | 155.5     | 141.6   | 107.2     | 110.6     | 108.9   |  |
| Carry_b | 122.2     | 150.2     | 136.2   | 112       | 116.3     | 114.1   |  |

We measure delay in the half adder by holding one input terminal constant at 0 and applying a square wave with a 2 ns period and 5 ps rise/fall time to the other input; this gives the propagation delay for sum. Applying 1 instead of 0 at the constant input terminal, with the same square wave at the second input terminal, gives the delay for carry. In the circuit, all transistors are sized for the same propagation delay. In the EXOR gate the selected NMOS size is 300 nm and the PMOS size is 600 nm. The NMOS and PMOS sizes selected for the NOR gate are 150 nm and 600 nm respectively. In Table 3.7 the difference in propagation delay between layout and schematic is due to local interconnect effects, but the propagation delays from inputs a and b to sum are approximately the same because of the symmetric layout.

### 3.7 D-Latch Standard Cells

The most robust and common technique to build a latch involves the use of a transmission-gate multiplexer. Figure 3.8 shows the implementation of a positive static D-latch based on a multiplexer.

The library contains two D-latches of different sizes. In dlatch\_250, the NMOS size in inverters 1, 2 and 3 is 250nm (the PMOS size is double the NMOS size) and transmission gates 1 and 2 have a size of 250nm (NMOS and PMOS have the same size). In dlatch\_400, the NMOS size in inverters 1, 2 and 3 is 400nm (the PMOS size is double the NMOS size) and transmission gates 1 and 2 have a size of 400nm (NMOS and PMOS have the same size).



Figure 3.8: Static D-latch circuit diagram

Table 3.8: Setup time and clock to output (C-Y) delay for D-latch with different sizing extracted from layout and schematic

|         | Schematic       |                | Layout          |                |  |
|---------|-----------------|----------------|-----------------|----------------|--|
| d_latch | Setup time (ps) | C-Y Delay (ps) | Setup time (ps) | C-Y Delay (ps) |  |
| 400     | 80              | 72.75          | 90             | 81.2           |  |
| 250     | 105             | 96             | 115            | 110            |  |

We have measured the setup time and C-Y delay for a square-wave clock with a period of 1 ns and a rise/fall time of 5 ps, with the same rise/fall time on the D input.

In Table 3.8, the values of setup time and clock-to-output delay (C-Y delay) are shown for layout and schematic. Comparing the schematic and layout timing constraints shows the importance of interconnect in latches: there is a significant change in setup time and C-Y delay, which can cause a sequential circuit to malfunction. So we have to account for local interconnect in latches and flip-flops as technology scales, to ensure correct functioning of sequential circuits.

### 3.8 Importance of Local Interconnect

We have characterized all standard cells of the library in schematic and layout. We have seen a major difference between layout and schematic in all compared parameters. This difference is due to local interconnect and parasitics, which add extra delay to our circuits. The extra delay added by local interconnect has a more severe effect in sequential circuits (as we have seen in the characterization of the D-latch cells). A wrong calculation of setup time can cause a circuit to malfunction. In the next chapter we explain the basic building blocks of sequential circuits and their timing constraints.

### Chapter

# 4

### Sequential Circuits

In the same way that gates are the building blocks of combinatorial circuits, latches and flip-flops are the building blocks of sequential circuits. Both latches and flip-flops (FF) are circuit elements whose output depends not only on the current inputs, but also on previous inputs and outputs. The difference between a latch and a flip-flop is that a latch is level sensitive, whereas a flip-flop is edge sensitive.

Figure 4.1 shows a block diagram of a generic FSM that consists of combinational logic and flip-flops, which hold the system state. The system depicted here belongs to the class of synchronous sequential systems, in which all flip-flops are under the control of a clock signal. The outputs of the FSM are a function of the current input and the current state [7]. The next state is determined based on the current state and the current inputs and is fed to the inputs of the flip-flops.



Figure 4.1: Sequential system

On the rising edge of the clock, the next state bits are copied to the output of the flip-flops (after a propagation delay), and a new cycle begins. The flip-flop then ignores changes in the input signals until the next rising edge. In general, a flip-flop can be positive edge triggered (where the input data is copied on the rising edge of the clock) or negative edge triggered (where the input data is copied on the falling edge of the clock) [7].

This chapter discusses the CMOS implementation of the most important sequential building blocks.

#### 4.1 Timing Metrics for Sequential Circuits

There are three important timing parameters associated with a latch/FF. They are shown in Figure 4.2. The setup time $(t_{su})$ is the time that the data input (D) must be valid before the clock transition [8]. The hold time $(t_{hold})$ is the time the data input must remain valid after a clock edge [8]. Assuming that the setup and hold times are met, the data at the D input is copied to the Q output after a worst case propagation delay (with reference to the clock edge) denoted by $t_{c-y}$.



Figure 4.2: Timing constraint of D-latch

Once we know the timing information for the latch/FF and the combinational blocks, we can derive the system-level timing constraints (as shown in figure 4.1). In sequential circuits, switching events take place concurrently in response to a clock stimulus. Results of operations await the next clock transition before progressing to the next stage. In other words, the next cycle cannot begin unless all current computations have completed and the system has come to rest. The clock period T, at which the sequential circuit operates, must thus accommodate the longest delay of any stage in the network.
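
For an edge-triggered system, the usual form of this constraint is that the clock period must cover the clock-to-output delay, the worst-case combinational delay and the setup time of the receiving element. The sketch below assumes that form; it borrows the layout values of the 400 nm D-latch from Table 3.8 as representative numbers, and the combinational delay is a hypothetical value chosen only for illustration.

```python
def min_clock_period_ps(t_cy_ps, t_logic_max_ps, t_su_ps):
    """Smallest clock period that lets data launched at one clock edge
    meet the setup time of the next storage element."""
    return t_cy_ps + t_logic_max_ps + t_su_ps


# Layout values for the 400 nm D-latch from Table 3.8.
T_CY_PS = 81.2
T_SU_PS = 90.0
# Hypothetical worst-case combinational delay, for illustration only.
T_LOGIC_MAX_PS = 300.0

period_ps = min_clock_period_ps(T_CY_PS, T_LOGIC_MAX_PS, T_SU_PS)
max_freq_ghz = 1000.0 / period_ps  # 1000 ps per ns, so 1/ps-period gives GHz
print(f"minimum period = {period_ps:.1f} ps, maximum frequency = {max_freq_ghz:.2f} GHz")
```

Any error in the setup time or clock-to-output delay therefore shifts the achievable clock period by the same amount.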

#### 4.2 Static Latch and Flip-Flop

A latch is an essential component in the construction of an edge-triggered FF. It is a level-sensitive circuit that passes the D input to the Y output when the clock signal is high. The latch is then said to be in transparent mode. When the clock is low, the input data sampled on the falling edge of the clock is held stable at the output for the entire phase, and the latch is in hold mode. The inputs must be stable for a short period around the falling edge of the clock to meet set-up and hold requirements. A latch operating under the above conditions is a positive latch. Similarly, a negative latch passes the D input to the Y output when the clock signal is low. The signal waveforms for a positive and negative latch are shown in Figure 4.3.

Contrary to level-sensitive latches, edge-triggered FFs only sample the input on a clock transition: 0-to-1 for a positive edge-triggered flip-flop, and 1-to-0 for a negative edge-triggered flip-flop. They are typically built using the latch primitives of Figure 4.3. A most-often recurring configuration is the master-slave structure that cascades a positive and a negative latch.



Figure 4.3: Timing of positive and negative latches[5]

### 4.2.1 The Bistability Principle

Static memories use positive feedback to create a bistable circuit, that is, a circuit having two stable states that represent 0 and 1. The basic idea is shown in Figure 4.4(a), which shows two inverters connected in cascade along with a voltage-transfer characteristic typical of such a circuit. Also plotted are the voltage transfer characteristics (VTCs) of the first inverter, that is, Vo1 versus Vi1, and of the second inverter, that is, Vo2 versus Vo1. The latter plot is rotated to accentuate that Vi2 = Vo1. Assume now that the output of the second inverter Vo2 is connected to the input of the first Vi1, as shown by the dotted lines in Figure 4.4(a). The resulting circuit has only three possible operation points (*A*, *B*, and *C*), as demonstrated on the combined VTC in Figure 4.4(b).



Figure 4.4: Voltage transfer characteristics of inverter connected back to back

Under the condition that the gain of the inverter in the transient region is larger than 1, only A and B are stable operation points, and C is a metastable operation point[9].

#### 4.2.2 Static Latch robustness

Suppose that the cross-coupled inverter pair is biased at point C. A small deviation from this bias point, possibly caused by noise, is amplified and regenerated around the circuit loop. This is a consequence of the gain around the loop being larger than 1. The effect is demonstrated in Figure 4.5(a). A small deviation d is applied to Vi1 (biased at C). This deviation is amplified by the gain of the inverter. The enlarged divergence is applied to the second inverter and amplified once more. The bias point moves away from C until one of the operation points A or B is reached. In conclusion, C is an unstable operation point. Every deviation (even the smallest one) causes the operation point to run away from its original bias.



Figure 4.5: Voltage transfer characteristics after a small deviation in input

The chance is indeed very small that the cross-coupled inverter pair is biased at C and stays there. Operation points with this property are termed meta-stable[9].

On the other hand, A and B are stable operation points, as demonstrated in Figure 4.5b. In these points, the loop gain is much smaller than unity. Even a rather large deviation from the operation point is reduced in size and disappears.

Hence the cross-coupling of two inverters results in a bistable circuit, that is, a circuit with two stable states, each corresponding to a logic state. The circuit serves as a memory, storing either a 1 or a 0 corresponding to positions A and B.
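
The regeneration described above can be illustrated numerically by passing a voltage repeatedly around the two-inverter loop. The sketch below uses an idealized tanh-shaped inverter characteristic with a switching threshold at VDD/2; the model, the value of `GAIN` and the function names are assumptions made for illustration and are not extracted from the designed cells.

```python
import math

VDD = 1.0
GAIN = 10.0  # hypothetical steepness of the idealized inverter characteristic


def inverter_vtc(v_in):
    """Idealized inverter VTC: smooth, symmetric, threshold at VDD/2."""
    return 0.5 * VDD * (1.0 - math.tanh(GAIN * (v_in - 0.5 * VDD)))


def settle(v_i1, trips=50):
    """Pass v_i1 around the cross-coupled loop (two inversions per trip)."""
    for _ in range(trips):
        v_i1 = inverter_vtc(inverter_vtc(v_i1))
    return v_i1


print(settle(0.5))    # exactly at C: the loop stays at the metastable point
print(settle(0.501))  # a small positive deviation regenerates towards VDD
print(settle(0.499))  # a small negative deviation regenerates towards 0
```

In this model the loop gain at C is larger than 1, so any deviation grows, while near the two end states the gain is much smaller than 1 and deviations die out.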

In order to change the stored value, we must be able to bring the circuit from state A to B and vice-versa. Since the precondition for stability is that the loop gain is smaller than unity, we can achieve this by making A (or B) temporarily unstable by increasing the gain to a value larger than 1. This is generally done by applying a trigger pulse at Vi1 or Vi2. For instance, assume that the system is in position A (Vi1 = 0, Vi2 = 1). Forcing Vi1 to 1 causes both inverters to be on simultaneously for a short time and the loop gain to be larger than 1. The positive feedback regenerates the effect of the trigger pulse, and the circuit moves to the other state (B in this case). The width of the trigger pulse need be only a little larger than the total propagation delay around the circuit loop, which is twice the average propagation delay of the inverters.

### 4.3 Multiplexer Based Latches

There are many approaches for constructing latches; one technique uses transmission-gate multiplexers. Multiplexer-based latches have the important added advantage that the sizing of devices only affects performance and is not critical to the functionality. Figure 4.6 shows an implementation of a static positive latch based on a multiplexer.



Figure 4.6: Multiplexer type static D-latch

A transistor level implementation of a positive latch based on multiplexers is shown in Figure 4.6. When CLK is high, the bottom transmission gate is on and the latch is transparent - that is, the D input is copied to the Y output. During this phase, the feedback loop is open since the top transmission gate is off. The feedback does not have to be overridden to write the memory and hence sizing of transistors is not critical for realizing correct functionality.

#### 4.3.1 Timing Properties

Latches are characterized by three important timing parameters: the set-up time, the hold time and the propagation delay. It is important to understand the factors that affect these timing parameters, and to develop the intuition to estimate them manually. Assume that the propagation delay of each inverter is $t_{pd\_inv}$, and the propagation delay of the transmission gate is $t_{pd\_tx}$. Also assume that the contamination delay is 0 and that the inverter used to derive $\overline{CLK}$ from CLK has a delay equal to 0. The set-up time is the time before the rising edge of the clock that the input data D must become valid. For the transmission gate multiplexer-based latch, the input D has to propagate through I1, T1, I2 and I3 before the rising edge of the clock. This is to ensure that the node voltages on both terminals of the transmission gate T2 are at the same value. Otherwise, it is possible for the cross-coupled pair I2 and I3 to settle to an incorrect value. The set-up time is therefore equal to $3 \times t_{pd\_inv} + t_{pd\_tx}$ [5].

The hold time represents the time that the input must be held stable after the rising edge of the clock. In this case, the transmission gate T1 turns off when clock goes high and therefore any changes in the D-input after clock going high are not seen by the input. Therefore, the hold time is 0.
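
The manual estimates above reduce to a one-line calculation. The sketch below simply encodes them under the same idealizations (zero contamination delay, zero clock-inversion delay); the delay values passed in are hypothetical and serve only to show the arithmetic.

```python
def latch_setup_time_ps(tpd_inv_ps, tpd_tx_ps):
    """First-order setup time: D must pass through I1, T1, I2 and I3,
    that is, three inverters and one transmission gate."""
    return 3 * tpd_inv_ps + tpd_tx_ps


# Under the stated idealizations the hold time of this latch is zero.
LATCH_HOLD_TIME_PS = 0

# Hypothetical gate delays, for illustration only.
setup_ps = latch_setup_time_ps(tpd_inv_ps=20, tpd_tx_ps=10)
print(f"estimated setup time = {setup_ps} ps, hold time = {LATCH_HOLD_TIME_PS} ps")
```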

#### 4.4 Timing Analysis using Cadence

To obtain the set-up time of the latch using Cadence, we progressively skew the input with respect to the clock edge until the circuit fails. Figure 4.7 shows the set-up time simulation assuming a skew of 56 psec and 55 psec. For the 56 psec case, the correct value of input D is sampled (in this case, the Y output remains at the value of VDD). For a skew of 55 psec, an incorrect value propagates to the output (in this case, the Y output transitions to 0). Node Y starts to go high while the output of I2 (the input to transmission gate T2) starts to fall. However, the clock is enabled before the two nodes across the transmission gate (T2) settle to the same value, which results in an incorrect value being written into the latch.
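
The sweep procedure can be outlined as a simple search loop. In the sketch below, `latches_correctly` is a hypothetical callback standing in for one transient simulation at a given data-to-clock skew; `toy_latch` is a stand-in that fails below 56 ps to mimic the behaviour reported here, not a circuit model.

```python
def find_setup_limit(latches_correctly, start_ps, stop_ps, step_ps=1):
    """Reduce the data-to-clock skew step by step until capture fails.

    Returns the smallest skew that still latched correctly, or None if
    even the starting skew failed.
    """
    last_pass = None
    skew = start_ps
    while skew >= stop_ps:
        if not latches_correctly(skew):
            break
        last_pass = skew
        skew -= step_ps
    return last_pass


def toy_latch(skew_ps):
    """Stand-in for a simulation run: passes for skews of 56 ps or more."""
    return skew_ps >= 56


limit = find_setup_limit(toy_latch, start_ps=63, stop_ps=50)
print(f"smallest passing skew = {limit} ps")
```

A bisection search would need fewer simulation runs, but the linear sweep also yields the delay at every intermediate skew, which is useful for the delay-versus-skew table.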



(a)  $T_{D-C} = 56$  psec



(b)  $T_{D-C} = 55$  psec

Figure 4.7: Setup time simulation

Table 4.1: Variation of delay with  $T_{D-C}$  for D-latch

| D-latch                |        |       |       |       |       |       |      |       |      |
|------------------------|--------|-------|-------|-------|-------|-------|------|-------|------|
| Td <sub>D-C</sub> (ps) | 63     | 62    | 61    | 60    | 59    | 58    | 57   | 56    | 55   |
| Td <sub>C-Y</sub> (ps) | 0.3206 | 1.464 | 2.693 | 4.369 | 7.317 | 13.93 | 27.7 | 80.02 | fail |

We can see in Figure 4.7(b) that at $T_{D-C} = 55$ psec (data-to-clock skew) the D-latch fails to latch the value applied at input D. As $T_{D-C}$ is reduced, the delay from clock to output increases, as shown in Table 4.1; at $T_{D-C} = 56$ psec the delay of the latch is already very high, as we can see in Figure 4.7(a). This shows that the setup time should be much larger than this marginal value of $T_{D-C} = 56$ psec, so that the clock-to-output delay will not be large. In the next chapter we propose a new methodology to design the D-latch and explain the method used to calculate the setup time.

### Chapter



## Efficient and Robust D-latch Design Methodology

At the 90nm technology node the effects of parasitics are significant in the D-latch, as we have seen in chapter 3. When we design the D-latch at schematic level, the design does not include device parasitics and interconnect effects, so the calculated setup time can be wrong. I have proposed a new sizing methodology for the D-latch, which takes less area for the same driving capability and is more resistant to noise.

Today's technology has two main constraints, speed and area. We need to design our static D-latch cells such that they have minimum area and allow a minimum clock period for high speed. With technology scaling, the speed of circuits, that is, the clock frequency, is also increasing. Thus, we need to reduce the timing constraints of the D-latch or FF to scale down the minimum clock period while keeping the same driving capability of the D-latch cell.

The clock frequency of a system is decided by the timing constraints (setup time, hold time, delay of combinational circuit) of sequential circuits, as we have explained in chapter 4. The timing constraint that depends on the size of the D-latch is the setup time. So, by proper sizing of the D-latch, we can reduce both the setup time and the size of the D-latch.

### 5.1 Traditional Design Methodology of D-latch Cell

In a standard library, the size of cells with different driving capability increases in multiples of a finite width by increasing the number of fingers. The p/n ratio in standard cells is constant, so the transfer characteristics of the forward and feedback inverters do not shift in any direction and the meta-stable point C remains at its position. Increasing the number of fingers in an inverter while keeping the p/n ratio constant changes the slope of the transfer characteristic, which affects only the time to reach a stable state from the meta-stable state [10].



Figure 5.1: Voltage transfer characteristic with increasing the width of transistor keeping p/n ratio constant

As we have seen in chapter 3, at 90nm technology the effect of parasitics is very high, and with increasing latch size the effect of these parasitics also increases [11]. Effectively, we need to size our latch such that the parasitics are kept as small as possible for all fan-out cells.

### 5.2 New Proposed Design Methodology of D-latch Cell

As shown in Figure 5.2, the forward inverter drives the load, whereas the feedback inverter does not drive the same load. In the transparent mode of the latch, the forward path is on and the feedback path is off. To latch the correct input value, both terminals of switch 2 should be at the same voltage before switch 2 is switched on.

As the fan-out of the cell increases, the load on inverter 2 increases proportionally, but the load on inverter 3 does not. This means that if we increase the size of inverter 2 in proportion to the fan-out and keep the size of inverter 3 constant and small, the D-latch still works and the parasitics of the D-latch decrease. As the parasitics reduce, the delay also decreases. Since the setup time is proportional to the sum of the delays of all inverters when switch 2 is closed, as shown in Figure 5.2, the setup time also decreases.



Figure 5.2: D-latch in transparent mode

There are two main advantages of using this new design methodology for the D-latch. The first is a large area reduction for high fan-out D-latches. The second is a decrease in setup time, which reduces the chance of a timing violation at the latch output.



Figure 5.3: D-latch in latch mode

In Figure 5.4 we can see that if we size inverter 2 and keep inverter 1 at a constant minimum size, such that it only has to drive inverter 2, the meta-stable state still does not move from its position. The reason is that the p/n ratio of all inverters is the same constant, so the transfer characteristic of any inverter does not shift in any direction; only the slope of the transition changes, as shown in Figure 5.5.



Figure 5.4: D-latch in latch mode



Figure 5.5: Voltage transfer characteristics of two inverters connected back to back

Setup time is proportional to sum of delays of all inverters. By keeping size of inverter 3 constant and increasing the size of inverter 2 proportional to fan-out, the D-latch setup time decreases.

With the proposed sizing, the area of the D-latch is reduced by a large proportion compared with the traditional D-latch. Table 5.1 shows the reduction in area obtained with the proposed sizing of the D-latch. Layouts of the D-latch using the proposed and traditional design methodologies are shown in Appendix A and B respectively.

| No. of fingers | D-latch cell area, traditional sizing (µm<sup>2</sup>) | D-latch cell area, proposed sizing (µm<sup>2</sup>) | % reduction in area |
|----------------|--------------------------------------------------------|-----------------------------------------------------|---------------------|
| 1       | 3.9                                     | 3.9      | 0              |
| 2       | 6.45                                    | 5.25     | 18.60          |
| 3       | 7.8                                     | 6        | 23.08          |
| 4       | 10.65                                   | 7.8      | 26.76          |
| 5       | 12                                      | 8.555    | 28.71          |
| 6       | 15                                      | 10.35    | 31.00          |
| 7       | 16.195                                  | 11.1     | 31.46          |
| 8       | 18.905                                  | 12.75    | 32.56          |

 Table 5.1: Percentage reduction in area of latch by new design methodology
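The last column of Table 5.1 can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the reduction is defined relative to the traditional cell area; the function and variable names are illustrative and do not come from the design flow.

```python
# Cell areas in µm^2, copied from Table 5.1 (index 0 corresponds to 1 finger).
traditional_area = [3.9, 6.45, 7.8, 10.65, 12.0, 15.0, 16.195, 18.905]
proposed_area = [3.9, 5.25, 6.0, 7.8, 8.555, 10.35, 11.1, 12.75]


def percent_reduction(reference, improved):
    """Relative saving of `improved` with respect to `reference`, in percent."""
    return 100.0 * (reference - improved) / reference


# One entry per drive strength, rounded to two decimals as in the table.
area_reduction = [
    round(percent_reduction(t, p), 2)
    for t, p in zip(traditional_area, proposed_area)
]

for fingers, reduction in enumerate(area_reduction, start=1):
    print(f"{fingers} finger(s): {reduction:.2f} % smaller")
```

The saving grows with drive strength because only the forward inverter scales with the number of fingers in the proposed sizing.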

#### 5.3 Simulation Results

Post-layout simulation has been performed on the implemented, newly sized D-latch. The simulation results confirm the functionality and performance of the D-latch. All simulations are performed in Spectre on the extracted view of the D-latch layout. Extracted views of the D-latch layout are shown in Appendix A. In all measurements, the supply voltage VDD is 1 V, and the rise/fall time of all input signals (input D of the D-latch, clock, and clock-bar) is 5 ps.

### 5.3.1 Setup Time Calculation of D-latch

We examine the setup-time behavior of the minimum-sized D-latch. The D-latch is loaded with an inverter twice the size of latch inverter 2, and its setup time is examined for clock and data slopes of 5 ps. The simulation results are plotted in Figure 5.6. When the data settles a long time before the clock edge, the data-to-output delay equals 67.88 ps. Moving the data transition closer to the clock edge causes the $t_{D-Q}$ delay to increase [5]. This becomes noticeable at an offset between data and clock of about 75 ps. The latch completely fails to latch the data when the data precedes the clock by 56 ps. A 5% increase in $t_{D-Q}$ is observed at 63 ps, and this time is entered in the library as the setup time for this particular slope of data and clock. This characterization of setup time adds a margin of about 10 ps to the design.



Figure 5.6: Graph to calculate the setup time of D-latch
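The 5% criterion described above can be expressed as a small post-processing step on a sweep of data-to-clock offsets. The sketch below is illustrative only: the function name and the sample sweep values are hypothetical (they are not the simulated points of Figure 5.6), and linear interpolation between sweep points is an assumption.

```python
def setup_time_from_sweep(offsets_ps, t_dq_ps, degradation=0.05):
    """Return the data-to-clock offset at which t_DQ exceeds its nominal
    value by `degradation` (5 % by default), using linear interpolation.

    offsets_ps must be sorted from large (data arrives early) to small.
    The nominal t_DQ is taken from the first, most relaxed sweep point.
    """
    nominal = t_dq_ps[0]
    threshold = (1.0 + degradation) * nominal
    points = list(zip(offsets_ps, t_dq_ps))
    # Walk neighbouring sweep points until t_DQ crosses the threshold.
    for (off_early, d_early), (off_late, d_late) in zip(points, points[1:]):
        if d_early <= threshold < d_late:
            frac = (threshold - d_early) / (d_late - d_early)
            return off_early - frac * (off_early - off_late)
    raise ValueError("t_DQ never crosses the degradation threshold in this sweep")


# Hypothetical sweep, loosely shaped like the behaviour described in the text.
sample_offsets_ps = [200, 100, 75, 70, 65, 60, 58]
sample_t_dq_ps = [67.88, 67.88, 68.5, 69.3, 70.6, 73.0, 80.0]

setup_ps = setup_time_from_sweep(sample_offsets_ps, sample_t_dq_ps)
print(f"setup time by the 5 % criterion: {setup_ps:.1f} ps")
```

Defining the setup time at a fixed degradation of $t_{D-Q}$, rather than at the point of outright failure, is what leaves a margin between the library value and the offset at which the latch stops capturing data.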

Similarly, we have calculated the setup time of all the D-latch cells. Table 5.2 shows the setup time of D-latches with different driving strengths. In Table 5.2, the minimum NMOS width is 300 nm, which is considered the size of a single finger. The NMOS widths of all other cells are multiples of this width.

| No. of fingers | D-latch setup time for traditional sizing (ps) | D-latch setup time for proposed sizing (ps) |
|----------------|------------------------------------------------|---------------------------------------------|
| 1              | 63                                             | 63                                          |
| 2              | 60                                             | 52                                          |
| 3              | 58                                             | 48                                          |
| 4              | 58                                             | 47                                          |
| 5              | 59                                             | 46                                          |
| 6              | 59                                             | 48                                          |
| 7              | 59                                             | 46                                          |
| 8              | 60                                             | 50                                          |

Table 5.2: Setup time of D-latches with traditional and proposed design methodology
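To put the entries of Table 5.2 side by side, the absolute and relative setup-time savings per drive strength can be computed directly. This is a small sketch with illustrative variable names; the setup times are taken to be in picoseconds, consistent with the 63 ps value derived for the minimum-sized cell.

```python
# Setup times in ps from Table 5.2 (index 0 corresponds to 1 finger).
traditional_setup_ps = [63, 60, 58, 58, 59, 59, 59, 60]
proposed_setup_ps = [63, 52, 48, 47, 46, 48, 46, 50]

# Absolute saving of the proposed sizing for each drive strength.
saving_ps = [t - p for t, p in zip(traditional_setup_ps, proposed_setup_ps)]

# Saving relative to the traditional setup time, in percent.
saving_percent = [
    round(100.0 * s / t, 1) for s, t in zip(saving_ps, traditional_setup_ps)
]

for fingers, (s, pct) in enumerate(zip(saving_ps, saving_percent), start=1):
    print(f"{fingers} finger(s): {s} ps shorter ({pct} %)")
```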

### 5.3.2 Discussion on New Design Methodology

Figure 5.7 shows the variation of setup time with the size of the D-latch. The new methodology clearly reduces the setup time relative to the earlier design methodology for the same drive strengths: by 8 ps to 13 ps (about 13% to 22%) for the 2- to 8-finger cells in Table 5.2. For the minimum-size D-latch, the forward and feedback inverters have the same size in both methodologies, which is why its setup time is unchanged (Figure 5.7).



Figure 5.7: Graph between no. of fingers and setup time for D-latch



Figure 5.8: Schematic of D-latch

To increase the fan-out of the D-latch, we increase the size of the forward inverter while keeping the size of the feedback inverter constant at the minimum. The reasoning is as follows. The forward inverter drives the load at the D-latch output, so its size must increase in proportion to the fan-out. The load on the feedback inverter, however, is not directly proportional to the fan-out, so the feedback inverter need not be scaled in proportion to the fan-out. We keep the feedback inverter smaller than the forward inverter so that it does not increase the parasitics at node Yb; the length of the local interconnect is also reduced, as shown in Figure 5.8. The inverter delay therefore decreases, and consequently the setup time decreases for D-latch standard cells of all fan-outs. Increasing the size of the forward inverter does increase the load on the feedback inverter, which affects the setup time. Therefore, as the fan-out of the D-latch increases, the forward inverter is scaled in the same proportion, whereas the feedback inverter is not; it is sized only so that it can drive the forward inverter without increasing the delay.
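The sizing rule described above can be summarised in a short sketch. It is a simplified model under stated assumptions, not the actual cell generator: the PMOS/NMOS ratio `PN_RATIO` is a placeholder value, the traditional scheme is assumed to scale both inverters with the number of fingers, and the proposed feedback inverter is assumed to stay at exactly one finger. All function names are hypothetical.

```python
MIN_NMOS_WIDTH_NM = 300  # single-finger NMOS width stated in the text
PN_RATIO = 2.0           # assumed PMOS/NMOS width ratio; not given in the text


def inverter_width_nm(fingers):
    """Total (NMOS + PMOS) gate width of an inverter with `fingers` fingers."""
    nmos = fingers * MIN_NMOS_WIDTH_NM
    return nmos + PN_RATIO * nmos


def latch_inverter_widths(fingers, scheme):
    """Return (forward, feedback) inverter widths in nm for a drive strength.

    'traditional': both inverters scale with the number of fingers (assumed).
    'proposed':    only the forward inverter scales; the feedback inverter
                   stays at one finger (simplifying assumption).
    """
    if scheme == "traditional":
        return inverter_width_nm(fingers), inverter_width_nm(fingers)
    if scheme == "proposed":
        return inverter_width_nm(fingers), inverter_width_nm(1)
    raise ValueError(f"unknown scheme: {scheme!r}")


for n in (1, 2, 4, 8):
    trad = latch_inverter_widths(n, "traditional")
    prop = latch_inverter_widths(n, "proposed")
    print(f"{n} finger(s): traditional {trad} nm, proposed {prop} nm")
```

In this model the two schemes coincide for the one-finger cell, which matches the identical area and setup time reported for the minimum-size D-latch.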

#### 5.4 Performance of New Design D-latch

For digital integrated electronics, the challenges are size reduction and high speed, together with the need for high performance.



Figure 5.9: D-latch in latch mode with noise at Yb node

To compare the performance of the proposed D-latch with that of the traditional D-latch, we examined the D-latch in latch mode while injecting charge as an upset [12], as shown in Figure 5.9. The upset disturbs the value stored in the latch and yields the voltage at which the latch flips to the wrong state. Let the D-latch be latched at logic 1, so that node Yb is at logic 0. We apply an upset at node Yb (injecting charge using a voltage source and an ideal switch) and determine the voltage at which the stored state is corrupted and the latch flips to the other state. We perform this test for all cells of the normal D-latch and the sized D-latch.

We can compare the performance of the normal D-latch and the sized D-latch by comparing the voltages at which each D-latch cell flips to the unwanted state. Table 5.3 shows the voltage at which a particular cell fails to hold its state and flips to the wrong state.

| No. of fingers | Positive noise, logic 1 fail (V): Traditional | Positive noise, logic 1 fail (V): Proposed | Negative noise, logic 0 fail (V): Traditional | Negative noise, logic 0 fail (V): Proposed |
|----------------|-----------------------------------------------|--------------------------------------------|-----------------------------------------------|--------------------------------------------|
| 1              | 0.66                                          |                                            | 0.22                                          |                                            |
| 2              | 0.65                                          | 0.62                                       | 0.25                                          | 0.3                                        |
| 3              | 0.66                                          | 0.62                                       | 0.63                                          | 0.6                                        |
| 4              | 0.64                                          | 0.6                                        | 0.62                                          | 0.58                                       |
| 5              | 0.65                                          | 0.6                                        | 0.63                                          | 0.34                                       |
| 6              | 0.64                                          | 0.59                                       | 0.62                                          | 0.35                                       |
| 7              | 0.64                                          | 0.59                                       | 0.63                                          | 0.35                                       |
| 8              | 0.64                                          | 0.58                                       | 0.63                                          | 0.36                                       |

Table 5.3: Values of voltages at which D-latch flips to another state

Let the D-latch be latched to logic 1, with node Yb at logic 0. We apply a positive-voltage upset at node Yb and observe the voltage at which the D-latch flips to the wrong state. Comparing the results in Table 5.3, we see that the proposed D-latch fails at slightly lower voltages than the traditional D-latch, but the difference is very small. Now let the D-latch be latched to logic 0, with node Yb at logic 1. We apply a negative-voltage upset at node Yb and observe the voltage at which the D-latch flips to logic 1 (the wrong state). For the larger cells (5 to 8 fingers), the proposed latch flips only when Yb has been pulled down to about 0.35 V, whereas the traditional latch already flips at about 0.63 V, so the negative noise required to upset the proposed latch is much larger. This means that our sized D-latch is less susceptible to this noise.

From this analysis and the comparison of results for the traditional and proposed D-latches, we find that the performance of our sized D-latch is better than that of the traditional D-latch.
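One way to read Table 5.3 is to convert each flip voltage into the excursion that node Yb must undergo from its resting level before the latch flips. The sketch below does this under two assumptions that are ours rather than the simulation setup's: Yb rests at 0 V for the positive-noise test and at the 1 V supply for the negative-noise test, and a larger required excursion is read as a more robust cell. The helper name is hypothetical.

```python
VDD = 1.0  # supply voltage in volts, as stated in Section 5.3

# Flip voltages in volts from Table 5.3, for rows that list both designs.
# fingers: (positive traditional, positive proposed,
#           negative traditional, negative proposed)
flip_voltage = {
    2: (0.65, 0.62, 0.25, 0.30),
    3: (0.66, 0.62, 0.63, 0.60),
    4: (0.64, 0.60, 0.62, 0.58),
    5: (0.65, 0.60, 0.63, 0.34),
    6: (0.64, 0.59, 0.62, 0.35),
    7: (0.64, 0.59, 0.63, 0.35),
    8: (0.64, 0.58, 0.63, 0.36),
}


def required_excursion(flip_v, resting_v):
    """Voltage change on Yb, from its resting level, needed to flip the latch."""
    return round(abs(flip_v - resting_v), 2)


for fingers, (pos_t, pos_p, neg_t, neg_p) in flip_voltage.items():
    print(
        f"{fingers} fingers: positive noise "
        f"{required_excursion(pos_t, 0.0)} V vs {required_excursion(pos_p, 0.0)} V, "
        f"negative noise "
        f"{required_excursion(neg_t, VDD)} V vs {required_excursion(neg_p, VDD)} V "
        f"(traditional vs proposed)"
    )
```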

# Chapter 6



### CONCLUSION

In this work, the design of a standard cell library in 90 nm technology has been discussed and implemented. The implemented and characterized standard cells work properly for various driving strengths. A comparison between the schematic and layout views of the cells shows that the effect of parasitics due to local interconnect has increased to a great extent.

Although the designed library is good in terms of area and variety of cells, several modifications are still needed to make it more efficient. One such improvement is the redesign of the D-latch, which has reduced the area of sequential circuits dramatically and will have a large impact on their design. A new sizing methodology has been shown that utilizes a weak feedback inverter. With this redesign, the setup time, area, and performance in terms of capacitive coupling of the D-latch have been observed to be better than those of the earlier D-latch design. As a comparison between the two designs, the response of the new D-latch to a voltage upset on the dynamic node was observed, and the new design was found to be better in terms of robustness.

### 6.1 Future Scope

In this work, I have proposed a new design methodology only for the multiplexer-based D-latch architecture. The work can be repeated for other D-latch architectures. In future, it can also be implemented for technology nodes below 90 nm.

## REFERENCES

- [1] Ralf Göttsche and Wolfgang Krautschneider, "Impact of parasitic elements on the performance of digital CMOS circuits with gigabit feature size," *Solid-State Electronics 3rd International Workshop on Ultimate Integration of Silicon*, Munich, Germany, vol. 47, no. 7, pp. 1243-1248, 2003.
- [2] Navin Srivastava and Kaustav Banerjee, "Interconnect challenges for nanoscale electronic circuits," *JOM*, vol. 56, no. 10, pp. 30-31, 2004.
- [3] C.R. Bachelu and M.C. Lefebvre, "A study of the use of local interconnect in CMOS leaf cell design," *in Proc. IEEE ISCAS*, vol. 3, pp. 1258-1261, 1992.
- [4] B. Schurmann and J. Altmeyer, "The effect of pin constraints on layout area," *European Design and Test Conference*, pp. 480, 1995.
- [5] Jan M. Rabaey, Anantha Chandrakasan and Borivoje Nikolić, *Digital integrated circuits: A design perspective*, Prentice Hall of India Private Limited, New Delhi, 2006.
- [6] Ivan Sutherland, Bob Sproull and David Harris, *Logical effort: Designing fast CMOS circuits*, Morgan Kaufmann Publications, 1999.
- [7] Sung-Mo Kang and Yusuf Leblebici, *CMOS Digital integrated circuits: Analysis and Design*, Tata McGraw-Hill Publishing Company Limited, New Delhi, 2003.
- [8] Jian Zhou, Jin Liu and Dian Zhou, "Reduced setup time static D flip-flop," *Electronics Letters*, vol. 37, no. 5, pp. 279-280, 2001.
- [9] T.A. Jackson and A. Albicki, "Analysis of metastable operation in D latches," *IEEE Transactions on Circuits and Systems*, vol. 36, no. 11, pp. 1392-1404, 1989.
- [10] C.L. Portmann and T.H.Y. Meng, "Loading effects on metastable parameters of CMOS latches," *Symposium on VLSI Circuits, Digest of Technical Papers*, pp. 21-22, 2002.
- [11] C.L. Portmann and T.H.Y. Meng, "Metastability in CMOS library elements in reduced supply and technology scaled applications," *IEEE Journal of Solid-State Circuits*, vol. 30, no. 1, pp. 39-46, 1995.
- [12] T. Monnier, F. M. Roche, J. Cosculluela and R. Velazco, "SEU testing of a novel hardened register implemented using standard CMOS technology," *IEEE Nuclear and Plasma Sciences Society*, vol. 46, no. 6, pp. 1440-1444, 1999.
- [13] Maryam Shojaei Baghini and Madhav P. Desai, "Impact of technology scaling on metastability performance of CMOS synchronizing latches," *Proc. of ASP-DAC*, pp. 317-322, 2002.

### APPENDIX – A

Layout of the D-latch designed using the newly proposed methodology



### APPENDIX – B

Layout of the D-latch designed using the traditional methodology

