# Fitted Elmore Delay: A Simple and Accurate Interconnect Delay Model

Arif Ishaq Abou-Seido, Brian Nowak, and Chris Chu

Abstract—In this paper, we present a new interconnect delay model called Fitted Elmore delay (FED). FED is generated by approximating Hspice delay data using a curve fitting technique. The functional form used in curve fitting is derived based on the Elmore delay model. Thus our model has all the advantages of the Elmore delay model. It has a closed form expression as simple as the Elmore delay model and is extremely efficient to compute. Interconnect optimization with respect to design parameters can also be done as easily as in the Elmore delay model. In fact, most previous algorithms and programs based on Elmore delay model can use our model without much change. Most importantly, FED is significantly more accurate than the Elmore delay model. The maximum error in delay estimation is at most 2% for our model, compared to 8.5% for the scaled Elmore delay model. The average error is less than 0.8%. We also show that FED can be more than 10 times more accurate than Elmore delay model when applied to wire sizing.

*Index Terms*—VLSI CAD, Physical Design, Delay Model, Interconnect Optimization, Curve Fitting Technique

## I. INTRODUCTION

As the physical dimensions in VLSI technologies scale down, interconnect delay increasingly dominates gate delay in determining circuit performance [1]. As a result, high-level synthesis, logic synthesis, and physical layout tools are becoming more interconnect-centric. In order to take the impact of interconnect delay into account, it is very important to have computationally inexpensive and accurate interconnect delay models.

In the past, many interconnect delay models have been proposed by analyzing the moments of the impulse response [2]. Asymptotic waveform evaluation (AWE) [3] is a generalized approach to response approximation by moment matching. It is very accurate but computationally very expensive. Hence, many moment-matching variants using the first two to four moments have been proposed [4]–[11]. Those variants are much more efficient but less accurate. Nevertheless, they may still be too expensive to be used within the tight optimization loops of design synthesis and layout tools. Moreover, for all the models above, the delay is either computed by an iterative procedure or expressed as a sophisticated implicit function of the design parameters. Sensitivity information cannot be easily calculated. Therefore, these models provide little insight into determining the design parameters during design or optimization.

Arif Abou-Seido is with Intel Corporation, Santa Clara, CA 95054 (email: arif.i.abou-seido@intel.com).

Brian Nowak is with IBM Corporation, Rochester, MN 55901 (email: bnowak@us.ibm.com).

Chris Chu is with the Department of Electrical and Computer Engineering, Iowa State University, Ames, IA 50011 (email: cnchu@iastate.edu).

As a result, the Elmore delay (ED) [12], which is the first moment of the impulse response, is the most widely used interconnect delay model during design synthesis and layout [13]. It can be written as a simple, closed form expression in terms of design parameters. It is extremely efficient to compute and it provides useful insight for optimization algorithms. It has also been shown to have good fidelity with respect to Hspice simulation [14]–[16]. The primary disadvantage of the Elmore delay model is that it has limited accuracy. It always overestimates the delay [17]. So a commonly used variant is to scale the Elmore delay by ln 2 [2]. We call this the scaled Elmore delay (SED). However, it was observed that SED can significantly underestimate a large portion of delays.

In this paper, we propose a new model called *Fitted Elmore* delay (FED). Let r be the sheet resistance,  $c_a$  be the unit area capacitance, and  $c_f$  be the unit fringing capacitance for a certain metal layer. For an interconnect wire of length l and width w connecting a driver with driver resistance  $r_d$  and a load with load capacitance  $c_l$ , the Fitted Elmore delay is given by:

$$\begin{aligned} FED(r_d, c_l, l, w) &= A \cdot r_d c_a l w + B \cdot r_d c_f l + C \cdot r_d c_l \\ &+ D \cdot \frac{r c_a l^2}{2} + E \cdot \frac{r c_f l^2}{2w} + F \cdot \frac{r l c_l}{w} \end{aligned} \quad (1)$$

The coefficients A, B, C, D, E, and F are determined by a curve fitting technique to approximate Hspice simulation data. The functional form (1) used in fitting the Hspice data is derived based on the Elmore delay model. Although the Elmore delay model is not very accurate by itself, it provides useful insight into the dependence of interconnect delay on various parameters. This insight is used by our model.

FED is as simple and as efficient to compute as the Elmore delay model. However, it is significantly more accurate than both ED and SED. The maximum error is only 2% for our model, compared to around 8.5% for the scaled Elmore delay model. Since it is written as a simple analytical expression, optimization of delay with respect to design parameters can be done easily. In fact, because of its striking similarity to Elmore delay model, most interconnect optimization algorithms based on Elmore delay can use our model without much change.

The remainder of the paper is organized as follows: In Section II, we present the Fitted Elmore delay model for a single wire. We also present some experimental results to show that FED is 4 to 7 times more accurate than SED in delay estimation. In Section III, we show that FED can be more than 10 times more accurate when applied to wire sizing. In Section IV, we generalize the Fitted Elmore model to handle interconnect trees. In Section V, we present a *Transformed Elmore delay* (TED) model which has basically the same form as the Elmore delay model, but is almost as accurate as FED. In Section VI, we discuss future research directions.

This work made use of computer equipment provided by a grant from the Roy J. Carver Charitable Trust.

## II. FITTED ELMORE DELAY

In this section, we present the derivation and some experimental results for the Fitted Elmore delay model on a uniform-width wire. The extension to interconnect trees is presented in Section IV.

The basic idea is to approximate accurate delay data by curve fitting to an equation. In order to have a simple closed form model, the functional form used in curve fitting is derived based on the Elmore delay model. Although the Elmore delay model is not very accurate by itself, it provides useful insight into the dependence of interconnect delay on various parameters.

The notations of technology parameters for a certain metal layer are listed below.

- $W_{min}$ : the minimum wire width
- $r_g$ : the output resistance of a minimum device
- $c_g$ : the input capacitance of a minimum device
- *r*: the sheet resistance
- $c_a$ : the unit area capacitance
- $c_f$ : the unit fringing capacitance

For an interconnect wire of length l and width w connecting a driver with driver resistance  $r_d$  and a load with load capacitance  $c_l$ , the Elmore delay is given by:

$$ED(r_d, c_l, l, w) = r_d(c_a lw + c_f l + c_l) + \frac{rl}{w}\left(\frac{c_a lw}{2} + \frac{c_f l}{2} + c_l\right) \quad (2)$$

$$= r_d c_a l w + r_d c_f l + r_d c_l + \frac{r c_a l^2}{2} + \frac{r c_f l^2}{2w} + \frac{r l c_l}{w} \quad (3)$$

There are six terms in expression (3). By scaling the six terms appropriately, equation (3) can become a better approximation to accurate delay data. The Fitted Elmore delay (FED) model is defined as follows.

$$\begin{aligned} FED(r_d, c_l, l, w) \\ &= A \cdot r_d c_a l w + B \cdot r_d c_f l + C \cdot r_d c_l \\ &+ D \cdot \frac{r c_a l^2}{2} + E \cdot \frac{r c_f l^2}{2w} + F \cdot \frac{r l c_l}{w} \end{aligned}$$

where the coefficients A, B, C, D, E, and F are determined by a multiple linear regression [18] to accurate delay data.
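The three models can be evaluated side by side with a few lines of code. The following is a minimal Python sketch, assuming the 0.18 µm parameters of Table I and the coefficients reported in Table III (which are listed divided by ln 2). The function names and the 5000 µm wire length are illustrative choices, not part of the original work. With resistances in Ω and capacitances in fF, the products are in femtoseconds.

```python
import math

# Table I parameters for the 0.18 um technology (ohm/sq, fF/um^2, fF/um).
R_SHEET, C_AREA, C_FRINGE = 0.068, 0.060, 0.064
R_G, C_G, W_MIN = 17100.0, 0.234, 0.18

# Table III lists each coefficient divided by ln 2 (0.18 um column).
FED_COEFFS = tuple(k * math.log(2) for k in
                   (1.00962, 1.03047, 1.00426, 1.12524, 1.10582, 1.04468))


def elmore_terms(r_d, c_l, l, w, r=R_SHEET, c_a=C_AREA, c_f=C_FRINGE):
    """Return the six terms of expression (3), in order."""
    return (
        r_d * c_a * l * w,
        r_d * c_f * l,
        r_d * c_l,
        r * c_a * l**2 / 2,
        r * c_f * l**2 / (2 * w),
        r * l * c_l / w,
    )


def ed(r_d, c_l, l, w):
    """Elmore delay: the plain sum of the six terms."""
    return sum(elmore_terms(r_d, c_l, l, w))


def sed(r_d, c_l, l, w):
    """Scaled Elmore delay: ln 2 times the Elmore delay."""
    return math.log(2) * ed(r_d, c_l, l, w)


def fed(r_d, c_l, l, w, coeffs=FED_COEFFS):
    """Fitted Elmore delay: each term weighted by its own coefficient."""
    return sum(k * t for k, t in zip(coeffs, elmore_terms(r_d, c_l, l, w)))


# Driver, load and width of the Figure 1 case; the length is an arbitrary pick.
r_d, c_l, w = R_G / 100, C_G * 100, 6 * W_MIN
for name, f in (("ED", ed), ("SED", sed), ("FED", fed)):
    print(f"{name}: {f(r_d, c_l, 5000.0, w) / 1e3:.2f} ps")
```

Keeping the six terms in one helper makes it explicit that ED, SED and FED differ only in how the terms are weighted.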

The resulting model has several advantages over previous interconnect delay models:

- 1) FED is as efficient to compute as the Elmore delay model. Other accurate interconnect delay models are at least tens of times slower than our model.
- 2) As shown below, FED is significantly more accurate than Elmore delay and scaled Elmore delay models.
- 3) FED is written as a simple, explicit formula containing design parameters. This feature is very useful when designing interconnect optimization algorithms.
- 4) Because of its striking similarity to the Elmore delay model, most previous interconnect optimization algorithms based on Elmore delay can use FED without much change.

To demonstrate the accuracy of the Fitted Elmore delay model, we test it on the  $0.25\mu m$ ,  $0.18\mu m$ ,  $0.13\mu m$ , and  $0.07\mu m$  technologies described in [19]. The technology parameters are listed in Table I.

| Tech. $(\mu m)$    | 0.25  | 0.18  | 0.13  | 0.07  |
|--------------------|-------|-------|-------|-------|
| $W_{min} (\mu m)$  | 0.25  | 0.18  | 0.13  | 0.07  |
| $r_g(\Omega)$      | 16200 | 17100 | 22100 | 22100 |
| $c_g(fF)$          | 0.282 | 0.234 | 0.135 | 0.066 |
| $r(\Omega/\Box)$   | 0.073 | 0.068 | 0.081 | 0.095 |
| $c_a (fF/\mu m^2)$ | 0.059 | 0.060 | 0.046 | 0.056 |
| $c_f (fF/\mu m)$   | 0.082 | 0.064 | 0.043 | 0.040 |

TABLE I TECHNOLOGY PARAMETERS.

Hspice is used to generate the accurate delay data. In the Hspice simulation, each wire is modeled as  $30 \pi$ -type RC segments. The accuracy of our model is limited by the accuracy of the Hspice data generated, so it is very important to generate accurate Hspice data. We use the "ACCURATE" option and set the "MEASDGT" option to 10 in Hspice, which can make a 2-3% difference in the values generated.

To properly curve fit the Hspice delay data, we must generate an adequate number of data points in the region of interest for each technology. The region of interest and the number of values used for each design parameter are given in Table II. We notice that just a few values for each design parameter are enough. For driver size, load size, and wire width, we use 6 values uniformly distributed in the region of interest. However, for wire length, we observe that our model has a larger relative error when the wire is very short (i.e., the delay is very small). So 10 values are used for wire length, and more of them are chosen at small wire lengths. In addition, we start from  $l = 450 \mu m$  so that  $l = 500 \mu m$  will not be at the boundary of our model. In particular, we use  $l_i = 450 \cdot \gamma^{2^i - 1}$  where  $i = 0, \dots, 9$ and  $\gamma = (18000/450)^{1/511}$ . For each technology, we run Hspice on all combinations of design parameter values (i.e.,  $6 \times 6 \times 6 \times 10 = 2160$  points). The total CPU time for each technology is about 2 hours on an HP C360 machine with a 367 MHz processor and 512 MB of memory.
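The sampling scheme described above can be written out directly. This is a small Python sketch; it assumes that "uniformly distributed" means evenly spaced with both endpoints included, which the text does not state explicitly, and the helper name `uniform` is hypothetical.

```python
from itertools import product

L_START, L_END, N_LENGTHS = 450.0, 18000.0, 10

# gamma is chosen so that the last exponent, 2**9 - 1 = 511, lands on L_END.
gamma = (L_END / L_START) ** (1.0 / (2 ** (N_LENGTHS - 1) - 1))
lengths = [L_START * gamma ** (2**i - 1) for i in range(N_LENGTHS)]


def uniform(lo, hi, n=6):
    """n evenly spaced values from lo to hi inclusive (assumed spacing)."""
    return [lo + (hi - lo) * k / (n - 1) for k in range(n)]


driver_sizes = uniform(10, 510)   # multiples of the minimum device
load_sizes = uniform(10, 510)     # multiples of the minimum device
widths = uniform(1, 20)           # multiples of W_min

# Every combination of the four design parameters.
grid = list(product(driver_sizes, load_sizes, widths, lengths))
print(len(grid))
print([round(x, 1) for x in lengths])
```

Because the exponent doubles at each step, most of the ten lengths fall near the short end of the range, which is where the relative error was observed to be larger.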

| Design parameter        | Region of interest                      | # points |
|-------------------------|-----------------------------------------|----------|
| Driver size $(r_g/r_d)$ | $10 \times$ to $510 \times$ min. device | 6        |
| Load size $(c_l/c_g)$   | $10 \times$ to $510 \times$ min. device | 6        |
| Wire width $(w)$        | $1 \times$ to $20 \times W_{min}$       | 6        |
| Wire length $(l)$       | 500 to 18000 $\mu m$                    | 10       |

TABLE II

REGION OF INTEREST AND NUMBER OF POINTS USED FOR DESIGN PARAMETERS.

The statistical package SAS [20] is used to perform a multiple linear regression on the Hspice data generated. The run time of SAS is negligible. The coefficients of the Fitted Elmore models for all technologies are given in Table III. Since ED can easily overestimate delay by more than 30%, we use SED for comparison with FED. In order to make the relationship between FED and SED more apparent, the coefficients divided by  $\ln 2$  are listed. Note that all the values in Table III are greater than 1. That means delay values by SED are always smaller than those by FED. If wire resistance and capacitance dominate (i.e., terms associated with D and E are the most important), delay values by SED can be more than 10% smaller than those by FED.

| Tech. $(\mu m)$ | 0.25    | 0.18    | 0.13    | 0.07    |
|-----------------|---------|---------|---------|---------|
| A / ln 2        | 1.00724 | 1.00962 | 1.01258 | 1.01863 |
| B / ln 2        | 1.02993 | 1.03047 | 1.03010 | 1.02619 |
| C / ln 2        | 1.00332 | 1.00426 | 1.00511 | 1.00530 |
| D / ln 2        | 1.12520 | 1.12524 | 1.12673 | 1.13639 |
| E / ln 2        | 1.10598 | 1.10582 | 1.10463 | 1.09722 |
| F / ln 2        | 1.04665 | 1.04468 | 1.04836 | 1.06471 |

TABLE III

COEFFICIENTS FOR THE FITTED ELMORE DELAY MODELS.
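The fitting step itself is an ordinary least-squares problem with six regressors and no intercept. The sketch below illustrates it in Python with NumPy rather than SAS. Since the Hspice data is not available here, the "measured" delays are synthetic: they are generated from made-up coefficients plus a small random perturbation, purely to show the mechanics. All names, the sample size, and the noise level are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# 0.18 um technology parameters from Table I.
r, c_a, c_f = 0.068, 0.060, 0.064
r_g, c_g, w_min = 17100.0, 0.234, 0.18

# Random sample points inside the region of interest of Table II.
n = 500
r_d = r_g / rng.uniform(10, 510, n)
c_l = c_g * rng.uniform(10, 510, n)
w = w_min * rng.uniform(1, 20, n)
l = rng.uniform(500, 18000, n)

# Design matrix: one column per term of expression (3).
X = np.column_stack([
    r_d * c_a * l * w,
    r_d * c_f * l,
    r_d * c_l,
    r * c_a * l**2 / 2,
    r * c_f * l**2 / (2 * w),
    r * l * c_l / w,
])

# Synthetic stand-in for Hspice delays (NOT real data): made-up
# coefficients plus 0.1% multiplicative noise.
true_coeffs = np.log(2) * np.array([1.01, 1.03, 1.00, 1.13, 1.11, 1.05])
y = (X @ true_coeffs) * (1 + 0.001 * rng.standard_normal(n))

# Ordinary least squares with no intercept, matching the form of (1).
coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(coeffs / np.log(2), 3))
```

With real simulation data, `y` would simply be replaced by the measured delays at the sampled points; the design matrix stays the same.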

Our model is now compared with SED for delay estimation. For each technology, delays by SED, FED, and Hspice are found for 3800 random points covering the whole region of interest. Then the absolute values of the relative error of SED and FED with respect to Hspice are calculated. The maximum and average error over the 3800 points are reported in Table IV. One can see that for our model, the maximum error is only 2% and the average error is less than 0.8%.

| Tech. $(\mu m)$ | Max. error, SED | Max. error, FED | Avg. error, SED | Avg. error, FED |
|-----------------|-----------------|-----------------|-----------------|-----------------|
| 0.25            | 8.48%           | 1.68%           | 2.82%           | 0.69%           |
| 0.18            | 8.48%           | 1.79%           | 3.13%           | 0.73%           |
| 0.13            | 8.49%           | 1.94%           | 3.53%           | 0.79%           |
| 0.07            | 8.49%           | 2.00%           | 4.88%           | 0.73%           |

TABLE IV

ERROR IN DELAY FOR SCALED ELMORE DELAY AND OUR MODEL.

Figure 1 shows the delay by Hspice, our model, and scaled Elmore delay model for  $r_d = r_g/100$ ,  $c_l = c_g \times 100$ ,  $w = 6 \times W_{min}$  on 0.18µm technology. Figure 2 shows an enlarged portion of Figure 1. Our model is virtually indistinguishable from the Hspice data.

We notice that our model is still very accurate for points outside of the region of interest. For each technology, we generate 500 random points such that driver size and load size are from  $5 \times$  to  $1020 \times$  min. device, wire width is from  $0.5 \times$ to  $40 \times W_{min}$ , and wire length is from 500 to  $36000 \mu m$ . The maximum and average errors in delay are reported in Table V. There is no significant difference from the results in Table IV.

Fig. 1. Delay comparison for one case on  $0.18\mu m$  technology.

Fig. 2. An enlarged portion of Figure 1.

## III. APPLICATION TO WIRE SIZING

In this section, we compare the accuracy of FED and SED when applied to sizing of uniform wires. We consider two wire sizing problems. The first problem is to optimize wire width to minimize delay. The second problem is to minimize wire width subject to a delay bound. The delay bound is set to 10% over the optimal delay. All four technologies are tested. To fairly represent all possible design parameters, 100 random points in the region of interest are generated.

Note that to minimize delay, the optimal widths by SED and FED can be found by differentiating (2) and (1), respectively, with respect to w and setting the derivative to zero.

$$\text{Optimal width by SED} = \sqrt{\frac{r(c_f l/2 + c_l)}{r_d c_a}}$$

$$\text{Optimal width by FED} = \sqrt{\frac{r(E c_f l/2 + F c_l)}{A r_d c_a}}$$

The minimum wire width subject to a delay bound by SED and by FED can also be written in simple closed forms. To perform wire sizing in Hspice, a binary search is used to obtain the solutions.
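Both closed forms follow from viewing FED as a function of width only, $a w + b + c/w$. The Python sketch below works this out for the 0.18 µm technology, using the case quoted for Figure 3 as input. The function names are hypothetical, and the sketch does not clamp the result to a manufacturable width range, which a real tool would need to do.

```python
import math

LN2 = math.log(2)
# 0.18 um technology: Table I parameters and Table III coefficients (x ln 2).
r, c_a, c_f, r_g, c_g = 0.068, 0.060, 0.064, 17100.0, 0.234
A, B, C, D, E, F = (LN2 * k for k in
                    (1.00962, 1.03047, 1.00426, 1.12524, 1.10582, 1.04468))


def width_poly(r_d, c_l, l):
    """Write FED as a*w + b + c/w and return (a, b, c)."""
    a = A * r_d * c_a * l
    b = B * r_d * c_f * l + C * r_d * c_l + D * r * c_a * l**2 / 2
    c = E * r * c_f * l**2 / 2 + F * r * l * c_l
    return a, b, c


def optimal_width(r_d, c_l, l):
    """Width minimizing FED: d/dw (a*w + c/w) = 0 gives w = sqrt(c/a)."""
    a, _, c = width_poly(r_d, c_l, l)
    return math.sqrt(c / a)


def min_width_for_bound(r_d, c_l, l, bound):
    """Smallest w with a*w + b + c/w <= bound, or None if infeasible."""
    a, b, c = width_poly(r_d, c_l, l)
    slack = bound - b
    disc = slack**2 - 4 * a * c
    if slack <= 0 or disc < 0:
        return None
    # Smaller root of a*w^2 - slack*w + c = 0.
    return (slack - math.sqrt(disc)) / (2 * a)


# Design point quoted for Figure 3.
r_d, c_l, l = r_g / 13.59, c_g * 121.40, 2674.0
a, b, c = width_poly(r_d, c_l, l)
w_opt = optimal_width(r_d, c_l, l)
d_opt = a * w_opt + b + c / w_opt
w_bound = min_width_for_bound(r_d, c_l, l, 1.10 * d_opt)
print(f"w_opt = {w_opt:.3f} um, w for 10% slack = {w_bound:.3f} um")
```

Setting all six coefficients to ln 2 turns the same code into the SED version, since SED has the identical functional form.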

| Tech. $(\mu m)$ | Max. error, SED | Max. error, FED | Avg. error, SED | Avg. error, FED |
|-----------------|-----------------|-----------------|-----------------|-----------------|
| 0.25            | 8.42%           | 1.57%           | 2.28%           | 0.69%           |
| 0.18            | 8.47%           | 1.91%           | 2.51%           | 0.80%           |
| 0.13            | 8.48%           | 1.92%           | 2.89%           | 0.90%           |
| 0.07            | 8.49%           | 2.41%           | 4.05%           | 0.99%           |

TABLE V

ERROR IN DELAY FOR POINTS OUTSIDE OF THE REGION OF INTEREST.

The results on delay minimization are summarized in Table VI. The delay versus width for one of the random cases on the  $0.18\mu m$  technology is plotted in Figure 3. For this case,  $r_d = r_g/13.59$ ,  $c_l = c_g \times 121.40$ , and  $l = 2674\mu m$ . This case generates an error of 6.16% for SED and an error of 2.47% for FED when compared with Hspice. This gives us a  $2.5 \times$  improvement over SED for this case. On average, for the  $0.18\mu m$  technology, our model produces a  $3.4 \times$  improvement.

| Tech. $(\mu m)$ | Max. width error, SED | Max. width error, FED | Avg. width error, SED | Avg. width error, FED |
|-----------------|-----------------------|-----------------------|-----------------------|-----------------------|
| 0.25            | 6.32%                 | 2.44%                 | 5.40%                 | 1.55%                 |
| 0.18            | 6.28%                 | 2.62%                 | 5.41%                 | 1.61%                 |
| 0.13            | 6.31%                 | 2.68%                 | 5.37%                 | 1.68%                 |
| 0.07            | 6.30%                 | 2.97%                 | 5.13%                 | 1.81%                 |

TABLE VI Error in wire width for delay minimization.



Fig. 3. The delay versus wire width for one case on the  $0.18\mu m$  technology.

The results on wire width minimization subject to delay bound are summarized in Table VII.

SED performs poorly in this experiment. The average errors in wire width are more than 18%. In fact, because SED tends to underestimate the delay, all the wire widths computed according to SED are significantly less than those by Hspice. In other words, none of the solutions by SED satisfies the delay bound. If ED is used instead, since ED always significantly overestimates delay, there is no feasible solution (i.e., the delay bound is not achievable by ED) in most cases. However, if a feasible solution is found, that solution is guaranteed to satisfy the delay bound. FED underestimates delay in about half of the cases. However, since FED is much more accurate, we observe that in all cases, FED solutions violate the delay bound by much less than 0.1%.

TABLE VII

ERROR IN WIRE WIDTH FOR WIRE WIDTH MINIMIZATION SUBJECT TO DELAY BOUND.

## IV. EXTENSION TO INTERCONNECT TREE

In this section, we extend the Fitted Elmore delay model to handle an interconnect with tree topology. A simple tree as shown in Figure 4 is used to illustrate the idea.



Fig. 4. An example of a routing tree.

Elmore delay for node 2

$$\begin{aligned}
&= r_d (c_a l_1 w_1 + c_f l_1 + c_a l_2 w_2 + c_f l_2 + c_a l_3 w_3 + c_f l_3 + c_{l2} + c_{l3}) \\
&+ \frac{r l_1}{w_1} \left(\frac{c_a l_1 w_1 + c_f l_1}{2} + c_a l_2 w_2 + c_f l_2 + c_a l_3 w_3 + c_f l_3 + c_{l2} + c_{l3}\right) \\
&+ \frac{r l_2}{w_2} \left(\frac{c_a l_2 w_2 + c_f l_2}{2} + c_{l2}\right) \\
&= r_d c_a (l_1 w_1 + l_2 w_2 + l_3 w_3) + r_d c_f (l_1 + l_2 + l_3) + r_d (c_{l2} + c_{l3}) \\
&+ \frac{r c_a}{2} \left(l_1^2 + \frac{2 l_1 l_2 w_2}{w_1} + \frac{2 l_1 l_3 w_3}{w_1} + l_2^2\right) \\
&+ \frac{r c_f}{2} \left(\frac{l_1^2}{w_1} + \frac{2 l_1 l_2}{w_1} + \frac{2 l_1 l_3}{w_1} + \frac{l_2^2}{w_2}\right) \\
&+ r \left(\frac{l_1}{w_1} c_{l2} + \frac{l_1}{w_1} c_{l3} + \frac{l_2}{w_2} c_{l2}\right)
\end{aligned}$$

The Fitted Elmore delay model for interconnect trees is obtained by scaling the six terms above by the constants A, B, C, D, E, and F found by multiple linear regression for a single wire. There is no need to perform curve fitting again.

Fitted Elmore delay for node 2

$$\begin{aligned}
&= A \cdot r_d c_a (l_1 w_1 + l_2 w_2 + l_3 w_3) + B \cdot r_d c_f (l_1 + l_2 + l_3) + C \cdot r_d (c_{l2} + c_{l3}) \\
&+ D \cdot \frac{r c_a}{2} \left(l_1^2 + \frac{2 l_1 l_2 w_2}{w_1} + \frac{2 l_1 l_3 w_3}{w_1} + l_2^2\right) \\
&+ E \cdot \frac{r c_f}{2} \left(\frac{l_1^2}{w_1} + \frac{2 l_1 l_2}{w_1} + \frac{2 l_1 l_3}{w_1} + \frac{l_2^2}{w_2}\right) \\
&+ F \cdot r \left(\frac{l_1}{w_1} c_{l2} + \frac{l_1}{w_1} c_{l3} + \frac{l_2}{w_2} c_{l2}\right)
\end{aligned}$$

The idea above can be generalized to trees with any topology. For a general tree, let T be the set of indices of all tree edges. Let T(i) be the set of indices of tree edges at the downstream of edge i. Let S be the set of indices of all sinks. Let S(i) be the set of indices of sinks at the downstream of edge i. Let P(k) be the set of indices of tree edges along the path from the driver to node k. Then

Fitted Elmore delay for node k

$$= A \cdot r_d \sum_{i \in T} c_a l_i w_i$$

$$+ B \cdot r_d \sum_{i \in T} c_f l_i$$

$$+ C \cdot r_d \sum_{j \in S} c_{lj}$$

$$+ D \cdot \sum_{i \in P(k)} \frac{r l_i}{w_i} (\frac{c_a l_i w_i}{2} + \sum_{j \in T(i)} c_a l_j w_j)$$

$$+ E \cdot \sum_{i \in P(k)} \frac{r l_i}{w_i} (\frac{c_f l_i}{2} + \sum_{j \in T(i)} c_f l_j)$$

$$+ F \cdot \sum_{i \in P(k)} \frac{r l_i}{w_i} (\sum_{j \in S(i)} c_{lj})$$

Similar to Elmore delay, the Fitted Elmore delay for all nodes of an interconnect tree can also be calculated recursively in linear time.
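One way to realize the linear-time computation is a bottom-up pass that accumulates downstream capacitance per edge, followed by a top-down pass that accumulates delay along each path. The Python sketch below is an illustration under stated assumptions, not the authors' implementation: the `Edge` record and function name are hypothetical, each node is identified with the edge that ends at it, and edges must be listed parent-first. The example tree uses the T1 values given in Section VI, with the topology inferred from the node 2 expansion above (wire 1 from the driver to a branch point, wires 2 and 3 to sinks 2 and 3).

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Edge:
    parent: Optional[int]   # index of the upstream edge; None if at the driver
    l: float                # length (um)
    w: float                # width (um)
    c_load: float = 0.0     # sink capacitance at the far end (fF), 0 if none


def fitted_elmore_tree(edges, r_d, r, c_a, c_f, coeffs):
    """Return the FED at the far-end node of every edge (parent-first order)."""
    A, B, C, D, E, F = coeffs
    n = len(edges)
    down_a = [0.0] * n                   # area cap strictly downstream, T(i)
    down_f = [0.0] * n                   # fringing cap strictly downstream, T(i)
    down_l = [e.c_load for e in edges]   # sink cap in S(i), own sink included
    tot_a = tot_f = tot_l = 0.0

    # Bottom-up pass: children come after parents, so walk the list backwards.
    for i in reversed(range(n)):
        e = edges[i]
        sub_a = c_a * e.l * e.w + down_a[i]
        sub_f = c_f * e.l + down_f[i]
        if e.parent is None:
            tot_a, tot_f, tot_l = tot_a + sub_a, tot_f + sub_f, tot_l + down_l[i]
        else:
            down_a[e.parent] += sub_a
            down_f[e.parent] += sub_f
            down_l[e.parent] += down_l[i]

    # Top-down pass: the driver term is shared, each edge adds its own term.
    base = r_d * (A * tot_a + B * tot_f + C * tot_l)
    delay = [0.0] * n
    for i, e in enumerate(edges):
        res = r * e.l / e.w
        own = res * (D * (c_a * e.l * e.w / 2 + down_a[i])
                     + E * (c_f * e.l / 2 + down_f[i])
                     + F * down_l[i])
        delay[i] = (base if e.parent is None else delay[e.parent]) + own
    return delay


LN2 = math.log(2)
coeffs = [LN2 * k for k in (1.00962, 1.03047, 1.00426, 1.12524, 1.10582, 1.04468)]
t1 = [Edge(None, 1080, 0.87), Edge(0, 1120, 0.31, 62.0), Edge(0, 810, 0.31, 75.0)]
delays = fitted_elmore_tree(t1, r_d=500.0, r=0.068, c_a=0.060, c_f=0.064,
                            coeffs=coeffs)
print([round(d / 1e3, 1) for d in delays])  # ps, one entry per edge end
```

Each edge is visited exactly twice, so the cost is linear in the number of edges, as for the Elmore delay.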

To test the accuracy of our model, SED and FED of several trees with different numbers of sinks are calculated on the  $0.18\mu m$  technology. The errors in delay with respect to Hspice simulation are reported in Table VIII. One can see that FED is again significantly better than SED. However, like the Elmore delay model, we observe that the accuracy of our model is adversely affected by the resistive shielding effect. This will be discussed in Section VI.

| Tree | # sinks | Max. error, SED | Max. error, FED | Avg. error, SED | Avg. error, FED |
|------|---------|-----------------|-----------------|-----------------|-----------------|
| T1   | 2       | 3.06%           | 1.23%           | 2.34%           | 0.65%           |
| T2   | 3       | 4.66%           | 0.32%           | 4.53%           | 0.17%           |
| T3   | 4       | 9.36%           | 0.26%           | 9.32%           | 0.17%           |

TABLE VIII Error in delay for interconnect trees.

## V. A MODEL HAVING THE SAME FORM AS ELMORE DELAY MODEL

Almost all previous algorithms and programs based on the Elmore delay model can use FED directly instead. However, for some results which depend heavily on the functional form of the Elmore delay model (e.g., [21] [22]), it is not completely obvious whether FED can replace ED. It would be useful to have a model with the same form as the Elmore delay model. In this section, we present such a model called *Transformed Elmore Delay* (TED):

$$TED(r_d, c_l, l, w) = \alpha r_d(\hat{c_a} lw + \hat{c_f} l + \beta c_l) + \frac{\hat{r}l}{w} (\frac{\hat{c_a} lw}{2} + \frac{\hat{c_f} l}{2} + \beta c_l) \quad (4)$$

This model is basically the same as the Elmore delay model as in (2). The only differences are that the technology parameters are changed, and the driver resistance and load capacitance are scaled. As a result, all programs and algorithms based on the Elmore delay model can be changed to use our model very easily and obtain much better results.

In order to obtain the coefficients  $\alpha$ ,  $\beta$ ,  $\hat{c_a}$ ,  $\hat{c_f}$ , and  $\hat{r}$  so that TED is a good approximation of the Hspice data, we can equate the equations (1) and (4). So we want to have the following equalities.

$$\begin{array}{rcl} \alpha \hat{c_a} &=& A c_a \\ \alpha \hat{c_f} &=& B c_f \\ \alpha \beta &=& C \\ \hat{r} \hat{c_a} &=& D r c_a \\ \hat{r} \hat{c_f} &=& E r c_f \\ \hat{r} \beta &=& F r \end{array}$$

By taking the logarithm of these six equalities, we have the following system of linear equations.

Mx = b

where

$$\boldsymbol{M} = \begin{pmatrix} 1 & 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 & 1 \\ 1 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 & 1 \\ 0 & 1 & 1 & 0 & 0 \end{pmatrix}, \ \boldsymbol{x} = \begin{pmatrix} \log \alpha \\ \log \beta \\ \log \hat{r} \\ \log \hat{c_a} \\ \log \hat{c_f} \end{pmatrix}, \ \boldsymbol{b} = \begin{pmatrix} \log Ac_a \\ \log Bc_f \\ \log C \\ \log Drc_a \\ \log Erc_f \\ \log Fr \end{pmatrix}$$

However, there are six equations but only five unknowns, so the system is overdetermined and we cannot expect to find an x that satisfies it exactly. Instead we seek an x which minimizes  $||Mx - b||_2$ . This is called the least-squares problem and can be solved by QR factorization [23]. The parameters obtained by QR factorization for each technology are listed in Table IX. As before, in order to make the comparison with SED easier,  $\alpha/\ln 2$  and  $\hat{r}/\ln 2$  are listed. Notice that we can multiply  $\alpha$  and  $\hat{r}$  by a constant factor and divide  $\beta$ ,  $\hat{c_a}$ , and  $\hat{c_f}$  by the same factor without changing the delay value. We normalize the coefficients so that  $\beta$  is equal to 1.

| Tech. $(\mu m)$   | 0.25    | 0.18    | 0.13    | 0.07    |
|-------------------|---------|---------|---------|---------|
| $\alpha/\ln 2$    | 0.98460 | 0.98765 | 0.98975 | 0.99505 |
| $\beta$           | 1.00000 | 1.00000 | 1.00000 | 1.00000 |
| $\hat{r} / \ln 2$ | 1.06378 | 1.06225 | 1.06464 | 1.07567 |
| $\hat{c_a}$       | 1.03887 | 1.04061 | 1.04055 | 1.03994 |
| $\hat{c_f}$       | 1.04150 | 1.04218 | 1.03917 | 1.02565 |

TABLE IX

COEFFICIENTS FOR THE TRANSFORMED ELMORE DELAY MODELS.
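The log-linear least-squares step can be sketched in Python with NumPy. Note the swap: this sketch calls `numpy.linalg.lstsq` (SVD based) instead of the QR factorization mentioned in the text, because the scaling freedom described above makes M rank deficient, and `lstsq` handles that case directly by returning the minimum-norm solution. The function name `fit_ted` is hypothetical. The final print expresses the results as ratios to $\ln 2$, $r$, $c_a$ and $c_f$; reading Table IX in those units is an assumption of this sketch.

```python
import numpy as np

# Rows: the six equalities. Columns: log alpha, log beta, log r_hat,
# log c_a_hat, log c_f_hat.
M = np.array([
    [1, 0, 0, 1, 0],
    [1, 0, 0, 0, 1],
    [1, 1, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 1, 1, 0, 0],
], dtype=float)


def fit_ted(coeffs, r, c_a, c_f):
    """Least-squares TED parameters from FED coefficients, with beta = 1."""
    A, B, C, D, E, F = coeffs
    b = np.log([A * c_a, B * c_f, C, D * r * c_a, E * r * c_f, F * r])
    # M has rank 4, so lstsq returns the minimum-norm least-squares solution.
    x, *_ = np.linalg.lstsq(M, b, rcond=None)
    alpha, beta, r_hat, ca_hat, cf_hat = np.exp(x)
    # Rescale so that beta = 1; every product in the six equalities,
    # and therefore the delay value, is unchanged.
    return alpha * beta, 1.0, r_hat * beta, ca_hat / beta, cf_hat / beta


LN2 = np.log(2)
# 0.18 um column of Table III (listed divided by ln 2) and Table I parameters.
coeffs_018 = LN2 * np.array([1.00962, 1.03047, 1.00426,
                             1.12524, 1.10582, 1.04468])
alpha, beta, r_hat, ca_hat, cf_hat = fit_ted(coeffs_018, r=0.068,
                                             c_a=0.060, c_f=0.064)
print(alpha / LN2, beta, r_hat / (0.068 * LN2), ca_hat / 0.060, cf_hat / 0.064)
```

Normalizing after the solve, rather than fixing $\log \beta = 0$ beforehand, keeps the matrix identical to the one in the text.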

The error in delay for the Transformed Elmore delay model is reported in Table X. Compared with the FED errors in Table IV, the maximum error of TED is only 0.80 to 0.89 percentage points higher, and the average error is only 0.13 to 0.55 percentage points higher.

| Tech. $(\mu m)$ | Maximum error, TED | Average error, TED |
|-----------------|--------------------|--------------------|
| 0.25            | 2.51%              | 1.24%              |
| 0.18            | 2.68%              | 1.23%              |
| 0.13            | 2.79%              | 1.18%              |
| 0.07            | 2.80%              | 0.86%              |

TABLE X Error in delay for Transformed Elmore delay model.

## VI. DISCUSSION AND FUTURE WORK

We observe that when FED is applied to interconnect trees, resistive shielding can cause it to overestimate the delay of sinks closer to the driver. We illustrate this point using the tree T1 in Table VIII. Its topology is shown in Figure 4. For T1,  $r_d = 500\Omega$ ,  $l_1 = 1080\mu m$ ,  $w_1 = 0.87\mu m$ ,  $l_2 = 1120\mu m$ ,  $w_2 = 0.31\mu m$ ,  $l_3 = 810\mu m$ ,  $w_3 = 0.31\mu m$ ,  $c_{l2} = 62 fF$ , and  $c_{l3} = 75 fF$ . The errors for sink 2 and sink 3 are 0.08% and 1.23% respectively. However, if we change  $w_2$  to  $0.09\mu m$  (i.e., half of the minimum width), our model will overestimate the delay for sink 2 and sink 3 by 5.48% and 7.51% respectively.

In the future, we would like to derive a model which takes resistive shielding into consideration. We would also like to incorporate inductive consideration into our model. The simple RLC delay model in [24] can be used instead of Elmore delay model. Another direction for future research is to include slope of input signal as a parameter.

## ACKNOWLEDGMENT

The authors would like to thank Amjad Odet-Allah for his help on SAS.

## REFERENCES

- [1] Semiconductor Industry Association. The International Technology Roadmap for Semiconductors. 1999.
- [2] L. Pileggi. Timing metrics for physical design of deep submicron technologies. In Proc. Intl. Symp. on Physical Design, pages 28–33, 1998.
- [3] L. T. Pillage and R. A. Rohrer. Asymptotic waveform evaluation for timing analysis. *IEEE Trans. Computer-Aided Design*, 9(4):352–366, April 1990.

- [4] A. B. Kahng and S. Muddu. Two-pole analysis of interconnection trees. In *Proc. IEEE Multi-Chip Module Conf.*, pages 105–110, January 1995.
- [5] A. B. Kahng and S. Muddu. An analytical delay model for RLC interconnects. In *Proc. IEEE Intl. Symp. on Circuits and Systems*, pages 4.237–4.240, May 1996.
- [6] B. Tutuianu, F. Dartu, and L. Pileggi. An explicit RC-circuit delay approximation based on the first three moments of the impulse response. In Proc. ACM/IEEE Design Automation Conf., pages 611–616, 1996.
- [7] R. Kay and L. Pileggi. PRIMO: Probability interpretation of moments for delay calculation. In *Proc. ACM/IEEE Design Automation Conf.*, pages 463–468, 1998.
- [8] T. Lin, E. Acar, and L. Pileggi. h-gamma: an RC delay metric based on a gamma distribution approximation of the homogeneous response. In *Proc. IEEE/ACM Intl. Conf. on Computer-Aided Design*, pages 19–25, 1998.
- [9] C. Alpert, A. Devgan, and C. Kashyap. A two moment RC delay metric for performance optimization. In *Proc. Intl. Symp. on Physical Design*, pages 69–74, 2000.
- [10] Frank Liu, C. V. Kashyap, and C. J. Alpert. A delay metric for RC circuits based on the Weibull distribution. In *Proc. IEEE/ACM Intl. Conf. on Computer-Aided Design*, pages 620–624, 2002.
- [11] Charles J. Alpert, Frank Liu, Chandramouli V. Kashyap, and Anirudh Devgan. Delay and slew metrics using the lognormal distribution. In Proc. ACM/IEEE Design Automation Conf., pages 382–385, 2003.
- [12] W. C. Elmore. The transient response of damped linear network with particular regard to wideband amplifiers. J. Applied Physics, 19:55–63, 1948.
- [13] Jason Cong, Lei He, Cheng-Kok Koh, and Patrick H. Madden. Performance optimization of VLSI interconnect layout. *INTEGRATION, the VLSI Journal*, 21:1–94, 1996.
- [14] K. D. Boese, A. B. Kahng, B. A. McCoy, and G. Robins. Fidelity and near-optimality of Elmore-based routing constructions. In *Proc. IEEE Intl. Conf. on Computer Design*, pages 81–84, 1993.
- [15] Jason Cong and Lei He. Optimal wiresizing for interconnects with multiple sources. ACM Trans. Design Automation of Electronic Systems, 1(4), October 1996.
- [16] J. Cong, A. B. Kahng, C.-K. Koh, and C.-W. A. Tsao. Bounded-skew clock and Steiner routing under Elmore delay. In *Proc. IEEE/ACM Intl. Conf. on Computer-Aided Design*, pages 66–71, 1995.
- [17] R. Gupta, B. Krauter, B. Tutuianu, J. Willis, and L. T. Pillage. The Elmore delay as a bound for RC trees with generalized input signals. In *Proc. ACM/IEEE Design Automation Conf.*, pages 364–369, June 1995.
- [18] R. Lyman Ott. An Introduction to Statistical Methods and Data Analysis. Duxbury, 4th edition, 1993.
- [19] Jason Cong and David Pan. Interconnect delay estimation models for logic and high level synthesis. In SRC Techcon, 1998.
- [20] Ravindra Khattree and Dayanand N. Naik. Applied multivariate statistics with SAS software. SAS Institute, NC, 2nd edition, 1999.
- [21] Chung-Ping Chen, Yao-Ping Chen, and D. F. Wong. Optimal wire-sizing formula under the Elmore delay model. In *Proc. ACM/IEEE Design Automation Conf.*, pages 487–490, 1996.
- [22] Chung-Ping Chen and D. F. Wong. Optimal wire-sizing function with fringing capacitance consideration. In *Proc. ACM/IEEE Design Automation Conf.*, pages 604–607, 1997.
- [23] David Watkins. Fundamentals of Matrix Computations. John Wiley & Sons, 1991.
- [24] Y. I. Ismail, E. G. Friedman, and J. L. Neves. Equivalent Elmore delay for RLC trees. *IEEE Trans. Computer-Aided Design*, 19(1):83–97, January 2000.



**Arif Ishaq Abou-Seido** received his B.S. degree and M.S. degree in Electrical Engineering from Iowa State University in 1999 and 2001, respectively.

Arif is currently a senior design engineer at Intel Corporation in the microprocessor logic technology development group. His tasks are focused on Full-Chip timing convergence activities such as planning solutions for design issues via optimization techniques. He is also involved in timing-related methodology development, implementation, optimization, verification and execution.

**Brian Nowak** received his B.S. degree in Electrical Engineering from Michigan Technological University in 1999 and his M.S. degree in Computer Engineering from Iowa State University in 2001.

Brian is currently an ASIC Design Engineer for the Engineering and Technology Services division of IBM. His current design interests are in High Speed I/O interfaces.



**Chris Chu** received the B.S. degree in computer science from the University of Hong Kong, Hong Kong, in 1993. He received the M.S. degree and the Ph.D. degree in computer science from the University of Texas at Austin in 1994 and 1999, respectively.

Dr. Chu is currently an Assistant Professor in the Electrical and Computer Engineering Department at Iowa State University. His research interests include design and analysis of algorithms, CAD of VLSI physical design, and performance-driven interconnect optimization. He received the IEEE TCAD best paper award in 1999 for his work in performance-driven interconnect optimization. He also received the Bert Kay Best Dissertation Award for 1998-1999 from the Department of Computer Sciences at the University of Texas at Austin.