Equivalent Waveform Propagation for Static Timing Analysis

Masanori Hashimoto, Member, IEEE, Yuji Yamada, and Hidetoshi Onodera, Member, IEEE

Abstract—This paper proposes a scheme that captures diverse input waveforms of CMOS gates for static timing analysis (STA). Conventionally, latest arrival and transition times are calculated from the timings when a transient waveform goes across predetermined reference voltages. However, this method cannot accurately consider the impact of waveform shape on gate delay when crosstalk-induced nonmonotonic waveforms or inductance-dominant stepwise waveforms are injected. We propose a new timing analysis scheme called “equivalent waveform propagation.” The proposed scheme calculates the equivalent waveform that makes the output waveform close to the actual waveform, and uses the equivalent waveform for timing calculation. The proposed scheme can cope with various waveforms affected by resistive shielding, crosstalk noise, wire inductance, etc. In this paper, we devise a method to calculate the equivalent waveform. The proposed calculation method is compatible with conventional methods in gate delay library and characterization and, hence, our method is easily implemented with conventional STA tools.

Index Terms—Crosstalk noise, inductance, resistive shielding, slope propagation, static timing analysis (STA), waveform diversity.

I. INTRODUCTION

As circuit scale grows, static timing analysis (STA) has become a common approach to verifying timing constraints; indeed, it is currently the only way to perform full-chip timing analysis. In STA, we propagate the latest arrival time (LAT) and transition time throughout a circuit and derive the longest/shortest path delays. CMOS circuits consist of CMOS gates and interconnects, and currently the delay of each part, i.e., the gate propagation delay and the interconnect propagation delay, is calculated separately. As for interconnect delay, it is well known that PRIMA [1], or other similar techniques, can accurately estimate transition waveforms propagating through linear device networks. On the other hand, CMOS gates are nonlinear devices and the estimation of gate delay is inherently more complicated. Therefore, delay calculation based on look-up tables is widely used [2], [3]. This approach usually requires a prior characterization process to build look-up tables using a circuit simulator. Because of the cost of circuit simulation, gate characterization is usually performed in a two-dimensional (2-D) space of output loading and transition time of the input waveform (slope). The slope parameter aims to capture the influence of waveform shape on gate delay.

Recently, there have arisen many factors that make transition waveforms more diverse in nanometer technologies, such as crosstalk noise, interconnect inductance, and resistive shielding, and, hence, capturing waveform shape by using a single parameter of slope is getting harder. Nevertheless, the number of parameters to express waveform shapes does not increase because of gate characterization costs.

This paper proposes a new scheme for the propagation of timing information in STA called “equivalent waveform propagation.” Our scheme aims to accurately capture the effects of diverse waveforms on timing. The proposed scheme does not calculate the LAT and the slope from the timings when the waveform goes across reference voltages. The proposed scheme derives an equivalent input waveform with a standard shape, such that the equivalent input waveform produces an output that matches with the actual output waveform. In equivalent waveform calculation, we need to know which part of the input waveform dominantly determines the output transition. We then devise a metric to point out the important waveform region, and we develop an equivalent waveform calculation method based on the least square fitting with the devised metric. The proposed method does not change other parts of delay calculation, i.e., no library extension and no additional gate characterization are necessary and, hence, our method is easy to work with conventional STA methods. In this paper, we demonstrate that the proposed method can calculate the equivalent waveform with the same procedure over various waveforms, such as crosstalk-induced waveform and deteriorated waveforms with overshoot and ringing due to inductance. We also evaluate the computational costs and discuss the tradeoff between accuracy and calculation costs. Throughout this paper, we assume that the distorted input waveform applied to the gate can be obtained by other methods, such as [1], and focus on the problem of finding the equivalent waveform from which we can derive the LAT and the slope for gate delay calculation. Once the LAT and slope are obtained, the path delay can be calculated, for example, by [4] and [5].

The rest of this paper is organized as follows. Section II points out the problem of conventional methods and proposes a concept of equivalent waveform propagation. In Section III, we describe the method for deriving the equivalent waveform. Section IV demonstrates that the proposed method can handle various gate input waveforms that appear in nanometer technologies. Concluding remarks are presented in Section V.
II. NECESSITY OF EQUIVALENT WAVEFORM PROPAGATION

STA is a procedure that calculates the LAT of signal transitions at each node in a circuit and propagates it to the next gate [5]. Because the input waveform shape is an important factor that affects gate delay, STA essentially reduces to accurate propagation of LAT and slope. Conventionally, LAT is defined as the timing at which the waveform crosses 0.5Vdd (or another threshold voltage), and slope is calculated as the difference between the crossing timings at two reference voltages Vth1 and Vth2 (e.g., 0.2Vdd and 0.8Vdd). We hereafter refer to these definitions of LAT and slope as the conventional reference-voltage-based approach.
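The reference-voltage definitions above can be illustrated with a small sketch on a sampled waveform. The function names, the linear interpolation between samples, and the choice of the final crossing at each reference voltage are assumptions made for illustration; the toy waveform is hypothetical and normalized to Vdd = 1.

```python
def crossing_times(times, volts, level):
    """Times at which the piecewise-linear waveform crosses `level`."""
    hits = []
    for i in range(len(times) - 1):
        d0, d1 = volts[i] - level, volts[i + 1] - level
        # A sign change (or landing exactly on the level) marks a crossing.
        if d0 * d1 < 0 or (d1 == 0 and d0 != 0):
            frac = d0 / (d0 - d1)  # linear interpolation inside the segment
            hits.append(times[i] + frac * (times[i + 1] - times[i]))
    return hits


def reference_voltage_timing(times, volts, vdd, v_lo=0.2, v_hi=0.8):
    """LAT = final 0.5*vdd crossing; slope = t(v_hi*vdd) - t(v_lo*vdd).

    Assumption: the final crossing is used at every reference voltage.
    """
    lat = crossing_times(times, volts, 0.5 * vdd)[-1]
    t_lo = crossing_times(times, volts, v_lo * vdd)[-1]
    t_hi = crossing_times(times, volts, v_hi * vdd)[-1]
    return lat, t_hi - t_lo


# Hypothetical nonmonotonic rising waveform (time in ps, voltage in units of Vdd).
times = [0.0, 40.0, 60.0, 100.0]
noisy = [0.0, 0.7, 0.4, 1.0]
crossings = crossing_times(times, noisy, 0.5)
lat, slope = reference_voltage_timing(times, noisy, vdd=1.0)
print(crossings, lat, slope)
```

The toy waveform crosses 0.5Vdd three times, and only the last crossing is kept as the LAT, which is the behavior examined in the following subsection.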

A. Motivation

Recently, many factors have made transition waveforms more diverse. One major factor is capacitive coupling noise, and others are on-chip inductance and the resistive shielding effect. As these factors become significant, it is getting harder to capture the impact of waveform shape on gate delay using only a single slope parameter. Even if two waveforms have the same slope value, their shapes are sometimes totally different, which results in a considerable gate delay difference. Fig. 1 shows an example of waveform diversity: a crosstalk-induced nonmonotonic waveform, an inductance-dominant stepwise waveform, and a waveform highly strained by resistive shielding. As long as we define the LAT and the slope based on the reference voltages, these three waveforms have the same LAT and the same transition time and, hence, are regarded as the same waveform in STA, whereas the actual waveforms are much different. Consequently, the output waveforms are also much different, and a considerable timing estimation error occurs. Because of the large diversity in waveform shapes, this problem can hardly be solved by adjusting the reference voltages that define delay and transition time: modifying the reference voltages sometimes improves the accuracy of timing analysis for a certain type of waveform, but it degrades the accuracy for other types.

Gate delay calculation widely adopts table look-up models in order to consider nonlinear characteristics of CMOS transistors. Typically, output load and slope of input waveform are parameters of look-up tables, and then 2-D tables are prepared. The tables are generated with a long process of huge amounts of circuit simulation. Therefore, even if we want to increase the number of waveform parameters to express diverse waveform shapes, it is prohibitively difficult to extend table dimensions due to characterization cost. Moreover, it is essentially difficult to develop a new waveform representation for such different waveforms shown in Fig. 1. Considering conformity to conventional STA tools and managing characterization cost, it is highly desirable to keep the number of waveform parameters to just one. This paper aims to realize accurate timing analysis while satisfying the above requirements.

Now, we demonstrate two examples in which the conventional LAT and slope propagation scheme based on the reference-voltage-based approach does not work well. Fig. 2 shows a pair of fully coupled interconnects. The length is 1 mm. The transition waveform at the victim is affected by the transition at the aggressor. Conventional methods (e.g., [6]) evaluate the final crossing timing of 0.5Vdd at Gate 2 input as LAT. The conventional methods propagate the slope of the noiseless waveform. The crosstalk-induced delay variations are evaluated by circuit simulation. We use the transistor parameters of an actual 0.13-μm CMOS technology and the intermediate interconnect parameters in the 0.13-μm process predicted in [7]. The wire parameters used for the experiments are coupling capacitance $C_t = 0.058\,\text{fF}/\mu\text{m}$, capacitance to ground $C_{gg} = 0.006\,\text{fF}/\mu\text{m}$, and resistance $R = 0.085\,\Omega/\mu\text{m}$. The supply voltage is 1.2 V.

Fig. 3 shows an example of transition waveforms. Noise is injected around $0.5V_{dd}$. The transition waveform becomes nonmonotonic and crosses $0.5V_{dd}$ multiple times. On the other hand, the fall transition at Gate 2 output and the rise transition at Gate 3 output are so fast that they finish before the final crossing timing of $0.5V_{dd}$ at Gate 2 input, since capacitances $C_1$ and $C_2$ are not large. The conventional methods define LAT as the final crossing timing of $0.5V_{dd}$. As long as we follow this definition of LAT, we never obtain accurate output transitions at Gates 2 and 3. Adjusting the transition time (slope) does not help. In this case, the output transitions of Gates 2 and 3 finish before the LAT. Therefore, even if we make the input transition very fast, we never obtain an output transition before the LAT. Conversely, we lengthen the transition time and experimentally evaluate the output waveforms. Fig. 4 demonstrates an example with a long transition time. The output waveforms are much different from the actual ones. As long as we cling to the crossing timing of $0.5V_{dd}$, accuracy degradation is unavoidable. This implies that we have to devise a new scheme to propagate timing information in STA.

![Fig. 1. Diverse waveforms that have the same LAT and slope.](image1)

![Fig. 2. Experimental circuit for crosstalk-induced input waveform.](image2)

![Fig. 3. Crosstalk-induced waveform that conventional method fails (Gates 1–3 are 4x and C1 and C2 are 1 fF).](image3)

We demonstrate another example, this time with inductive wires. Fig. 5 shows the experimental circuit together with the cross section of the inductive interconnect. The interconnect between Gates 1 and 2 is inductive and its length is 3 mm. The interconnect resistance, capacitance, and inductance are 12 Ω/mm, 67 fF/mm, and 1.8 nH/mm, respectively. With interconnect inductance, transmission line effects appear, and the waveform becomes stepwise as in Fig. 6. This case reveals that the conventional reference-voltage-based method is inadequate. Suppose the upper reference voltage for slope evaluation is below the voltage of the first step, as in Fig. 6. The conventional method then approximates the stepwise waveform while neglecting its stepwise behavior above the reference voltage. Neglecting this behavior causes a slope estimation error at Gate 2 and an arrival time error at Gate 3. On the other hand, if the upper reference voltage is just above the first step voltage, the approximated waveform becomes much different. Though the output waveforms corresponding to this approximation are not drawn in Fig. 6 to avoid an overly complicated figure, a considerable timing estimation error occurs. The approximated waveform is highly sensitive to the reference voltage, and a slight change in the reference voltage produces a discontinuous change in the approximated waveform. As long as the reference-voltage-based method is used, we cannot escape from this problem. Therefore, we have to devise a new waveform propagation scheme that is independent of reference voltage definitions.

B. Previous Work

Recently, the problem of crosstalk noise discussed in Section II-A was raised in [8], which estimates the output waveforms for noisy input waveforms using look-up tables. This look-up table has two additional parameters, noise width and noise height, in addition to the usual load and input slope. This method requires a prior gate characterization process and library extension, which is one of its disadvantages. However, the true problem is that this method can only cope with crosstalk noise and cannot provide accurate timing analysis for other types of waveforms, such as a waveform with resistive shielding or a waveform in an inductance-dominant interconnect. It is also not clear whether this method can cope with multiple aggressors. One solution is to add another parameter to express waveform shape, as in [9]. However, the cost of gate characterization increases exponentially with each added parameter. Another approach is to smooth a nonmonotonic waveform using a cumulative density function-like technique. However, the smoothed waveform is still different from the shape used in gate characterization. Moreover, reshaping a waveform without considering the output loading cannot contribute to accurate timing analysis, as will be discussed in Section III-A. As far as we know, no method provides a complete solution to the waveform diversity problem in STA discussed in Section II-A, and this paper is a first attempt to tackle and solve this problem.

C. Equivalent Waveform Propagation and Its Goal

We propose a new scheme called “equivalent waveform propagation” so as to perform timing analysis overcoming the problems discussed so far. The proposed scheme derives an equivalent waveform that produces an output waveform...
matching that produced by the actual input waveform. This concept is shown in Fig. 7. The major difference from the reference-voltage-based approach is that the $0.5 V_{dd}$ crossing timing of the equivalent waveform is not necessarily the same as that of the actual waveform, whereas the conventional method insists on reproducing the $0.5 V_{dd}$ crossing timing exactly. Without the $0.5 V_{dd}$ restriction, the freedom in expressing the equivalent waveform expands considerably, which enables accurate propagation of timing information. The remaining issue is how to derive the equivalent waveform.

In order to keep compatibility with conventional STA tools, we must avoid increasing the number of table parameters. To cope with various waveforms, we should devise a generic method that is independent of the injected waveform shape. The equivalent waveform must be expressed as a typical waveform, such as the one observed when a CMOS gate drives a capacitive load, because most gates in a circuit drive capacitive loads and gate characterization is therefore performed under the assumption of such typical waveforms. In the following section, we propose a waveform-calculation method that can derive an equivalent waveform with small computational cost while keeping compatibility with conventional STA tools.

III. EQUIVALENT WAVEFORM CALCULATION

The previous section revealed that the conventional slope propagation scheme based on reference voltages did not work over diverse waveforms. We then proposed a concept of equivalent waveform propagation that is potentially able to cope with diverse waveforms. In this section, we propose a heuristic method to calculate an equivalent waveform. Practical implementation issues, such as integral calculation, are also discussed.

A. Least Square Method (LSM) Focusing on Critical Waveform Region

The problem of deriving the equivalent waveform is to find the arrival time and the slope that produce an output waveform matching the actual output waveform. The important point is that the equivalent waveform depends not only on the input waveform shape, but also on the output loading. Let us recall the example of Fig. 3. The significant estimation error in this situation comes from the fast output transitions at Gates 2 and 3. When the output load is large or the gate driving strength is weak, the output transitions become slow and the injected noise affects the output transition waveform. The way the error arises is then totally different. We, thus, have to consider the output transitions.

One of the straightforward methods to derive arrival time and slope is the LSM. However, a simple LSM just approximates the input waveform and does not consider any information on output transitions. Fig. 8 shows a typical example in which the simple LSM fails. Although the output transitions almost finish before noise injection, the LSM derives an approximated waveform that is close to the entire actual waveform, including the noisy part.

The key issue of equivalent waveform derivation is how to find a critical region that strongly affects the output waveform. As a heuristic metric to extract a critical region, we propose to use $\partial v_{out} / \partial v_{in}$, which is the gain of the output voltage ($v_{out}$) with respect to the input voltage ($v_{in}$). When the metric is small, a change in $v_{in}$ scarcely varies the output voltage. Conversely, when the metric is large, a slight change of $v_{in}$ affects $v_{out}$ considerably. Fig. 9 shows an example of the input waveform $v_{in}$, the output waveform $v_{out}$, and $\partial v_{out} / \partial v_{in}$. With this metric, we can effectively extract the critical waveform region from the input transition waveform.

The metric $\partial v_{out} / \partial v_{in}$ is transformed as follows:

$$\frac{\partial v_{out}}{\partial v_{in}} = \frac{\partial v_{out}}{\partial t} \cdot \frac{\partial t}{\partial v_{in}} = \frac{\partial v_{out}}{\partial t} \cdot \frac{1}{\frac{\partial v_{in}}{\partial t}} \quad (1)$$

We can calculate the value of $\partial v_{out} / \partial v_{in}$ from $v_{in}(t)$ and $v_{out}(t)$. Note that the gain curve obtained by dc analysis differs from (1), because dc analysis cannot consider the conditions of output loading and driving strength. STA methods usually have the waveforms of \(v_{\text{in}}(t)\) and \(v_{\text{out}}(t)\) irrespective of the gate delay model, e.g., the k-factor (nonlinear) model [2] or the Thevenin-equivalent circuit model [3]. Strictly speaking, when the k-factor (nonlinear) model is used, \(v_{\text{out}}(t)\) can be built from the information on propagation delay and output slope. Therefore, no additional information is necessary to calculate the metric.
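As a rough illustration of how the metric could be evaluated from sampled waveforms, the sketch below approximates both time derivatives by finite differences and takes their ratio. The function name, the use of the magnitude (so that an inverting gate yields a nonnegative value), and the choice to return zero where the input is not changing are assumptions made for this example; the toy waveforms are hypothetical.

```python
def gain_metric(times, v_in, v_out):
    """Approximate |dv_out/dv_in| on each sample interval as
    |(dv_out/dt) / (dv_in/dt)| using finite differences."""
    metric = []
    for i in range(len(times) - 1):
        dt = times[i + 1] - times[i]
        dvin_dt = (v_in[i + 1] - v_in[i]) / dt
        dvout_dt = (v_out[i + 1] - v_out[i]) / dt
        # Assumption: where the input is flat the ratio is undefined,
        # so the interval is given zero weight.
        metric.append(0.0 if dvin_dt == 0 else abs(dvout_dt / dvin_dt))
    return metric


# Hypothetical data: input ramps 0 -> 1 over 100 ps, while the output of an
# inverting gate falls from 1 to 0 between 30 ps and 50 ps.
times = [10.0 * i for i in range(11)]
v_in = [i / 10 for i in range(11)]
v_out = [1.0, 1.0, 1.0, 1.0, 0.5] + [0.0] * 6
metric = gain_metric(times, v_in, v_out)
print(metric)
```

In this toy case the metric is nonzero only on the two intervals where the output is switching, which is the kind of critical region the metric is meant to single out.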

We then devise an improved objective function of the LSM using the metric of \(\frac{\partial v_{\text{out}}}{\partial v_{\text{in}}}\) as

\[
\int_{t_1}^{t_2} \left( \frac{\partial v_{\text{out}}}{\partial v_{\text{in}}} \right) [f(t) - g(t)]^2 \, dt \quad (2)
\]

where \(g(t)\) is the actual waveform of the gate input and \(f(t)\) is the equivalent waveform. We use the input waveform approximated by the conventional reference-voltage-based approach as \(v_{\text{in}}\). When crosstalk noise is induced, the noiseless waveform is used for \(v_{\text{in}}\). Time \(t_1\) is the timing when the input transition starts, and time \(t_2\) is the timing when the input transition finishes. The time window between \(t_1\) and \(t_2\) should not include the regions before and after the input transition. This is because the metric of \(\frac{\partial v_{\text{out}}}{\partial v_{\text{in}}}\) cannot be defined when \(\frac{\partial v_{\text{in}}}{\partial t}\) is 0. We search for the equivalent waveform that minimizes (2) with two variables, the arrival time and the slope of \(f(t)\). Please note that the expression of \(f(t)\) should be the same as the waveform used in gate characterization, but the proposed method does not limit the expression itself. We can use ramp, exponential, or mixed expressions as long as a single parameter expresses the waveform shape.

The procedure of our method is summarized as follows.

1. Calculate the approximated input waveform by the conventional reference-voltage-based approach. In this step, noiseless transition waveform is used.
2. Calculate the output waveform corresponding to the approximated input waveform using look-up tables. The metric of \(\frac{\partial v_{\text{out}}}{\partial v_{\text{in}}}\) is calculated using the input and output waveforms derived in Steps 1 and 2.
3. Set the input waveform calculated in Step 1 as the initial approximated waveform, and minimize (2).

In the minimization, the current implementation of the proposed method does not update \(\frac{\partial v_{\text{out}}}{\partial v_{\text{in}}}\), because in our experiments we did not observe a notable accuracy improvement from recalculating \(\frac{\partial v_{\text{out}}}{\partial v_{\text{in}}}\).

The proposed method does not need any additional information; it uses only the information that every STA tool already has. Our method, therefore, requires no library extension and no additional gate characterization, and it is easy to implement in existing STA tools.
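The weighted objective (2) and its minimization can be sketched as follows. This is only an illustration under stated assumptions: a saturated ramp stands in for \(f(t)\), the integral is approximated by the trapezoidal rule, and an exhaustive search over candidate start times and ramp widths replaces the actual optimizer. All function names and the toy data are hypothetical.

```python
def ramp(t, t_s, t_r, vdd=1.0):
    """Saturated ramp: starts rising at t_s and reaches vdd after t_r."""
    return vdd * min(1.0, max(0.0, (t - t_s) / t_r))


def weighted_cost(times, g, w, t_s, t_r, vdd=1.0):
    """Trapezoidal approximation of the integral of w(t) * (f(t) - g(t))**2."""
    err = [wi * (ramp(t, t_s, t_r, vdd) - gi) ** 2
           for t, gi, wi in zip(times, g, w)]
    return sum(0.5 * (err[i] + err[i + 1]) * (times[i + 1] - times[i])
               for i in range(len(times) - 1))


def fit_equivalent(times, g, w, starts, widths, vdd=1.0):
    """Exhaustive search (a stand-in for a real optimizer) over (t_s, t_r)."""
    candidates = [(s, r) for s in starts for r in widths]
    return min(candidates,
               key=lambda p: weighted_cost(times, g, w, p[0], p[1], vdd))


# Toy data: a ramp starting at 20 ps with a 40-ps width, disturbed by a
# triangular dip centered at 60 ps. The weight is nonzero only between
# 20 ps and 44 ps, mimicking an output that switches before the noise arrives.
times = list(range(0, 201, 2))
g = [ramp(t, 20, 40) - 0.3 * max(0.0, 1.0 - abs(t - 60) / 10.0) for t in times]
w = [1.0 if 20 <= t <= 44 else 0.0 for t in times]

best = fit_equivalent(times, g, w, range(0, 61, 2), range(10, 101, 5))
plain = fit_equivalent(times, g, [1.0] * len(times),
                       range(0, 61, 2), range(10, 101, 5))
print(best, plain)
```

With the weight confined to the region before the dip, the search returns the undisturbed ramp parameters, whereas the unweighted fit also has to account for the dip.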

B. Integration Issues

In Step 3, we integrate over the time interval from \(t_1\) to \(t_2\). When the functional expressions of both \(f(t)\) and \(g(t)\) are known and we use a series expansion technique, we can calculate (2) without numerical integration. We think this situation is common. That is, when we calculate the actual waveforms by using PRIMA [1], or other similar techniques, the typical waveform expression consists of several exponential and linear terms, which are easy to integrate. On the other hand, when \(f(t)\) is defined numerically, or when the series expansion is difficult, we should perform numerical integration.

When we cannot avoid numerical integration, a tighter integration range that does not degrade accuracy is desirable from the viewpoint of computational cost. As for \(t_1\), it is reasonable to set \(t_1\) to the timing when the input transition starts. The issue here is how to decide \(t_2\). We want to match the output waveforms that are produced by the actual and the equivalent input waveforms. Therefore, even while the input is changing, the input waveform in the time region after the output transition finishes is unimportant. We, hence, should choose as \(t_2\) the earlier of: 1) the timing when the input transition finishes or 2) the timing when the output transition finishes. This is consistent with the behavior of \(\frac{\partial v_{\text{out}}}{\partial v_{\text{in}}}\), i.e., \(\frac{\partial v_{\text{out}}}{\partial v_{\text{in}}}\) and the integrand become zero after the output transition finishes. In the next section, we experimentally verify that this policy is reasonable and helps reduce the computational cost of numerical integration. We also discuss the number of split segments in numerical integration.
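The window selection described above might be sketched as follows for a rising input and a falling output. The 1% settling tolerance, the function names, and the sample data are assumptions introduced for illustration only.

```python
def settle_time(times, values, is_settled):
    """First sample time after which every later sample satisfies is_settled."""
    last_bad = max((i for i, v in enumerate(values) if not is_settled(v)),
                   default=-1)
    return times[min(last_bad + 1, len(times) - 1)]


def integration_window(times, v_in, v_out, vdd, tol=0.01):
    """Return (t1, t2) for a rising input and a falling output.

    t1: last sample before the input leaves its initial level.
    t2: the earlier of 'input transition finished' and
        'output transition finished'.
    """
    first_moving = next(i for i, v in enumerate(v_in) if v > tol * vdd)
    t1 = times[max(first_moving - 1, 0)]
    t_in_end = settle_time(times, v_in, lambda v: v >= (1.0 - tol) * vdd)
    t_out_end = settle_time(times, v_out, lambda v: v <= tol * vdd)
    return t1, min(t_in_end, t_out_end)


# Hypothetical samples: the input needs 100 ps to rise, but the output has
# already finished falling at 50 ps.
times = list(range(0, 101, 10))
v_in = [t / 100 for t in times]
v_out = [1.0, 1.0, 1.0, 1.0, 0.5] + [0.0] * 6
window = integration_window(times, v_in, v_out, vdd=1.0)
print(window)
```

Here the upper limit is set by the output transition rather than by the input transition, so the second half of the input ramp is excluded from the integration.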

IV. EXPERIMENTAL RESULTS

We experimentally verify the proposed method under three conditions: crosstalk noise is induced, resistive shielding is prominent, and wire inductance is dominant.

We first explain the expression of the equivalent waveform used in the experiments. We use a waveform expression composed of a linear function (0–60%) and an exponential function (60% and above) with a single parameter \(T_{12}\) [11]. The parameter \(T_{12}\) is originally defined as the crossing time difference between 0.4\(V_{dd}\) and 0.6\(V_{dd}\). The rise waveform \(f_{\text{rise}}\) is expressed as

\[
f_{\text{rise}} = \begin{cases}
0, & 0 \leq t \leq t_s \\
V_{dd} \dfrac{0.2(t-t_s)}{T_{12}}, & t_s < t \leq t_s + 3T_{12} \\
V_{dd} \left( 1 - 0.4e^{-\frac{t-t_s-3T_{12}}{2T_{12}}} \right), & t_s + 3T_{12} < t
\end{cases} \quad (3)
\]

where \(V_{dd}\) is the power supply voltage and \(t_s\) is the offset time when the voltage begins to rise. We experimentally verify that this expression is close to actual transition waveforms as far as a single parameter is used, and hence we adopt this expression as the shape of the equivalent waveform. Please note that the proposed scheme of the equivalent waveform propagation is independent of the waveform definition as explained in Section III-A. Other waveform expressions also can be used as the equivalent waveform expression.
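A small sketch of this piecewise waveform is given below. The exponential branch is written with its time origin at the junction \(t_s + 3T_{12}\); this is an assumption chosen so that both the value (0.6\(V_{dd}\)) and the slope are continuous where the linear and exponential parts meet. The function name and the numerical values are illustrative.

```python
import math


def f_rise(t, t_s, t12, vdd=1.2):
    """Linear up to 0.6*vdd, exponential above, with single parameter t12."""
    if t <= t_s:
        return 0.0
    if t <= t_s + 3.0 * t12:
        return vdd * 0.2 * (t - t_s) / t12
    # Assumption: the exponent is measured from the junction t_s + 3*t12.
    return vdd * (1.0 - 0.4 * math.exp(-(t - t_s - 3.0 * t12) / (2.0 * t12)))


# Illustrative values: t_s = 10 ps, T12 = 20 ps, Vdd = 1.2 V.
t_s, t12, vdd = 10.0, 20.0, 1.2
v_40 = f_rise(t_s + 2.0 * t12, t_s, t12, vdd)    # 0.4*Vdd crossing
v_60 = f_rise(t_s + 3.0 * t12, t_s, t12, vdd)    # 0.6*Vdd crossing
v_after = f_rise(t_s + 3.0 * t12 + 1e-6, t_s, t12, vdd)
print(v_40 / vdd, v_60 / vdd, v_after / vdd)
```

The 0.4\(V_{dd}\) and 0.6\(V_{dd}\) crossings are separated by exactly \(T_{12}\), which matches the definition of the parameter given above.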

We performed the minimization in Step 3 explained in Section III-A by Levenberg–Marquardt method [10].

A. Capacitive Coupling

We first evaluate the accuracy of the proposed method against crosstalk-induced noisy waveforms. Fig. 2 shows the experimental circuit. We assume that accurate noise waveforms are given by some existing method. We suppose that the conventional method propagates the final timing of crossing 0.5\(V_{dd}\) and the slope of the noiseless waveform in STA. We evaluate both the proposed method and the conventional method. Fig. 10 shows the result under the same condition as Fig. 3. The derived equivalent waveform and the actual waveform do not cross each other at the final timing when the actual waveform crosses 0.5\(V_{dd}\), as we expected. Fig. 11 shows the metric of \(\partial v_{\text{out}}/\partial v_{\text{in}}\) that corresponds to Fig. 10. With the metric, the proposed method focuses on the important region before the fall transition at Gate 2 finishes. Thus, the proposed method overcomes the drawback of the simple least-square fitting shown in Fig. 8.

Fig. 10. Crosstalk-induced waveform and the equivalent waveform (same condition as Fig. 3).

We next evaluate the crosstalk-induced variation of the propagation delay from Gate 2 input to Gate 3 output, and we verify the estimation accuracy of delay variation. In STA, the output waveform of Gate 2 is used for the delay estimation of Gate 3. From this point of view, we should evaluate the accuracy of the output waveform of Gate 2. However, the output waveform of Gate 2 may contain some amount of distortion caused by induced noise at the input. We, therefore, evaluate the arrival time at Gate 3 output, which is our interest and can be explicitly calculated, assuming that the waveform shape is reshaped by Gate 3 and the waveform shapes at Gate 3 output become almost the same except the arrival time is different. We can see that this assumption is reasonable from Fig. 10 and the figures that will be shown in the following part of this paper.

The transition timing of the aggressor (Gate 4) input is varied with 5-ps time step. This means that the noise injection timing is changed with 5-ps time step. The transition time of the aggressor input is 100 ps. We change the timing of inducing noise waveform, driver strength of each gate, and the values of C1, C2, and C3. The evaluated configurations are as follows:

- Gate 1: 4x, 8x.
- Gates 2 and 3: 1x, 4x, 8x, 16x (changed simultaneously).
- Gates 4 and 5: 16x.
- C1 and C2: 10 fF, 100 fF (changed simultaneously).
- C3: 10 fF.
- range of noise injection timing: 300 ps with 5-ps step.

The total number of evaluated configurations is 976. In this paper, we basically assume a proper design, which means that too large crosstalk noise is eliminated. We think that a design guideline does not basically allow such a large noise that is amplified by the receiver gate. Under this assumption, we chose the experimental conditions such that the maximum crosstalk noise becomes 33% of supply voltage.
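The count of 976 can be reproduced by enumerating the listed options, assuming that the 300-ps range of noise injection timing is sampled at both end points (61 timings). The zero origin of the timing axis in the sketch is arbitrary.

```python
from itertools import product

gate1 = ["4x", "8x"]
gates_2_3 = ["1x", "4x", "8x", "16x"]   # changed simultaneously
c1_c2_fF = [10, 100]                    # changed simultaneously
# Assumption: both ends of the 300-ps range are included (61 points).
noise_timing_ps = range(0, 301, 5)

configs = list(product(gate1, gates_2_3, c1_c2_fF, noise_timing_ps))
print(len(configs))
```

Gates 4 and 5 and \(C_3\) each take a single value, so they do not multiply the count.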

Fig. 12 shows one of the experimental results. The horizontal axis is the noise-injection timing, and the vertical axis is the variation of delay time caused by crosstalk noise. The curve evaluated by the conventional method changes abruptly, although the actual curve changes smoothly. This abrupt change comes from the conventional definition of the LAT. The LAT varies discontinuously even though the crosstalk-induced waveforms are almost the same. The conventional method, thus, has a serious problem. On the other hand, the proposed method estimates the delay variation curve as we expected. The maximum estimation error of delay variation is reduced
Table I

<table>
<thead>
<tr>
<th>Error [ps]</th>
<th>Proposed</th>
<th>Conventional</th>
</tr>
</thead>
<tbody>
<tr>
<td>Maximum</td>
<td>16</td>
<td>36</td>
</tr>
<tr>
<td>Average</td>
<td>1.6</td>
<td>2.4</td>
</tr>
<tr>
<td>Standard Deviation</td>
<td>2.2</td>
<td>8.2</td>
</tr>
</tbody>
</table>

The reference voltages used for delay change evaluation in the conventional method are 0.4\(V_{dd}\), 0.5\(V_{dd}\), and 0.6\(V_{dd}\). We show another result in Fig. 13. The maximum error decreases from 36 to 16 ps, and the standard deviation decreases from 5.2 to 2.2 ps. We can see that the proposed method improves the timing estimation for noisy input waveforms. We also observe that adjusting the reference voltage does not help solve the waveform diversity problem.

We also examined early mode noise that reduces delay time. We inject fall transitions to Gate 4, and evaluate the delay variation at Gate 3 output. Fig. 15 shows an example. The proposed method estimates delay variation accurately. The equivalent waveform provides the accurate output waveforms at Gates 2 and 3.

We next examine gates other than inverters. We replace Gate 2 with an AOI21 gate and a NAND2 gate, whose driving strength is 4x. The result for the AOI21 gate is shown in Fig. 17. The maximum error is reduced from 32 to 5 ps. The result for the NAND2 gate is similar, and the maximum error decreases from 34 to 5 ps. We also evaluate skewed gates whose PMOS/NMOS ratio is intentionally unbalanced. We use two skewed inverters for the experiment. The P/N ratios are 8.6 μm/2.3 μm and 2.9 μm/7.0 μm. The P/N ratio of the normal inverter (1x) in the library is 2.15 μm/1.75 μm. In both cases, the maximum error decreases, from 36 to 9 ps and from 24 to 14 ps, respectively. In the case that the NMOS is strong, the error of the proposed method is larger. A logic threshold voltage much lower than $V_{dd}/2$ might be one of the reasons.

We can see that the proposed method works with usual single-stage gates. However, we know that our heuristic method for equivalent waveform derivation cannot handle multistage gates in its current form, because the metric of $\partial v_{\text{out}}/\partial v_{\text{in}}$ does not provide the critical region of the input waveform. Some improvement is necessary to cope with multistage gates.

We also verify the effectiveness of the proposed method against the interconnect with two aggressors. Fig. 18 shows the results. As you see, the proposed method works well in the same procedure even when there are multiple aggressors.

We finally show a result for the case in which we use a ramp waveform as the equivalent waveform shape instead of (3). The experimental condition is the same as those of Figs. 3 and 11. Fig. 19 indicates that the proposed method with the ramp waveform shape also estimates the output waveforms at Gates 2 and 3 accurately.

B. Resistive Shielding

We examine the effectiveness of the proposed method in resistive shielding. The experimental circuit is shown in Fig. 20. We assume intermediate interconnects in the 0.10-$\mu$m process predicted by ITRS [7]. The interconnect parameters of resistance and capacitance are 0.74 $\Omega$/$\mu$m and 0.20 fF/$\mu$m. The interconnect length between Gates 1 and 2 is 100 $\mu$m. The length of the branch part, i.e., the interconnect between Gates 1 and 4, is varied and the variations are 100, 500, 1000, 2000, and 3000 $\mu$m. This branch interconnect strains the input waveform of Gate 2. Gate 1 is 4x or 8x inverters. Gates 2 and 3 are 1x or 4x inverters, and the load capacitances $C_1$ and $C_2$ are 1, 10, 50, or 100 fF. The total number of evaluation is 80. Table II lists the statistics of the estimation error. The maximum error of the proposed method is 15 ps, whereas that of the conventional method is 31 ps. The proposed method reduces the amount of error by more than 50%. The standard deviation decreases from 6.4 to 3.1 ps.
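The figures quoted above can be cross-checked with a few lines of arithmetic. The sketch enumerates the listed options, evaluates the relative reduction of the maximum error, and computes the lumped total resistance and capacitance of each branch length. The lumped totals are shown only to indicate the scale of the branch loading; they are not a delay model and are not taken from the experiments.

```python
from itertools import product

R_PER_UM = 0.74   # ohm/um, from the text
C_PER_UM = 0.20   # fF/um, from the text
branch_um = [100, 500, 1000, 2000, 3000]

# Lumped totals per branch length: (ohms, fF). Illustrative only.
totals = {length: (R_PER_UM * length, C_PER_UM * length)
          for length in branch_um}

gate1 = ["4x", "8x"]
gates_2_3 = ["1x", "4x"]
c1_c2_fF = [1, 10, 50, 100]
cases = list(product(branch_um, gate1, gates_2_3, c1_c2_fF))

# Relative reduction of the maximum error (31 ps -> 15 ps).
reduction = (31 - 15) / 31
print(len(cases), round(reduction, 3), totals[3000])
```

The enumeration gives 80 cases, and the relative reduction of the maximum error is slightly above one half, consistent with the statement in the text.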

Fig. 21 shows an example in which the conventional method does not work well. The output waveform of Gate 3 obtained with the proposed method is very close to the actual waveform, whereas the conventional reference-voltage-based method causes an error of 16 ps. When resistive shielding is significant, the proposed equivalent-waveform scheme provides more accurate timing analysis than the conventional method.

Fig. 22. Equivalent waveform calculation against inductive interconnect (Gate 1 is 5x, Gates 2 and 3 are 4x, wire length is 3 mm, $C_1$ and $C_2$ are 200 fF).

<table>
<thead>
<tr>
<th></th>
<th>Proposed</th>
<th>Conventional</th>
</tr>
</thead>
<tbody>
<tr>
<td>Maximum</td>
<td>14</td>
<td>23</td>
</tr>
<tr>
<td>Average</td>
<td>3.2</td>
<td>4.9</td>
</tr>
<tr>
<td>Standard Deviation</td>
<td>3.4</td>
<td>4.9</td>
</tr>
</tbody>
</table>

**TABLE III**

STATISTICS OF ESTIMATION ACCURACY ON INDUCTIVE INTERCONNECTS

C. Inductive Interconnects

Gate input waveforms become more complicated when interconnect inductance is dominant. We experimentally verify the effectiveness of the proposed method for inductive interconnects. The experimental conditions are the same as in Section II-A. The proposed method and the conventional reference-voltage-based method are evaluated.

We show the result in Fig. 22. The gate-input waveform bends after passing the reference voltage of \(0.6V_{dd}\), so the reference-voltage-based method cannot capture the stepwise waveform well. On the other hand, thanks to the metric \(\partial v_{\text{out}}/\partial v_{\text{in}}\), the proposed method can capture the change of the input waveform. The timing estimation error is reduced from 20 to 2 ps by the proposed method.

We vary the driving strength of Gate 1, the interconnect length and \(C_1\) and \(C_2\), and evaluate the accuracy under several conditions. The configurations are as follows:

- Gate 1: 4x, 5x, 6x, 8x, 12x, and 16x.
- Interconnect length: 1, 2, and 3 mm.
- \(C_1\) and \(C_2\): 10, 50, 100, and 200 fF (changed simultaneously).

The total number of configurations is 72.
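The sweep above can be written down as a Cartesian product of the three parameter lists. The sketch below is only an illustration of that enumeration; the function name and dictionary keys are hypothetical, and it assumes, as stated in the list, that \(C_1\) and \(C_2\) always take the same value.

```python
from itertools import product

GATE1_STRENGTHS = [4, 5, 6, 8, 12, 16]   # driving strength of Gate 1 (x)
WIRE_LENGTHS_MM = [1, 2, 3]              # interconnect length in mm
LOAD_CAPS_FF = [10, 50, 100, 200]        # C1 = C2 in fF (changed simultaneously)


def sweep_configurations():
    """Enumerate every (Gate 1 strength, wire length, load) combination."""
    return [
        {"gate1": g, "length_mm": length, "c1_ff": c, "c2_ff": c}
        for g, length, c in product(GATE1_STRENGTHS, WIRE_LENGTHS_MM, LOAD_CAPS_FF)
    ]


if __name__ == "__main__":
    configs = sweep_configurations()
    print(len(configs))  # 6 * 3 * 4 combinations
```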

The statistical summary of the estimation accuracy is shown in Table III. We confirm that the proposed method provides accurate output waveforms in most cases. However, we found a case in which the accuracy of our method is degraded. Fig. 23 shows the case that produces the maximum error. There is an overshoot at the Gate 2 input. The proposed method sets \(t_2\) in (2) to the time at which the input first crosses \(V_{dd}\) and calculates the equivalent waveform. Since the gate voltage exceeds \(V_{dd}\) after \(t_2\), the falling output transition of Gate 2 is accelerated, which results in a timing estimation error at the Gate 3 output.

D. Accuracy Versus Number of Segments in Integration

The calculation of (2) in equivalent waveform derivation can be done analytically if the waveform expressions are easily integrated and/or we use a series expansion technique. In a more general case, however, numerical integration is performed. In this case, the integral region of (2), the number of segments used in numerical integration, and the accuracy are tightly related. We here examine this relationship.
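As a rough illustration of how the segment count affects a numerically integrated squared-error objective, the sketch below applies the composite trapezoidal rule to a weighted squared difference between two waveforms. Everything in it is an assumption made for illustration: the integrand form, the unit weight, the RC-like "actual" waveform, and the ramp candidate are placeholders and do not reproduce the integrand of (2).

```python
import math


def trapezoid(f, t1, t2, n_segments):
    """Composite trapezoidal rule over [t1, t2] with equal segments."""
    h = (t2 - t1) / n_segments
    total = 0.5 * (f(t1) + f(t2))
    for k in range(1, n_segments):
        total += f(t1 + k * h)
    return total * h


# Hypothetical waveforms, normalized to unit swing and unit time constant.
def actual(t):
    return 1.0 - math.exp(-t)       # RC-like rising waveform


def candidate(t):
    return min(1.0, t / 2.0)        # ramp that reaches full swing at t = 2


def weight(t):
    return 1.0                      # placeholder for a sensitivity-based weight


def integrand(t):
    return weight(t) * (actual(t) - candidate(t)) ** 2


def integration_error(n_segments, t1=0.0, t2=2.0, n_reference=20000):
    """Absolute deviation from a finely sampled reference integral."""
    reference = trapezoid(integrand, t1, t2, n_reference)
    return abs(trapezoid(integrand, t1, t2, n_segments) - reference)


if __name__ == "__main__":
    for n in (3, 5, 8, 10, 20, 40):
        print(n, integration_error(n))
```

For this smooth placeholder integrand the deviation shrinks roughly with the square of the segment width, so the practical question is how few segments still meet a given error limit for the waveforms of interest.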

Section III discusses the integral region of (2) and explains how to decide \(t_2\). We verify here that this criterion is adequate. The proposed method uses \(t_2 = \min(t_{\text{in}}, t_{\text{out}})\) to integrate (2), where \(t_{\text{in}}\) (\(t_{\text{out}}\)) is the time at which the input (output) voltage has swung by \(0.9V_{dd}\). We evaluate the accuracy while varying the number of segments used for the integration. We also evaluate the case of \(t_2 = t_{\text{in}}\) for comparison. Here we assume a waveform distorted by resistive shielding. When the number of segments is three, the error is reduced from 10 to 3 ps by choosing the earlier of \(t_{\text{in}}\) and \(t_{\text{out}}\). When the error is required to be below 3 ps, six segments are necessary if we do not select the earlier timing. The proposed criterion for \(t_2\) thus helps to improve accuracy and to reduce calculation costs.
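The selection of \(t_2\) can be sketched for sampled waveforms as follows. This is a minimal illustration under stated assumptions: the waveforms are given as time/voltage lists, the gate is inverting with a rising input and a falling output, crossings are located by linear interpolation, and `v_full` is a hypothetical name for the full voltage swing that the 0.9 factor refers to.

```python
def first_crossing_time(times, volts, level, rising=True):
    """Linearly interpolated time at which the samples first cross `level`."""
    for k in range(1, len(times)):
        v0, v1 = volts[k - 1], volts[k]
        crossed = (v0 < level <= v1) if rising else (v0 > level >= v1)
        if crossed:
            frac = (level - v0) / (v1 - v0)
            return times[k - 1] + frac * (times[k] - times[k - 1])
    return None  # the level is never reached in the sampled window


def select_t2(times, v_in, v_out, v_full, swing=0.9):
    """Return min(t_in, t_out) for a rising input and a falling output."""
    # Rising input: a swing of 0.9 * v_full is reached at 0.9 * v_full.
    t_in = first_crossing_time(times, v_in, swing * v_full, rising=True)
    # Falling output: the same swing is reached at 0.1 * v_full.
    t_out = first_crossing_time(times, v_out, (1.0 - swing) * v_full, rising=False)
    candidates = [t for t in (t_in, t_out) if t is not None]
    return min(candidates) if candidates else None


if __name__ == "__main__":
    times = [0.0, 1.0, 2.0, 3.0, 4.0]
    v_in = [0.0, 0.3, 0.6, 0.9, 1.2]     # slow rising input (made-up samples)
    v_out = [1.2, 1.2, 0.8, 0.1, 0.0]    # output falls before the input finishes
    print(select_t2(times, v_in, v_out, v_full=1.2))
```

In the made-up samples above the output completes its swing before the input does, so the earlier time comes from the output waveform.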

We next evaluate the number of required segments. We assume two types of waveforms: one for which the output load is purely capacitive, and one for which resistive shielding is significant. Table IV shows the relationship between accuracy and the number of segments. The column “Conventional” gives the error when the conventional reference-voltage-based method is used. When resistive shielding is negligible, only three segments are required. This is consistent with the fact that the conventional reference-voltage-based approach has worked well so far. However, when the effect of resistive shielding becomes strong, the conventional method fails and the error is over 10%. On the other hand, the proposed method with eight-segment integration achieves a small error of 1%.

The above discussion assumes that no crosstalk noise is induced. In order to capture the effect of crosstalk noise, we need some evaluation points within the interval in which noise is injected. We therefore decide the number of segments according to the noise width, where the definition of noise width is found in [12]. In our experiments, four to five evaluation points are adequate. There are two requirements on the time step \( \Delta t \) in numerical integration: the time step decided by the transition waveform without crosstalk noise, \( \Delta t_{\text{tran}} \), and the time step determined by the crosstalk noise width, \( \Delta t_{\text{noise}} \). We then choose \( \Delta t = \min(\Delta t_{\text{tran}}, \Delta t_{\text{noise}}) \).
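The time-step rule can be sketched in a few lines. The way each candidate step is derived here is an assumption for illustration: \( \Delta t_{\text{tran}} \) is taken as the integration window divided by the number of segments, and \( \Delta t_{\text{noise}} \) as the noise width divided by the number of evaluation points placed within the noise. The function and parameter names are hypothetical.

```python
def integration_time_step(window_ps, n_segments, noise_width_ps=None,
                          n_noise_points=4):
    """Return the time step in ps as min(dt_tran, dt_noise).

    window_ps       -- length of the integration window (assumed input)
    n_segments      -- segments required by the noise-free transition
    noise_width_ps  -- crosstalk noise width, or None if no noise is injected
    n_noise_points  -- evaluation points wanted within the noise width
    """
    dt_tran = window_ps / n_segments
    if noise_width_ps is None:
        return dt_tran
    dt_noise = noise_width_ps / n_noise_points
    return min(dt_tran, dt_noise)


if __name__ == "__main__":
    # Made-up numbers: a 300 ps window with and without 120 ps wide noise.
    print(integration_time_step(300.0, 3))
    print(integration_time_step(300.0, 3, noise_width_ps=120.0))
```

With the made-up numbers above, narrow noise forces a finer step than the noise-free transition alone would need, which is why the segment count rises for crosstalk-induced waveforms.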

In the case of inductive interconnects, the time of flight is an important factor. When the time of flight is much shorter than the transition time, the inductive effect scarcely appears. On the other hand, when the time of flight is comparable to or larger than the transition time, we need some evaluation points within the time of flight.

### E. Computation Costs Versus Number of Segments in Integration

We implemented the proposed method in an STA tool and evaluated the calculation costs. The delay calculation procedure of the implemented STA tool is as follows. Gate delay calculation is executed using a Thevenin equivalent circuit model [3]. Interconnect RC trees are first reduced to a \( \pi \) circuit [13], which is used to calculate the effective capacitance [14] and the gate output waveform. The output waveforms of interconnects are calculated from the gate-output waveform and a quadratic transfer function, which is calculated by the method of [15]. In minimizing (2), three to five iterations are needed.
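For readers unfamiliar with the \( \pi \) reduction step, the sketch below shows one widely used moment-matching construction: the first three moments of the driving-point admittance, \(Y(s) \approx y_1 s + y_2 s^2 + y_3 s^3\), are matched by a near-end capacitor, a resistor, and a far-end capacitor. Whether this is exactly the formulation of [13] is an assumption, and the function names are hypothetical.

```python
def pi_model_from_moments(y1, y2, y3):
    """Return (c_near, r, c_far) matching three admittance moments.

    Assumes y2 < 0 and y3 > 0, as holds for RC interconnect.
    """
    c_far = y2 * y2 / y3
    r = -(y3 * y3) / (y2 ** 3)
    c_near = y1 - c_far
    return c_near, r, c_far


def moments_of_pi(c_near, r, c_far):
    """Admittance moments of a pi circuit: c_near in parallel with (r + c_far).

    Y(s) = s*c_near + s*c_far / (1 + s*r*c_far), expanded to third order in s.
    """
    return c_near + c_far, -r * c_far ** 2, r ** 2 * c_far ** 3


if __name__ == "__main__":
    # Round trip on a made-up pi circuit: 20 fF, 500 ohm, 80 fF.
    y1, y2, y3 = moments_of_pi(20e-15, 500.0, 80e-15)
    print(pi_model_from_moments(y1, y2, y3))
```

The round trip is a convenient self-check: a circuit that is already a \( \pi \) model must be recovered unchanged from its own moments.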

We evaluated the computational costs of the proposed method. Note that, in this evaluation, we execute numerical integration of (2) as the worst case; when (2) can be calculated analytically, the increase in calculation cost is much smaller. The circuit used for the experiment is a simple inverter chain. Table V shows the experimental results. The calculation cost is normalized by that of the conventional reference-voltage-based implementation. We evaluate only the calculation time required for timing propagation, excluding the time for reading and writing files and for RC reduction. When resistive shielding is not significant and crosstalk noise is not injected, the required number of segments is three, which corresponds to a 12% increase in computation cost. When resistive shielding is significant, the increase is about 25%. As far as we have investigated, the required number of segments for crosstalk-induced waveforms is around 10 to 15. We therefore conclude that the proposed method can provide accurate timing analysis with a CPU time increase of 15%–30% at most.

### V. Conclusion

In this paper, we proposed a new scheme called “equivalent waveform propagation” to capture diverse gate-input waveforms for accurate gate delay calculation. To realize the proposed scheme, we developed an equivalent waveform calculation method based on the LSM. With the metric developed to extract the critical region, the proposed calculation method can derive the equivalent waveform successfully. The proposed method requires no library extension and no additional characterization, which means that it conforms well to conventional STA tools. We experimentally verified that the proposed method is more accurate in delay calculation than the conventional reference-voltage-based approach under various conditions: when resistive shielding is significant, when crosstalk noise is injected, and when the interconnect is inductive. The proposed scheme of “equivalent waveform propagation” is promising in nanometer technologies.

### Table IV

<table>
<thead>
<tr>
<th></th>
<th>Proposed (#segments)</th>
<th>Conventional</th>
</tr>
</thead>
<tbody>
<tr>
<td>w/o res. shielding</td>
<td></td>
<td>1.6</td>
</tr>
<tr>
<td>w/ res. shielding</td>
<td></td>
<td>11.3</td>
</tr>
</tbody>
</table>

### Table V

<table>
<thead>
<tr>
<th>#segments</th>
<th>3</th>
<th>5</th>
<th>8</th>
<th>10</th>
<th>20</th>
<th>40</th>
</tr>
</thead>
<tbody>
<tr>
<td>calculation costs</td>
<td>1.12</td>
<td>1.17</td>
<td>1.27</td>
<td>1.48</td>
<td>1.71</td>
<td></td>
</tr>
</tbody>
</table>

### REFERENCES


Masanori Hashimoto (S’00–A’01–M’03) received the B.E., M.E., and Ph.D. degrees in communications and computer engineering from Kyoto University, Kyoto, Japan, in 1997, 1999, and 2001, respectively. Since 2001, he has been an Instructor in the Department of Communications and Computer Engineering, Graduate School of Informatics, Kyoto University. His research interests include computer-aided design for digital integrated circuits and high-speed circuit design. Prof. Hashimoto is a member of ACM, IEICE, and IPSJ.

Yuji Yamada received the B.E. degree in electronic engineering and the M.S. degree in communication and computer engineering from Kyoto University, Kyoto, Japan, in 2001 and 2003, respectively. Since 2003, he has been with Matsushita Electric Industrial Co., Ltd., Osaka, Japan.

Hidetoshi Onodera (M’87) received the B.E., M.E., and Dr. Eng. degrees in electronic engineering from Kyoto University, Kyoto, Japan, in 1978, 1980, and 1984, respectively. He joined the Department of Electronics, Kyoto University, in 1983, where he is currently a Professor in the Department of Communications and Computer Engineering, Graduate School of Informatics. His research interests include design technologies for digital, analog, and RF LSIs, with particular emphasis on high-speed and low-power design, design and analysis for manufacturability, and system-on-chip architectures. Dr. Onodera was the Program Chair and General Chair of the ACM/IEEE International Conference on Computer-Aided Design (ICCAD) in 2003 and 2004, respectively. He has served on the Technical Program Committees for several international conferences, including DAC, DATE, ASP-DAC, and CICC. He was the Chairman of the IEEE Kansai SSCS Chapter from 2001 to 2002, and the Chairman of the Technical Group on VLSI Design Technologies, IEICE, Japan, from 2000 to 2001.