Designers of
systems-on-a-chip (SoC) must integrate a wide variety of
intellectual property (IP). This includes all kinds of memory, logic
and control functions, with many memory sizes and types now being
placed on chip. The upshot is that IC design, once logic dominant,
has become memory dominant. In fact, it is not unusual to see more
than 100 different instances of SRAM, ROM and other specialty
memories on a single SoC.
But to use these embedded memories in IC designs, accurate timing
and power models are essential. And to generate such models,
characterization is required. This often implies many simulations
with updated Spice models.
At the same time, technology is moving rapidly from 0.25 micron
to 0.18, 0.15 and 0.13 micron. Along the way, generic CMOS
technology has diversified into variants such as high speed, low
power and high density. Now, each process variant must have its own
characterization. With high-volume products, it is increasingly
common to manufacture a part at several foundries to meet customer
needs. Since each process variant at each foundry is slightly
different, it is necessary to recharacterize memory for the specific
process and foundry to be used. That's why a tool that automatically
characterizes embedded memory has become indispensable.
Memory Characterization
Memory characterization is a very time-consuming and error-prone
process. Due to the complexities and varieties of the timing
parameters involved, it has become a key part in the memory compiler
development flow as shown in Figure 1.
Figure 1: Memory Compiler Development Flow
Typically, the characterization process builds timing models,
which can be applied to all memory instances in the compiler.
Variations on memory instances could be, among others, word count,
bit count or aspect ratio. Memory is usually characterized in the
following steps:
1. Select corner instances. Typical selections are based upon
criteria such as small, large, narrow and wide.
2. Characterize the selected instances.
3. Build the lookup table or equations by fitting curves.
4. Generate the timing model.
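As a rough illustration of steps 1 through 4, the sketch below fits a
bilinear lookup over four corner instances and uses it to estimate the
access time of an instance in between. The corner sizes and timing
values are made-up placeholders, and the names Corner and fit_bilinear
are hypothetical; real compilers typically use more corners and
higher-order fits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Corner:
    words: int
    bits: int
    access_ns: float  # characterized access time (placeholder value)


def fit_bilinear(corners):
    """Return t(words, bits) that interpolates four corner instances."""
    ws = sorted({c.words for c in corners})
    bs = sorted({c.bits for c in corners})
    if len(corners) != 4 or len(ws) != 2 or len(bs) != 2:
        raise ValueError("expected the four corners of a words x bits rectangle")
    t = {(c.words, c.bits): c.access_ns for c in corners}
    (w0, w1), (b0, b1) = ws, bs

    def model(words, bits):
        u = (words - w0) / (w1 - w0)  # 0 at the small corner, 1 at the large one
        v = (bits - b0) / (b1 - b0)   # 0 at the narrow corner, 1 at the wide one
        return ((1 - u) * (1 - v) * t[(w0, b0)] + u * (1 - v) * t[(w1, b0)]
                + (1 - u) * v * t[(w0, b1)] + u * v * t[(w1, b1)])

    return model


# Steps 1 and 2: small/large x narrow/wide corners with placeholder timings.
corners = [
    Corner(256, 8, 1.0),
    Corner(4096, 8, 2.0),
    Corner(256, 64, 1.5),
    Corner(4096, 64, 3.0),
]
# Steps 3 and 4: build the lookup model, then query it for any instance.
access_time = fit_bilinear(corners)
print(access_time(2176, 36))
```

Any instance that is not a corner receives an interpolated value, so
its accuracy depends on how well the fit tracks the real layout.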
The conventional way of characterizing memory is a two-step
method. First, manually build a parameterized composite Spice
netlist of the tileable blocks. The major memory blocks, such as
column, row and memory array are parameterized as functions of
number of words, bits per word and multiplexing. Second, iteratively
run simulations over the netlist and obtain timing information over
the range of parameters.
The conventional method separately builds the netlists of major
blocks for simulation instead of treating the memory instance as an
entity from layout extraction. To extend the model to different
instance variations, compiler designers need to develop a
parameterized method for timing estimation, which is sometimes
called a black box model.
Moreover, the conventional method is always based on the
following simplifying assumptions:
1. A memory can be segmented: the timing of a memory instance is
just the summation of the timing of its major blocks. This
assumption can lead to ignoring coupling and distributed effects,
which become more pronounced for more advanced technology.
2. A memory can be parameterized: the Spice netlist of major blocks
can be made as a function of words, bits per word and multiplexing
to accurately reflect the actual layout over a broad range of
multidimensional variations. The actual memory circuit is often
nonlinear, and such things as special power and spacer cells add to
this nonlinearity. If a mistake has been made in parameterization,
it will likely not be detected.
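The first assumption amounts to an additive model, which can be
sketched as below. The block breakdown, the function name
segmented_access_ns and every coefficient are hypothetical
placeholders chosen only to show the shape of such a model; they do
not come from any real compiler.

```python
import math


def segmented_access_ns(words, bits, mux):
    """Additive block-level estimate; all coefficients are placeholders."""
    rows = words // mux   # physical rows in the array
    cols = bits * mux     # physical columns in the array
    decoder = 0.10 + 0.02 * math.log2(rows)  # decoder depth grows with log2(rows)
    wordline = 0.0005 * cols                 # wordline RC taken as linear in columns
    bitline = 0.0008 * rows                  # bitline RC taken as linear in rows
    sense_and_output = 0.25                  # fixed sense-amp and output delay
    return decoder + wordline + bitline + sense_and_output


print(segmented_access_ns(words=1024, bits=32, mux=4))
```

Because each term depends only on its own block, coupling between
blocks has no place to appear in the sum, which is exactly the
weakness the first assumption describes.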
To produce high-quality models, the conventional method needs
numerous correlations between hand-built critical-path netlists and
the real ones extracted from the layout. For the present technology,
the task of detailed correlation is getting more difficult because
the function is multidimensional and multiordered.
To compensate for inaccuracy and inconsistency, the
conventional-style memory compiler adds large margins into its
timing models. We have found that these margins often exceed 20
percent and could make it difficult to meet design specifications
when the performance requirement is critical.
New Needs
In today's technology, with chips having a heavy memory content
built with different processes and going to multiple foundries, the
conventional approach to memory characterization needs to be
revisited. For memory compilers, a change in process models requires
resimulation of the full models. Now simulation of the various
device parameters is necessary for porting, debugging, performance
optimization and overall yield improvement.
During technology porting, foundries characterize their
compiler-generated memories. However, other kinds of embedded
memories (such as three-port SRAM) still require resimulation. When
porting from a slow technology to a fast one, setup time is normally
not a problem, but hold time may cause functional failures.
Therefore, memories require careful simulation on an instance basis.
Memory resimulation can be an overwhelming task for both
engineering and simulation. A memory compiler data sheet usually
contains 20 to 30 timing parameters, most of which require an
optimization such as bisectional analysis for accuracy and speed in
characterization. For each technology corner, there will be a huge
number of circuit parameters to simulate. And for each parameter,
the simulation stimuli and controls, measurement statements and
optimization setups require special arrangements. Therefore, if not
automated, memory characterization requires an enormous amount of
human effort.
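To make the bisectional analysis concrete, here is a minimal sketch of
a setup-time search. It assumes the pass/fail outcome is monotonic in
setup time, and it uses a hypothetical fake_simulation function as a
stand-in for a real circuit simulation run; the 0.137 ns threshold is
an invented value.

```python
def bisect_setup_time(passes, lo_ns, hi_ns, tol_ns=0.001):
    """Smallest passing setup time, to within tol_ns.

    Assumes passes(setup_ns) is False below some threshold and True
    at and above it.
    """
    if passes(lo_ns):
        return lo_ns
    if not passes(hi_ns):
        raise ValueError("no passing setup time in the search window")
    # Invariant: lo_ns fails, hi_ns passes.
    while hi_ns - lo_ns > tol_ns:
        mid = 0.5 * (lo_ns + hi_ns)
        if passes(mid):
            hi_ns = mid
        else:
            lo_ns = mid
    return hi_ns


def fake_simulation(setup_ns, true_threshold_ns=0.137):
    """Placeholder for one simulator run: did the memory capture the data?"""
    return setup_ns >= true_threshold_ns


setup = bisect_setup_time(fake_simulation, 0.0, 1.0)
print(f"setup time ~ {setup:.4f} ns")
```

With a 1 ns window and a 1 ps tolerance, this search needs 12
simulation runs, where a linear sweep at the same resolution would
need about a thousand. That saving applies to each timing parameter
at each corner.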
Not only does memory characterization need to be automatic, it
also must be universal, meaning that the same setup can be applied
to all instances of a memory compiler. With a universal approach,
automatic characterization can be rerun any time a Spice model
changes. This is particularly useful when a process gets new models
or when you need to analyze loss of yield on a particular set of
wafers.
On-Chip Memory IP Characterization
On large chip designs there are often 100 or even more blocks of
SRAM, ROM and other specialty memories. It is common for embedded
memory to be in the critical path of the design, as shown in Figure
2.
Figure 2: Embedded Memory on 'Critical-Path' of IC Design
Critical path simulation and verification are very important,
especially for high-performance IC designs. Usually, the cells and
gates along the path can be characterized easily and accurately.
However, memory IP blocks on the critical path may still be treated
as black boxes with timing models interpolated or extrapolated
from the corner instances of memory compilers. If so, it is very
difficult to optimize the critical paths.
To enhance the performance of high-speed designs, it is necessary
to have the so-called white box model from actual simulations. Since
this characterization is based on extraction and simulation of the
layout, it is more accurate than the models from lookup tables.
Thus, the dependency on technology and layout can be thoroughly
measured, especially for verification and debugging.
Legend Design Technology Inc. has developed an EDA tool called
MemChar to address the need to characterize memory IP automatically.
MemChar takes, as an input, specifications for each parameter found
in the memory data sheet. The other input needed is the extracted
netlist of the memory circuit, which can be produced by tools such
as StarRC and Arcadia. The circuit data file in Spice format
normally includes MOSFETs, resistors and capacitors. If the memory
instance is too large for extraction, the GDSII layout of the memory
array must be reduced to a ring shape for extraction. This can be
done either by a layout editor or by Legend's GDSCut program.
Legend's SpiceCut has been designed to handle ring-shaped memory
arrays and can emulate them to full-size arrays in building
critical-path netlists for characterization.
Based upon the parameter specifications from the data sheets,
MemChar can automatically generate the simulation stimulus and
controls. The controls for optimization, such as bisection models,
are very critical for the setup time, hold time, minimum clock
width, etc. Since characterization could take hundreds of
simulations, the CPU time of each one does have an impact on the
performance of the whole process. To enhance the performance,
SpiceCut is used to build the critical-path netlist for circuit
reduction and RC reduction. A number of Spice netlists are generated
with the necessary stimulus and measurement statements. Circuit
Simulation Manager is then called to run simulations automatically
in sweep loops or optimization loops. Users can specify the
preferred circuit simulator, after which timing data is obtained
from the simulation results and organized as the timing database for
the models. Figure 3 shows MemChar's process and data flow.
Figure 3: Process and Data Flow of MemChar(TM) Program
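The sweep-loop portion of such a flow can be pictured with the sketch
below, which writes one simulation deck per combination of input slew
and output load. It is not MemChar's output format. The deck uses
common HSPICE-style statements (PWL source, .tran, .measure), and the
file name, subcircuit name, port order and node names are all
hypothetical. Address and data stimulus are omitted for brevity.

```python
def pwl_clock(name, node, period_ns, slew_ns, vdd, cycles):
    """Build a piecewise-linear voltage source that toggles like a clock."""
    points = ["0n 0"]
    for k in range(cycles):
        t0 = k * period_ns + period_ns / 2  # rising edge starts mid-cycle
        t1 = (k + 1) * period_ns            # falling edge starts at cycle end
        points += [
            f"{t0:g}n 0", f"{t0 + slew_ns:g}n {vdd:g}",
            f"{t1:g}n {vdd:g}", f"{t1 + slew_ns:g}n 0",
        ]
    return f"V{name} {node} 0 PWL({' '.join(points)})"


def build_deck(netlist, slew_ns, load_pf, subckt="sram_1kx32",
               vdd=1.8, period_ns=4.0):
    """Return the text of one access-time deck (hypothetical pin order)."""
    half = vdd / 2
    lines = [
        f"* access-time run: slew={slew_ns:g}ns load={load_pf:g}pF",
        f".include '{netlist}'",
        f"Xmem clk q vdd 0 {subckt}",
        f"Vvdd vdd 0 {vdd:g}",
        pwl_clock("clk", "clk", period_ns, slew_ns, vdd, cycles=2),
        f"Cload q 0 {load_pf:g}p",
        f".tran 1p {2 * period_ns + 1:g}n",
        f".measure tran tacc trig v(clk) val={half:g} rise=1 "
        f"targ v(q) val={half:g} rise=1",
        ".end",
    ]
    return "\n".join(lines) + "\n"


# Sweep loop: one deck per (slew, load) point of the timing table.
decks = {
    (slew, load): build_deck("sram_1kx32.spx", slew, load)
    for slew in (0.05, 0.2, 0.8)
    for load in (0.01, 0.1)
}
print(decks[(0.2, 0.1)])
```

Each deck would then be handed to the chosen simulator, and the
measured tacc values collected into the timing database.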
MemChar has been developed to automate all processes in the flow,
including simulation and optimization. Although a memory compiler
can generate several tens of thousands of instances, only one
instance is needed to set up the automatic characterization flow of
MemChar. For all other configurations of the memory compiler, the
same setup can be used.
The bottleneck of memory characterization always resides in the
circuit reduction. This is the process of building critical-path
circuits for simulation. The patented SpiceCut-Memory tool has been
used for circuit reduction in many designs. To further enhance the
performance, an Asymptotic Waveform Evaluation (AWE)-based RC
reduction has been built into SpiceCut. The critical coupling
effects that are necessary for memory simulation are always taken
into account. To get optimized setup times, hold times, minimum
pulse width, etc., SpiceCut-Memory can automatically generate the
stimulus and controls of the bisection models used for optimization.
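AWE approximates the response of an RC network by matching moments of
its transfer function, and the first moment alone gives the
well-known Elmore delay. The sketch below computes that first-order
estimate for a simple RC ladder such as a segmented bitline. It is
only an illustration of the underlying idea, not the reduction used in
SpiceCut, and the segment values are placeholders.

```python
def elmore_delay(segments):
    """Elmore delay (seconds) to the far end of an RC ladder.

    segments: list of (resistance_ohm, capacitance_farad) pairs from
    driver to load; each capacitor sits on the node after its resistor.
    """
    delay = 0.0
    upstream_r = 0.0
    for r, c in segments:
        upstream_r += r          # total resistance from driver to this node
        delay += upstream_r * c  # this capacitor charges through that resistance
    return delay


# Placeholder bitline: four segments of 100 ohm and 10 fF each.
bitline = [(100.0, 10e-15)] * 4
print(f"{elmore_delay(bitline) * 1e12:.1f} ps")
```

A moment-matching reduction keeps more than this one term, which is
what lets a much smaller network reproduce the waveform of the
original one.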
Typical is the way Conexant Systems uses MemChar for both
characterization and debugging. It has developed all setup files for
characterizing its 0.18-micron memory compilers and has also verified
the timing for the memories on several products.
Recently, Conexant had a memory failure on a new part. Before
MemChar, when a product failed there were two choices: use a relaxed
circuit simulator and simulate the whole circuit without the full
analog resolution of Spice, or spend several days setting up a test
bench to run Spice on a handcrafted circuit. Neither approach is
desirable or fast.
With MemChar, the company's engineers can resimulate the failing
memory with parasitics in one day, at Spice-level accuracy. The
accurate timing models obtained from this characterization have
helped in debugging the new part. In addition to reducing the
simulation CPU time, the greatest benefit of using MemChar is
reducing engineering time.
For high-performance networking and communication designs,
MemChar is especially useful for its unique capability of performing
on-chip embedded memory characterization. Since the input is the
layout-extracted data of the exact configuration, the
characterization results directly reflect the on-chip models, not
interpolated values. Therefore, the margins can be well controlled
and system performance can be accurately simulated.