CAE + CAD = EDA

Schematic-Based Design Flows

Tools like logic simulators that were used in the front-end (logical design capture and functional verification) portion of the design flow were originally gathered together under the umbrella name of computer-aided engineering (CAE). By comparison, tools like layout (place-and-route) that were used in the back-end (physical) portion of the design flow were originally gathered together under the name of computer-aided design (CAD).

The drafting department is referred to as the “drawing office” in the UK.

CAE is pronounced by spelling it out as “C-A-E.”

CAD is pronounced to rhyme with “bad.”

For historical reasons that are largely based on the origins of the terms CAE and CAD, the term design engineer (or simply engineer) typically refers to someone who works in the front-end of the design flow; that is, someone who performs tasks like conceiving and describing (capturing) the functionality of an IC (what it does and how it does it). By comparison, the term layout designer (or simply designer) commonly refers to someone who is ensconced in the back-end of the design flow; that is, someone who performs tasks such as laying out an IC (determining the locations of the gates and the routes of the tracks connecting them together).

Sometime during the 1980s, all of the CAE and CAD tools used to design electronic components and systems were gathered under the name of electronic design automation, or EDA, and everyone was happy (apart from the ones who weren’t, but no one listened to their moaning and groaning, so that was alright).

A simple (early) schematic-driven ASIC flow

Toward the end of the 1970s and the beginning of the 1980s, companies like Daisy, Mentor, and Valid started providing graphical schematic capture programs that allowed engineers to create circuit (schematic) diagrams interactively. Using the mouse, an engineer could select symbols representing such entities as I/O pins and logic gates and functions from a special symbol library and place them on the screen. The engineer could then use the mouse to draw lines (wires) on the screen connecting the symbols together.

Once the circuit had been entered, the schematic capture package could be instructed to generate a corresponding gate-level netlist. This netlist could first be used to drive a logic simulator in order to verify the functionality of the design. The same netlist could then be used to drive the place-and-route software (Figure 8-6).
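To make the idea of a netlist driving downstream tools a little more concrete, here is a minimal Python sketch that reads the small text netlist shown in Figure 8-6 and recovers each gate and its connections. The parsing rules are inferred from that one example rather than from any documented format, and the names `parse_netlist` and `GATE_RE` are hypothetical.

```python
import re

# The example netlist from Figure 8-6, one statement per line.
NETLIST = """
BEGIN CIRCUIT=TEST
INPUT SET_A, SET_B, DATA, CLOCK, CLEAR_A, CLEAR_B;
OUTPUT Q, N_Q;
WIRE SET, N_DATA, CLEAR;
GATE G1=NAND (IN1=SET_A, IN2=SET_B, OUT1=SET);
GATE G2=NOT (IN1=DATA, OUT1=N_DATA);
GATE G3=OR (IN1=CLEAR_A, IN2=CLEAR_B, OUT1=CLEAR);
GATE G4=DFF (IN1=SET, IN2=N_DATA, IN3=CLOCK, IN4=CLEAR, OUT1=Q, OUT2=N_Q);
END CIRCUIT=TEST;
"""

# Assumed grammar, inferred from the example: GATE <name>=<TYPE> (<pin>=<net>, ...)
GATE_RE = re.compile(r"GATE\s+(\w+)\s*=\s*(\w+)\s*\(([^)]*)\)")


def parse_netlist(text):
    """Return {gate_name: (gate_type, {pin: net})} for every GATE statement."""
    gates = {}
    for name, gate_type, pin_list in GATE_RE.findall(text):
        pins = {}
        for item in pin_list.split(","):
            pin, net = item.split("=")
            pins[pin.strip()] = net.strip()
        gates[name] = (gate_type, pins)
    return gates


gates = parse_netlist(NETLIST)
for name, (gate_type, pins) in gates.items():
    print(name, gate_type, pins)
```

Both the logic simulator and the place-and-route software would start from this kind of structure: a list of gate instances plus the named wires that join their pins.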

The term CAD is also used to refer to computer-aided design tools used in a variety of other engineering disciplines, such as mechanical and architectural design.

EDA is pronounced by spelling it out as “E-D-A.”

Any timing information that was initially used by the logic simulator would be estimated (particularly in the case of the tracks), and accurate timing analysis was only possible once all of the logic gates had been placed and the tracks connecting them had been routed. Thus, following place-and-route, an extraction program would be used to calculate the parasitic resistance and capacitance values associated with the structures (track segments, vias, transistors, etc.) forming the circuit. A timing analysis program would then use these values to generate a timing report for the device. In some flows, this timing information was also fed back to the logic simulator in order to perform a more accurate simulation.

The important thing to note here is that, when creating the original schematic, the user would access the symbols for the logic gates and functions from a special library that was associated with the targeted ASIC technology.[1] Similarly, the simulator would be instructed to use a corresponding library of simulation models with the appropriate logical functionality[2] and timing for the targeted ASIC technology. The end result was that the gate-level netlist presented to the place-and-route software directly mapped onto the logic gates and functions being physically implemented on the silicon chip (this is a tad different from the FPGA flow, as is discussed in the following topic).

Figure 8-6. Simple (early) schematic-driven ASIC flow.
[Figure elements: Schematic capture; Gate-level netlist; Logic simulator (functional verification); Place-and-route; Extraction and timing analysis; Detect and fix problems.]

Gate-level netlist shown in the figure:

    BEGIN CIRCUIT=TEST
    INPUT SET_A, SET_B, DATA, CLOCK, CLEAR_A, CLEAR_B;
    OUTPUT Q, N_Q;
    WIRE SET, N_DATA, CLEAR;
    GATE G1=NAND (IN1=SET_A, IN2=SET_B, OUT1=SET);
    GATE G2=NOT (IN1=DATA, OUT1=N_DATA);
    GATE G3=OR (IN1=CLEAR_A, IN2=CLEAR_B, OUT1=CLEAR);
    GATE G4=DFF (IN1=SET, IN2=N_DATA, IN3=CLOCK, IN4=CLEAR, OUT1=Q, OUT2=N_Q);
    END CIRCUIT=TEST;

[1] There are always different ways to do things. For example, some flows were based on the concept of using a generic symbol library containing a subset of logic functions common to all ASIC cell libraries. The netlist generated from the schematic capture application could then be run through a translator that converted the generic cell names to their equivalents in the targeted ASIC library.

[2] With regard to functionality, one might expect a primitive logical entity like a 2-input AND gate to function identically across multiple libraries. This is certainly the case when “good” (logic 0 and 1) values are applied to the inputs, but things may vary when high-impedance ‘Z’ values or unknown ‘X’ values are applied to the inputs. And even with good 0 and 1 values applied to their inputs, more complex functions like D-type latches and flip-flops can behave very differently for “unusual” cases such as the set and clear inputs being driven active at the same time.

1873: England. James Clerk Maxwell describes the electromagnetic nature of light and publishes his theory of radio waves.

A simple (early) schematic-driven FPGA flow

When the first FPGAs arrived on the scene in 1984, it was natural that their design flows would be based on existing schematic-driven ASIC flows. Indeed, the early portions of the flows were very similar in that, once again, a schematic capture package was used to represent the circuit as a collection of primitive logic gates and functions and to generate a corresponding gate-level netlist. As before, this netlist was subsequently used to drive the logic simulator in order to perform the functional verification.

The differences began with the implementation portion of the flow because the FPGA fabric consisted of an array of configurable logic blocks (CLBs), each of which was formed from a number of LUTs and registers. This required the introduction of some additional steps called mapping and packing into the flow (Figure 8-7).

1874: America. Alexander Graham Bell conceives the idea of the telephone.

Mapping

In this context, mapping refers to the process of associating entities such as the gate-level functions in the gate-level netlist with the LUT-level functions available on the FPGA. Of course, this isn’t a one-for-one mapping because each LUT can be used to represent a number of logic gates (Figure 8-8).

Portion of gate-level netlist: d = a XOR b; e = NOT c; y = d XNOR e

Contents of 3-input LUT:

    a  b  c  |  y
    0  0  0  |  0
    0  0  1  |  1
    0  1  0  |  1
    0  1  1  |  0
    1  0  0  |  1
    1  0  1  |  0
    1  1  0  |  0
    1  1  1  |  1

Figure 8-8. Mapping logic gates into LUTs.
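The LUT contents in Figure 8-8 can be derived mechanically by evaluating the three gates for every input combination. The following is a minimal sketch of that idea; the function names are hypothetical, and a real mapper obviously does far more than fill in one table.

```python
from itertools import product


def gate_network(a, b, c):
    """Gate-level fragment from Figure 8-8: d = a XOR b, e = NOT c, y = d XNOR e."""
    d = a ^ b
    e = 1 - c
    return 1 - (d ^ e)


# Enumerate the inputs in the order 000, 001, ..., 111 (a is the most significant bit).
lut_contents = [gate_network(a, b, c) for a, b, c in product((0, 1), repeat=3)]


def lut_lookup(contents, a, b, c):
    """A 3-input LUT is simply an 8-entry table indexed by its inputs."""
    return contents[(a << 2) | (b << 1) | c]


print(lut_contents)  # [0, 1, 1, 0, 1, 0, 0, 1], the y column of Figure 8-8
```

Once the table has been filled in, the three gates no longer exist as separate entities; the LUT just returns the stored bit for whatever values appear on a, b, and c.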

Figure 8-7. Simple (early) schematic-driven FPGA flow.
[Figure elements: Schematic capture; Gate-level netlist (the same CIRCUIT=TEST listing as in Figure 8-6); Mapping; Packing; Place-and-route; Fully routed physical (CLB-level) netlist; Timing analysis and timing report; Gate-level netlist for simulation; SDF (timing info) for simulation.]

1875: America. Edison invents the Mimeograph.

Mapping (which is still performed today, but elsewhere in the flow, as will be discussed in later chapters) is a nontrivial problem because there are a large number of ways in which the logic gates forming a netlist can be partitioned into the smaller groups to be mapped into LUTs. As a simple example, the functionality of the NOT gate shown in Figure 8-8 might have been omitted from this LUT and instead incorporated into the upstream LUT driving wire c.
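The alternative partitioning described above can be checked by brute force. In this sketch, `upstream_lut` is a hypothetical stand-in for whatever LUT drives wire c (here it does nothing except the inversion), and `reduced_lut` is the downstream LUT with the NOT gate removed.

```python
from itertools import product


def original_lut(a, b, c):
    """One LUT absorbing all three gates: y = (a XOR b) XNOR (NOT c)."""
    return 1 - ((a ^ b) ^ (1 - c))


def upstream_lut(c):
    """Hypothetical upstream LUT that now also performs the inversion."""
    return 1 - c  # a real upstream LUT would fold this into its own logic


def reduced_lut(a, b, c_n):
    """Downstream LUT without the NOT gate: y = (a XOR b) XNOR c_n."""
    return 1 - ((a ^ b) ^ c_n)


# The overall function seen at y is unchanged by moving the inverter upstream.
same_function = all(
    original_lut(a, b, c) == reduced_lut(a, b, upstream_lut(c))
    for a, b, c in product((0, 1), repeat=3)
)

# The contents stored in the downstream LUT, however, are different.
reduced_contents = [reduced_lut(a, b, c_n) for a, b, c_n in product((0, 1), repeat=3)]
print(same_function, reduced_contents)
```

Both partitions implement the same logic, which is exactly why the mapper has so many candidate solutions to choose between.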

Packing

Following the mapping phase, the next step was packing, in which the LUTs and registers were packed into the CLBs. Once again, packing (which is still performed today, but elsewhere in the flow, as will be discussed in later chapters) is a nontrivial problem because there are myriad potential combinations and permutations. For example, assume an incredibly simple design comprising only a couple of handfuls of logic gates that end up being mapped onto four 3-input LUTs that we’ll call A, B, C, and D. Now assume that we’re dealing with an FPGA whose CLBs can each contain two 3-input LUTs. In this case we’ll need two CLBs (called 1 and 2) to contain our four LUTs. As a first pass, there are 4! (factorial four = 4 × 3 × 2 × 1 = 24) different ways in which our LUTs can be packed into the two CLBs (Figure 8-9).

Figure 8-9. Packing LUTs into CLBs.

    CLB 1  |  CLB 2
    A B    |  C D
    A B    |  D C
    A C    |  B D
    A C    |  D B
    A D    |  B C
    A D    |  C B
    B A    |  C D
    B A    |  D C
    B C    |  A D
    B C    |  D A
    B D    |  A C
    B D    |  C A
    etc.

[Figure labels: “Different permutations”; “Functionally equivalent.”]

1875: England. James Clerk Maxwell states that atoms must have a structure.

Only 12 of the 24 possible permutations are shown here (the remainder are left as an exercise for the reader). Furthermore, in reality there are actually only 12 permutations of significance because each has a “mirror image” that is functionally its equivalent, such as the AC-BD and BD-AC pairs shown in Figure 8-9. The reason for this is that when we come to place-and-route, the relative locations of the two CLBs can be exchanged.
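The counting argument above is easy to confirm by enumeration. This is only a sketch of the arithmetic: it treats each permutation of A, B, C, and D as filling the two slots of CLB 1 followed by the two slots of CLB 2, and then treats a packing and its mirror image (the contents of the two CLBs exchanged) as one.

```python
from itertools import permutations

luts = "ABCD"

# Each permutation fills the slots in order: CLB 1 gets the first two LUTs,
# CLB 2 gets the last two.
packings = [(p[:2], p[2:]) for p in permutations(luts)]
print(len(packings))  # 24

# A frozenset ignores which CLB is "1" and which is "2", so a packing and
# its mirror image collapse into a single entry.
distinct = {frozenset(packing) for packing in packings}
print(len(distinct))  # 12
```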

Place-and-route

Following packing, we move to place-and-route. With regard to the previous point, let’s assume that our two CLBs need to be connected together, but that (purely for the purposes of this portion of our discussions) they can only be placed horizontally or vertically adjacent to each other, in which case there are four possibilities (Figure 8-10).

In the case of placement (i) for example, if CLB 1 contained LUTs A-C and CLB 2 contained LUTs B-D, then this would be identical to swapping the positions of the two CLBs and exchanging their contents.

If we only had the two CLBs shown in Figure 8-10, it would be easy to determine their optimal placement with respect to each other (which would have to be one of the four options shown above) and the absolute placement of this two-CLB group with respect to the entire chip.

Prior to the advent of FPGAs, the equivalent functionality to place-and-route in “CPLD land” was performed by an application known as a “fitter.” When FPGAs first arrived on the scene, people used the same “fitter” appellation, but over time they migrated to using the term “place-and-route” because this more accurately reflected what was actually occurring.

As opposed to using a symbol library of primitive logic gates and registers, an interesting alternative circa the early 1990s was to use a symbol library corresponding to slightly more complex logical functions (say around 70 functions). The output from the schematic was a netlist of functional blocks that were already de facto mapped onto LUTs and packed into CLBs. This had the advantage of giving a better idea of the number of levels of logic between register stages, but it limited such activities as optimization and swapping.

The placement problem is much more complex in the real world because a real design can contain extremely large numbers of CLBs (hundreds or thousands in the early days, and hundreds of thousands by 2004). In addition to CLBs 1 and 2 being connected together, they will almost certainly need to be connected to other CLBs. For example, CLB 1 may also need to be connected to CLBs 3, 5, and 8, while CLB 2 may need to be connected to CLBs 4, 6, 7, and 8. And each of these new CLBs may need to be connected to each other or to yet more CLBs. Thus, although placing CLBs 1 and 2 next to each other would be best for them, it might be detrimental to their relationships with the other CLBs, and the best solution overall might actually be to separate CLBs 1 and 2 by some amount.
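To get a feel for the trade-off, here is a toy Python sketch that places the eight CLBs from the example above by exhaustive search. Everything beyond the connection list is an assumption made for illustration: the 2 × 4 array of sites, the use of total Manhattan distance as the cost, and the names `wirelength` and `best_placement`. Real place-and-route tools do not work this way.

```python
from itertools import permutations

# Connections taken from the example in the text: CLB 1-2, 1-{3,5,8}, 2-{4,6,7,8}.
NETS = [(1, 2), (1, 3), (1, 5), (1, 8), (2, 4), (2, 6), (2, 7), (2, 8)]

# Assumption: a tiny 2 x 4 array of CLB sites, addressed as (row, column).
SITES = [(r, c) for r in range(2) for c in range(4)]


def wirelength(placement):
    """Total Manhattan distance over all connections (an assumed cost model)."""
    total = 0
    for u, v in NETS:
        (r1, c1), (r2, c2) = placement[u], placement[v]
        total += abs(r1 - r2) + abs(c1 - c2)
    return total


def best_placement():
    """Exhaustively try all 8! assignments of CLBs 1..8 to the eight sites."""
    best, best_cost = None, None
    for order in permutations(SITES):
        placement = dict(zip(range(1, 9), order))
        cost = wirelength(placement)
        if best_cost is None or cost < best_cost:
            best, best_cost = placement, cost
    return best, best_cost


placement, cost = best_placement()
print(cost, placement)
```

Even this toy has 8! = 40,320 candidate placements, and the count grows factorially with the number of CLBs, so exhaustive search stops being practical almost immediately.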

Although placement is difficult, deciding on the optimal way to route the signals between the various CLBs poses an even more Byzantine problem. The complexity of these tasks is mind-boggling, so we’ll leave it to those guys and gals who write the place-and-route algorithms (they are the ones sporting size-16 extra-wide brains with go-faster stripes) and quickly move on to other things.

Timing analysis and post-place-and-route simulation

Following place-and-route, we have a fully routed physical (CLB-level) netlist, as was illustrated in Figure 8-7. At this point, a static timing analysis (STA) utility will be run to calculate all of the input-to-output and internal path delays and also to check for any timing violations (setup, hold, etc.) associated with any of the internal registers.
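As a rough illustration of the kind of calculation involved, the following Python sketch finds the latest arrival time along a small register-to-register path and compares it against the clock period to obtain a setup slack. The network, every delay value, and the clock period are invented for the example, and only the setup check is modeled; a real STA utility works from the delays of the routed design and also checks hold times and other constraints.

```python
from functools import lru_cache

# Hypothetical combinational network between two registers:
# node -> (delay in ns, list of fan-in nodes).
# "src" stands for the output of the launching register.
NETWORK = {
    "src": (0.0, []),
    "lut1": (1.2, ["src"]),
    "lut2": (1.0, ["src"]),
    "lut3": (1.5, ["lut1", "lut2"]),
    "dst": (0.3, ["lut3"]),  # routing into the capturing register's D input
}

CLOCK_PERIOD_NS = 5.0  # assumed
CLK_TO_Q_NS = 0.5      # assumed
SETUP_NS = 0.4         # assumed


@lru_cache(maxsize=None)
def arrival(node):
    """Latest time (ns after the clock edge) at which the node's output settles."""
    delay, fanin = NETWORK[node]
    if not fanin:
        return CLK_TO_Q_NS + delay
    # The slowest input determines when this node's output is valid.
    return delay + max(arrival(src) for src in fanin)


# Positive slack means the data arrives early enough before the next clock edge.
setup_slack = CLOCK_PERIOD_NS - SETUP_NS - arrival("dst")
print(f"arrival = {arrival('dst'):.1f} ns, setup slack = {setup_slack:.1f} ns")
```

The analysis is "static" in the sense that no input stimulus is applied; the worst-case path is found purely from the structure of the netlist and the delay of each element.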

One interesting point occurs if the design engineers wish to resimulate their design with accurate (post-place-and-route) timing information. In this case, they have to use the FPGA tool suite to generate a new gate-level netlist along with associated timing information in the form of an industry-standard file format called, perhaps not surprisingly, standard delay format (SDF). The main reason for generating this new gate-level netlist is that, once the original netlist has been coerced into its CLB-level equivalent, it simply isn’t possible to relate the timings associated with this new representation back into the original gate-level incarnation.

STA is pronounced by spelling it out as “S-T-A” (see also Chapter 19).

SDF is pronounced by spelling it out as “S-D-F” (see also Chapter 10).

Flat versus hierarchical schematics

Clunky flat schematics

The very first schematic packages essentially allowed a design to be captured as a humongous, flat circuit diagram split into a number of “pages.” In order to visualize this, let’s assume that you wish to draw a circuit diagram comprising 1,000 logic gates on a piece of paper. If you created a single large diagram, you would end up with a huge sheet of paper (say eight-feet square) with primary inputs to the circuit on the left, primary outputs from the circuit on the right, and the body of the circuit in the middle.

Carrying this circuit diagram around and showing it to your friends would obviously be a pain. Instead, you might want to cut it up into a number of pages and store them all together in a folder. In this case, you would make sure that your partitioning was logical such that each page contained all of the gates relating to a particular function in the design. Also, you would use interpage connectors (sort of like pseudo inputs and outputs) to link signals between the various pages.

This is the way the original schematic capture packages worked. You created a single flat schematic as a series of pages linked together by interpage connector symbols, where the names you gave these symbols told the system which ones were to be connected together. For example, consider a simple circuit sketched on a piece of paper (Figure 8-11).

1876: America. 10th March. Intelligible human speech heard over Alexander Graham Bell’s telephone for the first time.

Assume that the gates on the left represent some control logic, while the four registers on the right are implementing a 4-bit shift register. Obviously, this is a trivial example, and a real circuit would have many more logic gates. We’re just trying to tie down some underlying concepts here, such as the fact that when you entered this circuit into your schematic capture system, you might split it into two pages (Figure 8-12).
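The way interpage connectors tie a multipage flat schematic together can be sketched in a few lines of Python. The data model here is hypothetical (the page names, gate names, and the `SHIFT_CLK` connector are invented, and only two of the four registers are shown); the point is simply that two nets on different pages become the same signal when their connector symbols carry the same name.

```python
# Hypothetical data model: each page lists its gates as
# (name, type, input nets, output net) and maps some of its local nets
# to interpage connector names.
pages = {
    "page1_control": {
        "gates": [("G1", "AND", ["enable", "clock_in"], "n1")],
        "connectors": {"n1": "SHIFT_CLK"},
    },
    "page2_shift": {
        "gates": [
            ("R0", "DFF", ["data_in", "clk"], "q0"),
            ("R1", "DFF", ["q0", "clk"], "q1"),
        ],
        "connectors": {"clk": "SHIFT_CLK"},
    },
}


def stitch(pages):
    """Merge all pages into one flat gate list with design-wide net names."""
    flat = []
    for page_name, page in pages.items():

        def global_net(local):
            # Connector nets take the shared connector name;
            # every other net stays local to its page.
            if local in page["connectors"]:
                return page["connectors"][local]
            return f"{page_name}.{local}"

        for name, kind, inputs, output in page["gates"]:
            flat.append((name, kind, [global_net(n) for n in inputs], global_net(output)))
    return flat


flat = stitch(pages)
for gate in flat:
    print(gate)
```

After stitching, the output of G1 on page 1 and the clock inputs of R0 and R1 on page 2 all refer to the single net `SHIFT_CLK`, just as if the circuit had been drawn on one enormous sheet.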

Sleek hierarchical (block-based) schematics

There were a number of problems associated with the flat schematics discussed above, especially when dealing with real-world circuits requiring 50 or more pages:

■ It was difficult to visualize a high-level, top-down view of the design.

■ It was difficult to save and reuse portions of the design in future projects.

■ In the case of designs in which some portion of the circuit was repeated multiple times (which is very common), that portion would have to be redrawn or copied onto multiple pages. This became really painful if you subsequently realized that you had to make a change because you would have to make the same change to all of the copies.

Figure 8-11. Simple schematic drawn on a piece of paper.

Figure 8-12. Simple two-page flat schematic.
[Figure elements: Schematic capture system; Page 1 (Control logic); Page 2 (Shift register).]

1876: America. Alexander Graham Bell patents the telephone.

The answer was to enhance schematic capture packages to support the concept of hierarchy. In the case of our shift register circuit, for example, you might start with a top-level page in which you would create two blocks called control and shift, each with the requisite number of input and output pins. You would then connect these blocks to each other and also to some primary inputs and outputs.

Next, you would instruct the system to “push down” into the control block, which would open up a new schematic page. If you were lucky, the system would automatically prepopulate this page with input and output connector symbols (and with associated names) corresponding to the pins on its parent block. You would then create the schematic corresponding to that block as usual (Figure 8-13).

In fact, each block could contain a further block-level schematic, or a gate-level schematic, or (very commonly) a mixture of both. These hierarchical block-based schematics answered the problems associated with flat schematics:

Figure 8-13. Simple hierarchical schematic.
[Figure panels: Top-level page containing the “control” and “shift” blocks; Contents of “control” block; Contents of “shift” block.]
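One way to picture a hierarchical schematic is as a set of block definitions, each drawn once and instantiated wherever it is needed. The Python sketch below expands such a hierarchy into a flat list of leaf cells. The block contents are hypothetical: the text names only the control and shift blocks, so the `bit` sub-block and the gates inside `control` are invented for illustration.

```python
# Hypothetical block definitions: each block lists the things it instantiates
# as (instance_name, type). Types without an entry here ("AND", "DFF") are
# treated as leaf cells from the symbol library.
BLOCKS = {
    "top": [("control", "control"), ("shift", "shift")],
    "control": [("g1", "AND"), ("g2", "AND")],
    "shift": [("bit0", "bit"), ("bit1", "bit"), ("bit2", "bit"), ("bit3", "bit")],
    "bit": [("reg", "DFF")],  # drawn once, instantiated four times
}


def flatten(block_type, path=""):
    """Expand a block into a flat list of (hierarchical_name, leaf_cell_type)."""
    leaves = []
    for instance, kind in BLOCKS[block_type]:
        name = f"{path}/{instance}" if path else instance
        if kind in BLOCKS:
            leaves.extend(flatten(kind, name))  # "push down" into the block
        else:
            leaves.append((name, kind))
    return leaves


leaves = flatten("top")
for name, kind in leaves:
    print(name, kind)
```

Because `bit` is defined only once, a change to that definition automatically applies to all four instances, which is precisely the pain point that flat schematics could not address.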

In document The Design Warriors Guide to FPGAs pdf (Page 156-169)