(1)

Introduction to Xilinx System Generator, Part II

Evan Everett and Michael Wu
ELEC 433 - Spring 2013

(2)

Outline

✓ Introduction to FPGAs and Xilinx System Generator
• System Generator basics
  • Fixed point data representation
  • Sample times

(3)

Fixed Point Binary Numbers

• MATLAB generally uses high precision values
  • 64-bit floating point: huge dynamic range
  • Impractical in hardware
• System Generator uses fixed point numbers instead
  • Limited, but flexible, range & precision
  • Pro: smaller hardware

(4)

Fixed-Point Representation

UFix8_5: unsigned, 8 total bits, 5 fractional bits (8-5 = 3 integer bits)

1 0 1 . 1 0 0 1 0 = 5.5625

(5)

Fixed-Point Representation

Fix8_4: signed, 8 total bits, 4 fractional bits (8-4 = 4 integer bits)

1 0 1 1 . 0 0 1 0 = -4.875
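The two interpretations of the same bit pattern can be checked with a short Python sketch. The helper name `fixed_to_float` is made up for this illustration and is not part of System Generator; it assumes two's complement for signed values.

```python
def fixed_to_float(bits: str, frac_bits: int, signed: bool) -> float:
    """Interpret a bit string as a fixed-point number (hypothetical helper)."""
    bits = bits.replace(" ", "")
    raw = int(bits, 2)                      # value as an unsigned integer
    if signed and bits[0] == "1":
        raw -= 1 << len(bits)               # two's complement: MSB has negative weight
    return raw / (1 << frac_bits)           # shift the binary point left by frac_bits


pattern = "1 0 1 1 0 0 1 0"
print(fixed_to_float(pattern, frac_bits=5, signed=False))  # UFix8_5 -> 5.5625
print(fixed_to_float(pattern, frac_bits=4, signed=True))   # Fix8_4  -> -4.875
```

The stored bits are identical in both cases; only the declared type changes the value they stand for.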

(6)

Range vs. Precision

Bits    UFix4_0 (unsigned)    Fix4_0 (signed)
0000     0                     0
0001     1                     1
0010     2                     2
0011     3                     3
0100     4                     4
0101     5                     5
0110     6                     6
0111     7                     7
1000     8                    -8
1001     9                    -7
1010    10                    -6
1011    11                    -5
1100    12                    -4
1101    13                    -3
1110    14                    -2
1111    15                    -1

(7)

Range vs. Precision

Bits    UFix4_1 (unsigned)    Fix4_1 (signed)
0000    0                      0
0001    0.5                    0.5
0010    1.0                    1.0
0011    1.5                    1.5
0100    2.0                    2.0
0101    2.5                    2.5
0110    3.0                    3.0
0111    3.5                    3.5
1000    4.0                   -4.0
1001    4.5                   -3.5
1010    5.0                   -3.0
1011    5.5                   -2.5
1100    6.0                   -2.0
1101    6.5                   -1.5
1110    7.0                   -1.0
1111    7.5                   -0.5

(8)

Range vs. Precision

Bits    UFix4_2 (unsigned)    Fix4_2 (signed)
0000    0                      0
0001    0.25                   0.25
0010    0.5                    0.5
0011    0.75                   0.75
0100    1.0                    1.0
0101    1.25                   1.25
0110    1.5                    1.5
0111    1.75                   1.75
1000    2.0                   -2.0
1001    2.25                  -1.75
1010    2.5                   -1.5
1011    2.75                  -1.25
1100    3.0                   -1.0
1101    3.25                  -0.75
1110    3.5                   -0.5
1111    3.75                  -0.25
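The trade-off in these 4-bit examples can be summarized numerically: each bit moved from the integer part to the fractional part halves both the step size and the range. The sketch below is one way to compute this; `fixed_range` is a hypothetical helper name, and two's complement is assumed for signed types.

```python
def fixed_range(total_bits: int, frac_bits: int, signed: bool):
    """Return (min, max, step) for a fixed-point type (hypothetical helper)."""
    step = 2.0 ** -frac_bits                      # value of one LSB
    if signed:
        lo = -(1 << (total_bits - 1)) * step      # most negative two's complement code
        hi = ((1 << (total_bits - 1)) - 1) * step
    else:
        lo = 0.0
        hi = ((1 << total_bits) - 1) * step
    return lo, hi, step


for frac_bits in (0, 1, 2):
    print(f"UFix4_{frac_bits}:", fixed_range(4, frac_bits, signed=False))
    print(f" Fix4_{frac_bits}:", fixed_range(4, frac_bits, signed=True))
```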

(9)

Fixed Point Arithmetic

• Addition & multiplication are provided
  • Adders use general logic
  • Multipliers use dedicated blocks
• Division expensive to implement and rarely used
  • Multi-cycle operation

(10)

Fixed Point Arithmetic

+ : N bit, M bit -> max(N,M)+1 bit (integer growth)
× : N bit, M bit -> N+M bit (integer and fractional growth)

+ : N bit, N bit -> N bit (overflow risk)
× : N bit, N bit -> N bit (integer overflow; fractional quantization)

• More bits needed with each operation for full precision

• May not always want to expand bitwidth, but must
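The full-precision growth rules can be written down as a small Python sketch. The helpers `add_format` and `mult_format` are hypothetical names; types are given as (total bits, fractional bits), and both operands are assumed to share the same signedness. When the two operands have the same binary point, the adder rule reduces to max(N,M)+1.

```python
def add_format(a, b):
    """Full-precision output type of a + b; a and b are (total_bits, frac_bits)."""
    frac = max(a[1], b[1])                               # keep the finer precision
    integer = max(a[0] - a[1], b[0] - b[1]) + 1          # one extra bit for the carry
    return (integer + frac, frac)


def mult_format(a, b):
    """Full-precision output type of a * b; a and b are (total_bits, frac_bits)."""
    return (a[0] + b[0], a[1] + b[1])                    # both widths add up


print(add_format((4, 2), (4, 2)))    # two 4.2 inputs
print(mult_format((8, 5), (8, 4)))   # an 8.5 input times an 8.4 input
```

Chaining several of these operations shows how quickly the width grows if every stage keeps full precision.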

(11)

Fixed Point Quantization

• Occurs when available fractional bits are insufficient
  • Truncate (default): just drop bits past LSB; more efficient
  • Round: choose nearest representable value

Full Precision (UFix_11_8): 5.58203125
Truncated (UFix_8_5): 5.5625 (Δ=0.01953125), bits 1 0 1 1 0 0 1 0
Rounded (UFix_8_5): 5.59375 (Δ=0.01171875), bits 1 0 1 1 0 0 1 1
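The truncate and round results for 5.58203125 can be reproduced with a few lines of Python. This is a sketch only: `quantize` is a hypothetical helper, and its "round" mode rounds ties upward, which may differ from the exact tie-breaking rule a given hardware block uses.

```python
import math


def quantize(value: float, frac_bits: int, mode: str = "truncate") -> float:
    """Reduce value to frac_bits fractional bits (hypothetical helper)."""
    scaled = value * (1 << frac_bits)
    if mode == "truncate":
        raw = math.floor(scaled)            # drop every bit below the new LSB
    elif mode == "round":
        raw = math.floor(scaled + 0.5)      # nearest value; ties go up (assumption)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return raw / (1 << frac_bits)


full = 5.58203125                            # exactly representable with 8 fractional bits
for mode in ("truncate", "round"):
    q = quantize(full, 5, mode)
    print(mode, q, "error:", abs(full - q))
```

Truncation always moves toward negative infinity, so its error is biased; rounding costs extra logic but halves the worst-case error.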

(12)

Fixed Point Overflow

• Occurs when available integer bits are insufficient
• Required bits increase with every operation
  • This can add up very fast
  • Think of a long FIR filter
• Most blocks have an “Error on Overflow” option
  • Great for debugging in simulation
  • Sim stops with error when overflow occurs

• Overflow in hardware is very hard to isolate: simulate to

(13)

Fixed Point Overflow

Overflow Options: Full Precision - No overflow

• Notice the bit growth

  2.50  (UFix_4_2)   10.10
+ 3.50  (UFix_4_2)   11.10
= 6.00  (UFix_5_2)  110.00

(14)

Fixed Point Overflow

Overflow Options: Wrap

  2.50  (UFix_4_2)   10.10
+ 3.50  (UFix_4_2)   11.10
= 6.00, stored in UFix_4_2 as 2.00 (10.00)

• Happens by default in hardware if you don’t give enough bits
• Not always bad; sometimes this is intentional

(15)

Fixed Point Overflow

Overflow Options: Saturate

  2.50  (UFix_4_2)   10.10
+ 3.50  (UFix_4_2)   11.10
= 6.00, stored in UFix_4_2 as 3.75 (11.11)

• Stops at max/min to prevent overflow
• Sign of answer will be correct

• More expensive in hardware (requires comparator & mux for
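The three overflow behaviors for 2.50 + 3.50 can be compared side by side in Python. This is an illustrative sketch for unsigned types only; `add_ufix` and its mode names ("wrap", "saturate", "error") are hypothetical and merely mimic the options described on the slides. Inputs are assumed to be exactly representable in the given format.

```python
def add_ufix(a: float, b: float, total_bits: int, frac_bits: int,
             overflow: str = "wrap") -> float:
    """Add two unsigned fixed-point values into a fixed output width (sketch)."""
    if overflow not in ("wrap", "saturate", "error"):
        raise ValueError(f"unknown overflow mode: {overflow}")
    scale = 1 << frac_bits
    raw = int(a * scale) + int(b * scale)     # integer sum of the stored codes
    max_raw = (1 << total_bits) - 1           # largest code that fits
    if raw > max_raw:
        if overflow == "wrap":
            raw &= max_raw                    # discard the bits that do not fit
        elif overflow == "saturate":
            raw = max_raw                     # clamp to the largest value
        else:
            raise OverflowError("sum does not fit in the output type")
    return raw / scale


print(add_ufix(2.5, 3.5, total_bits=5, frac_bits=2))               # full precision
print(add_ufix(2.5, 3.5, total_bits=4, frac_bits=2))               # wrap
print(add_ufix(2.5, 3.5, total_bits=4, frac_bits=2, overflow="saturate"))
```

The "error" mode plays the role of a simulation-time check: it stops at the first overflow instead of silently producing a wrong value.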

(16)

System Generator Clocking

• Both simulation and hardware are discrete time
• Model has a master “system sample period”
  • Related to FPGA clock in System Generator token
  • An x “sec” system period = 1 FPGA clock period

[Figure: System Generator token]

(17)

Multiple Clock Domains

• All clock domains are multiples of master “System Period”
  • Every other clock period is derived from master FPGA clock period
  • System sample period must be the smallest period in the model

[Figure: System Generator token]

(18)

System Generator Clocking

• Sample periods propagate with signals

• Some blocks can override the propagation

• Feedback loops often require explicit sample periods
• Most blocks are single rate (e.g. logic & arithmetic)

• Many blocks are multi-rate: upsample & downsample,

(19)

Multiple Clock Domains Example

• Upsample, filter, and downsample a 25 MHz (40 ns) signal

[Figure: 25 MHz input upsampled to 50 MHz; sample periods 1 and 1/2]

System Generator token: FPGA Clock period = 40 ns, System period = 1

Error: sample rates not multiple of sample period

(20)

Multiple Clock Domains Example

• Upsample, filter, and downsample a 25 MHz (40 ns) signal

[Figure: 25 MHz input upsampled to 50 MHz]

System Generator token: FPGA Clock period = 20 ns, System period = 1

Sample period = 2
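The difference between the failing and working configurations can be checked with a little arithmetic. The Python sketch below is an assumption-laden illustration, not a System Generator API: `normalized_periods` and `periods_valid` are hypothetical helpers that express each signal's period in model units, where one system period equals one FPGA clock period, and then test whether every period is a whole multiple of the system period.

```python
from fractions import Fraction


def normalized_periods(fpga_clock_ns, system_period, signal_periods_ns):
    """Convert signal periods from ns to model units (hypothetical helper)."""
    return [Fraction(p, fpga_clock_ns) * system_period for p in signal_periods_ns]


def periods_valid(fpga_clock_ns, system_period, signal_periods_ns):
    """True if every signal period is an integer multiple of the system period."""
    periods = normalized_periods(fpga_clock_ns, system_period, signal_periods_ns)
    return all(p % system_period == 0 for p in periods)


signals_ns = [40, 20]   # 25 MHz input and the 50 MHz upsampled signal

print(normalized_periods(40, 1, signals_ns), periods_valid(40, 1, signals_ns))
print(normalized_periods(20, 1, signals_ns), periods_valid(20, 1, signals_ns))
```

With a 40 ns FPGA clock the 50 MHz signal would need a period of 1/2, which is smaller than the system period; with a 20 ns clock the periods become 2 and 1, and the model is consistent.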

(21)

Upsample Blocks

Downsample Blocks

• Use sample time colors!

(22)

Resource Estimation

• Any model of any size can be simulated

• Device resource limitations affect HW implementation
• Sysgen provides Resource Estimator block
  • Adds up resource requirements before synthesis
  • Good estimate - but not always right!
  • Only post-place & route report is guaranteed

Slice count usually matters most

(23)

System Generator Tips

• Show port data types and sample times

• Use variables instead of constants, initialize in a script

• Avoid explicit sample periods (except for feedback loops)
• Use keyboard shortcuts
  • ctrl-click to wire blocks
  • ctrl-drag to duplicate selected blocks
  • ctrl-D to update/error check model
• Use subsystems and give them meaningful names
• Too much precision is okay at first; use “Error on Overflow” to optimize later

(24)

System Generator Example

Gateway In:

Gateway Out: UFix10_0

Accumulator: 10 bit output, Add, Wrap

ROMs:
  Output: Fix16_15
  Depth: 1024
  Initial Values: cos(2π[0:1023]/1024), sin(2π[0:1023]/1024)

(25)

System Generator Example

ROMs:
  Depth: 1024
  Initial Values: cos(2π[0:1023]/1024), sin(2π[0:1023]/1024)
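A software model can help sanity-check this example before building it. The Python sketch below makes several assumptions the slides do not state: that the 10-bit wrapping accumulator's output addresses both ROMs, that a constant step is added each sample, and that ROM values outside the Fix16_15 range are saturated. The last assumption matters because cos(0) = 1.0 is not representable in Fix16_15, whose maximum is 1 - 2^-15. All names are hypothetical.

```python
import math

DEPTH = 1024          # ROM depth from the slide
FRAC_BITS = 15        # Fix16_15: 16 bits total, 15 fractional


def to_fix16_15(x: float) -> int:
    """Quantize to a Fix16_15 integer code, saturating at the type limits."""
    raw = round(x * (1 << FRAC_BITS))
    return max(-(1 << 15), min((1 << 15) - 1, raw))


# ROM contents follow cos(2*pi*[0:1023]/1024) and sin(2*pi*[0:1023]/1024)
cos_rom = [to_fix16_15(math.cos(2 * math.pi * n / DEPTH)) for n in range(DEPTH)]
sin_rom = [to_fix16_15(math.sin(2 * math.pi * n / DEPTH)) for n in range(DEPTH)]


def run_nco(step: int, num_samples: int):
    """Step a 10-bit wrapping accumulator and look up both ROMs (sketch)."""
    phase = 0
    samples = []
    for _ in range(num_samples):
        samples.append((phase, cos_rom[phase], sin_rom[phase]))
        phase = (phase + step) & (DEPTH - 1)   # wrap on overflow, as in the Accumulator
    return samples


for phase, c, s in run_nco(step=256, num_samples=5):
    print(phase, c / 2**FRAC_BITS, s / 2**FRAC_BITS)
```

Here the accumulator's wrap-around is intentional: it is what makes the phase, and therefore the sinusoid, periodic.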