ECE 545
Digital System Design with VHDL
Course web page:
ECE web page → Courses → Course web pages → ECE 545
Kris Gaj
Office hours: Thursday, 7:30-8:30 PM,
Tuesday, 7:30-8:30 PM,
and by appointment
Research and teaching interests:
• reconfigurable computing
• computer arithmetic
• cryptography
• network security
Contact:
The Engineering Building, room 3225
kgaj@gmu.edu
ECE 545
Part of:
MS in Computer Engineering
– Fundamental course for the specialization area: Digital Systems Design
– Elective course in the remaining specialization areas
MS in Electrical Engineering
– Elective
One of five core courses
ECE 545
Part of:
PhD in Electrical and Computer Engineering
Knowledge tested at the
Technical Qualifying Exam (TQE)
I am interested in…
VLSI; Digital Systems Design; ASICs & FPGAs; VHDL/Verilog; CAD Tools; Reconfigurable Computing; Microelectronics; VLSI Fabrication; Nanoelectronics
I want to specialize primarily in…
CAD tools & Design Automation; Hardware Description Languages; FPGAs & Reconfigurable computing; Computer Arithmetic; Front-end ASIC Design (algorithmic down to gate level); Back-end ASIC Design (circuit and mask layout levels); Analog & Digital Circuit Design; VLSI Fabrication; Microelectronics; Nanoelectronics; Semiconductor Devices
Recommended program & specialization
• MS CpE: Digital Systems Design
• MS EE: Microelectronics/Nanoelectronics
Design level
algorithmic, register-transfer, gate, transistor, layout, devices
Courses
• ECE 545 Digital System Design with VHDL
• ECE 645 Computer Arithmetic
• ECE 586 Digital Integrated Circuits
• ECE 680 Physical VLSI Design
• ECE 681 VLSI Design for ASICs
• ECE 682 VLSI Test Concepts
• ECE 584 Semiconductor Device Fundamentals
• ECE 684 MOS Device Electronics
CpE
Digital Systems Design
Pre-Approved Electives
Suggested Electives
ECE 545 Digital System Design with VHDL
ECE 586 Digital Integrated Circuits
ECE 645 Computer Arithmetic
ECE 681 VLSI Design for ASICs
ECE 682 VLSI Test Concepts
ECE 584, 684, … (technology)
ECE 511, 611, … (microprocessors)
ECE 646, 746, … (applications)
K. Gaj, K. Hintz, H. Homayoun, J. Kaps, T. Storey
CpE
Microprocessors and Embedded Systems
ECE 510 Real-Time Concepts
ECE 511 Microprocessors
ECE 611 Advanced Microprocessors
ECE 612 Real-Time Embedded Systems
ECE 641 Computer System Architecture
CS 540, 583 (languages, algorithms)
CS 635 (parallel machines)
ECE 542, 642, 742 (networks)
ECE 645, 681 (digital design)
ECE 548 (sequential mach. theory)
H. Homayoun, J. Kaps, P. Pachowicz, C. Sabzevari
DIGITAL SYSTEMS DESIGN
Concentration advisors: Kris Gaj, Jens-Peter Kaps, Ken Hintz
1. ECE 545 Digital System Design with VHDL
– K. Gaj, project, FPGA design with VHDL, Aldec/Mentor Graphics, Xilinx/Altera
2. ECE 645 Computer Arithmetic
– K. Gaj, project, FPGA design with VHDL, Aldec/Mentor Graphics, Xilinx/Altera
3. ECE 681 VLSI Design for ASICs
– H. Homayoun, project/lab, front-end and back-end ASIC design with Synopsys tools
4. ECE 586 Digital Integrated Circuits – D. Ioannou, R. Mulpuri
5. ECE 682 VLSI Test Concepts – T. Storey
Grading Scheme
• Homework - 10%
• Project - 40%
• Midterm Exam - 20%
Midterm exam 1
• 2 hours 30 minutes
• in class
• design-oriented
• open-book, open-notes
• practice exams available on the web
Tentative date: last week of October
Final exam
• 2 hours 45 minutes
• in class
• design-oriented
• open-book, open-notes
• practice exams available on the web
Date: Thursday, December 13, 4:30-7:15 PM
Required Textbook
Pong P. Chu, RTL Hardware Design Using VHDL,
Wiley-Interscience, 2006.
Supplementary Textbook – Basics Refresher
Stephen Brown and Zvonko Vranesic,
Fundamentals of Digital Logic with VHDL Design,
McGraw-Hill, 3rd or 2nd Edition
Supplementary Textbook – Advanced
Hubert Kaeslin, Digital Integrated Circuit Design:
From VLSI Architectures to CMOS Fabrication,
Cambridge University Press; 1st Edition, 2008.
Used in ECE 681
Technology
&
What is an FPGA?
[Figure: FPGA structure, showing Configurable Logic Blocks, I/O Blocks, and Block RAMs]
FPGA Design process (1)
Design and implement a simple unit that speeds up encryption with an RC5-like cipher with a fixed key set on an 8031 microcontroller. Unlike in experiment 5, this time your unit has to be able to perform an encryption algorithm by itself, executing 32 rounds…
library IEEE;
use ieee.std_logic_1164.all;
use ieee.std_logic_unsigned.all;

entity RC5_core is
    port(
        clock, reset, encr_decr: in  std_logic;
        data_input:  in  std_logic_vector(31 downto 0);
        data_output: out std_logic_vector(31 downto 0);
        out_full:    in  std_logic;
        key_input:   in  std_logic_vector(31 downto 0);
        key_read:    out std_logic  -- no semicolon after the last port
    );
end RC5_core;
1. Specification / Pseudocode
2. On-paper hardware design (Block diagram & ASM chart)
3. VHDL description (Your Source Files)
4. Functional simulation
5. Synthesis
6. Post-synthesis simulation
FPGA Design process (2)
7. Implementation
8. Timing simulation
9. Configuration
architecture MLU_DATAFLOW of MLU is
    signal A1, B1, Y1: STD_LOGIC;
    signal MUX_0, MUX_1, MUX_2, MUX_3: STD_LOGIC;
    -- selector as a named signal: tools older than VHDL-2008 reject
    -- a concatenation used directly in "with ... select"
    signal L: STD_LOGIC_VECTOR(1 downto 0);
begin
    A1 <= A when (NEG_A = '0') else not A;
    B1 <= B when (NEG_B = '0') else not B;
    Y  <= Y1 when (NEG_Y = '0') else not Y1;
    MUX_0 <= A1 and B1;
    MUX_1 <= A1 or B1;
    MUX_2 <= A1 xor B1;
    MUX_3 <= A1 xnor B1;
    L <= L1 & L0;
    with L select
        Y1 <= MUX_0 when "00",
              MUX_1 when "01",
              MUX_2 when "10",
              MUX_3 when others;
end MLU_DATAFLOW;
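The architecture above refers to an entity MLU that the slides do not show. The following is a minimal sketch of a matching entity declaration and a self-checking testbench for the functional simulation step. It assumes that every port is a single-bit STD_LOGIC; the port list is inferred only from the names used in the architecture body, and the testbench name MLU_TB and the 10 ns settling time are hypothetical choices. Compile the entity first, then the architecture above, then the testbench.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Assumed interface: names taken from the architecture body,
-- the actual entity is not shown in the slides.
entity MLU is
    port(
        NEG_A, NEG_B, NEG_Y: in  STD_LOGIC;
        L1, L0:              in  STD_LOGIC;
        A, B:                in  STD_LOGIC;
        Y:                   out STD_LOGIC
    );
end MLU;

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;

-- Hypothetical testbench: applies all 128 input patterns and
-- compares Y against a behavioral reference model.
entity MLU_TB is
end MLU_TB;

architecture SIM of MLU_TB is
    signal NEG_A, NEG_B, NEG_Y, L1, L0, A, B: STD_LOGIC := '0';
    signal Y: STD_LOGIC;
begin
    UUT: entity work.MLU(MLU_DATAFLOW)
        port map(NEG_A => NEG_A, NEG_B => NEG_B, NEG_Y => NEG_Y,
                 L1 => L1, L0 => L0, A => A, B => B, Y => Y);

    STIMULUS: process
        variable V: STD_LOGIC_VECTOR(6 downto 0);
        variable A1, B1, Y1: STD_LOGIC;
    begin
        for I in 0 to 127 loop
            V := STD_LOGIC_VECTOR(to_unsigned(I, 7));
            NEG_A <= V(6); NEG_B <= V(5); NEG_Y <= V(4);
            L1 <= V(3); L0 <= V(2); A <= V(1); B <= V(0);
            wait for 10 ns;  -- let the combinational logic settle

            -- Reference model: xor with a '1' flag inverts the operand
            A1 := V(1) xor V(6);
            B1 := V(0) xor V(5);
            if    V(3 downto 2) = "00" then Y1 := A1 and B1;
            elsif V(3 downto 2) = "01" then Y1 := A1 or B1;
            elsif V(3 downto 2) = "10" then Y1 := A1 xor B1;
            else                            Y1 := A1 xnor B1;
            end if;

            assert Y = (Y1 xor V(4))
                report "MLU mismatch for input pattern " & integer'image(I)
                severity error;
        end loop;
        report "MLU testbench finished";
        wait;  -- stop the process
    end process;
end SIM;
```

Because the unit has only seven single-bit inputs, exhaustive stimulus is cheap here; for wider datapaths a testbench would typically fall back to selected or random test vectors.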
VHDL description
Circuit netlist
FPGA Implementation
• After synthesis the entire implementation
Xilinx FPGA Tools (ECE Labs)
Aldec Active-HDL Design Flow
• simulation: Aldec Active-HDL (IDE)
• synthesis: Xilinx XST & Synopsys Synplify Premier
• implementation: Xilinx ISE Design Suite
Xilinx ISE Design Flow
• simulation: Mentor Graphics ModelSim SE
• synthesis: Xilinx XST & Synopsys Synplify Premier
• implementation: Xilinx ISE Design Suite (IDE)
Xilinx FPGA Tools (Home)
Aldec Active-HDL Design Flow
• simulation: Aldec Active-HDL Student Edition (IDE)
• synthesis: Xilinx XST (restricted)
• implementation: Xilinx ISE WebPACK (restricted)
Xilinx ISE Design Flow
• simulation: Mentor Graphics ModelSim PE Student Edition
• synthesis: Xilinx XST (restricted)
• implementation: Xilinx ISE WebPACK (IDE)
Altera FPGA Tools (ECE Labs)
Altera Design Flow
• simulation: Mentor Graphics ModelSim-Altera
• Altera Quartus II Subscription Edition
Altera FPGA Tools (Home)
Altera Design Flow
• simulation: Mentor Graphics ModelSim-Altera Starter (restricted)
• Altera Quartus II Web Edition (restricted)
Project
• semester-long
• related to the research project conducted by the Cryptographic Engineering Research Group (CERG) at GMU
• supporting NIST (National Institute of Standards and Technology) in the evaluation of candidates for a new cryptographic standard
Cryptography is Everywhere
• Buying a book on-line
• Withdrawing cash from ATM
• Teleconferencing over Intranets
• Backing up files on remote server
Cryptographic Standards Before 1997
[Figure: timeline, 1970 to 2010. Secret-Key Block Ciphers: DES (Data Encryption Standard), 1977 to 1999, IBM & NSA, followed by Triple DES. Hash Functions (NSA): SHA, SHA-1 (Secure Hash Algorithm), SHA-2; year markers 1993, 1995, 2003, 2005]
Why a Contest for
a Cryptographic Standard?
• Avoid back-door theories
• Speed up the acceptance of the standard
• Stimulate non-classified research on methods of designing a specific cryptographic transformation
• Focus the effort of a relatively small cryptographic
Cryptographic Standard Contests
[Figure: timeline, 1996 to 2013]
• AES (IX.1997 to X.2000): 15 block ciphers → 1 winner
• NESSIE (I.2000 to XII.2002)
• CRYPTREC
• eSTREAM (XI.2004 to V.2008): 34 stream ciphers → 4 HW winners + 4 SW winners
• SHA-3 (X.2007 to XII.2012): 51 hash functions → 1 winner
Cryptographic Contests - Evaluation Criteria
Security
Software Efficiency
Hardware Efficiency (FPGAs, ASICs)
Simplicity
Flexibility
Licensing
Specific Challenges of Evaluations
in Cryptographic Contests
• Very wide range of possible applications and, as a result, of performance and cost targets
  throughput: single Mbits/s to hundreds of Gbits/s
  cost: single cents to thousands of dollars
• Winner in use for the next 20-30 years, implemented using technologies not in existence today
• Large number of candidates
• Limited time for evaluation
Mitigating Circumstances
• Security is a primary criterion
• Performance of competing algorithms tends to vary significantly (sometimes as much as 500 times)
• Only relatively large differences in performance matter (typically at least 20%)
• Multiple groups independently implement the same algorithms (catching mistakes, comparing best results, etc.)
• Second best may be good enough
AES Contest 1997-2000
Rules of the Contest
Each team submits
• Detailed cipher specification
• Justification of design decisions
• Tentative results of cryptanalysis
• Source code in C
• Source code in Java
• Test vectors
AES: Candidate Algorithms
• USA: Mars, RC6, Twofish, Safer+, HPC
• Canada: CAST-256, Deal
• Costa Rica: Frog
• Australia: LOKI97
• Japan: E2
• Korea: Crypton
• Belgium: Rijndael
• France: DFC
• Germany: Magenta
• Israel, UK, Norway: Serpent
[Figure: candidates by country of origin; counts shown in the figure: 8, 4, 2, 1]
AES Contest Timeline
15 Candidates (June 1998): CAST-256, Crypton, Deal, DFC, E2, Frog, HPC, LOKI97, Magenta, Mars, RC6, Rijndael, Safer+, Serpent, Twofish
Round 1: Security, Software efficiency
5 final candidates (August 1999): Mars, RC6, Twofish (USA); Rijndael, Serpent (Europe)
Round 2: Security, Software efficiency, Hardware efficiency
1 winner (October 2000): Rijndael (Belgium)
NIST Report: Security & Simplicity
[Figure: candidates placed by Security (High to Adequate) and Simplicity (Simple to Complex); labels shown: MARS, Rijndael, Serpent, Twofish]
Efficiency in software: NIST-specified platform
[Figure: bar chart of throughput [Mbits/s] (axis 0 to 30) on a 200 MHz Pentium Pro with Borland C++, for 128-bit, 192-bit, and 256-bit keys; ciphers shown: Serpent, Rijndael, RC6, Twofish, Mars]
NIST Report: Software Efficiency
Encryption and Decryption Speed
Candidates in the order shown in the figure, which groups them as high, medium, and low:
• 32-bit processors: RC6, Rijndael, Mars, Twofish, Serpent
• 64-bit processors: Rijndael, Twofish, Mars, RC6, Serpent
• DSPs: Rijndael, Twofish, Mars, RC6, Serpent
Efficiency in FPGAs: Speed
[Figure: bar chart of throughput [Mbit/s] (axis 0 to 500) for Serpent x8, Rijndael, Twofish, Serpent x1, RC6, and Mars, with results from Worcester Polytechnic Institute, University of Southern California, and George Mason University. Bar values as extracted: 431, 444, 414, 353, 294, 177, 173, 104, 149, 62, 143, 112, 88, 102, 61]
Efficiency in ASICs: Speed
[Figure: bar chart (axis 0 to 700) for Rijndael, Serpent (x1), Twofish, RC6, and Mars, with two series: 128-bit key scheduling and 3-in-1 (128, 192, 256 bit) key scheduling. Bar values as extracted: 606, 202, 105, 103, 57 and 443, 202, 105, 104, 57]
Lessons Learned
Results for ASICs matched results for FPGAs very well, and both were very different from the software results
Serpent fastest in hardware, slowest in software
[Figure: speed comparison in two panels: FPGA (GMU+USC, Xilinx Virtex XCV-1000) and ASIC (NSA Team, 0.5 µm MOSIS); Serpent shown as x8 and x1]
Hardware results matter!
[Figure: Speed in FPGAs compared with votes at the AES 3 conference; final round of the AES Contest, 2000]
Lessons Learned
• Optimization for maximum throughput
• Single high-speed architecture per candidate
• No use of embedded resources of FPGAs (Block RAMs, dedicated multipliers)
• Single FPGA family from a single vendor: Xilinx Virtex
FPGA Evaluations
                               AES          eSTREAM                  SHA-3
Multiple FPGA families         No           No                       Yes
Multiple architectures         No           Yes                      Yes
Use of embedded resources      No           No                       Yes
Primary optimization target    Throughput   Area, Throughput/Area    Throughput/Area
Experimental results           No           No                       Yes
Availability of source codes   No           No                       Yes
ASIC Evaluations
                               AES          eSTREAM                  SHA-3
Multiple processes/libraries   No           No                       Yes
Multiple architectures         No           Yes                      Yes
Primary optimization target    Throughput   Power x Area x Time      Throughput/Area
Post-layout results            No           Yes                      Yes
Experimental results           No           Yes                      Yes
Availability of source codes   No           No                       Yes
Benchmarking Tools
Tools for Benchmarking Implementations of Cryptography
• Software: eBACS, D. Bernstein (UIC), T. Lange (TUE), 2006-present
• FPGAs: ATHENa, K. Gaj, J. Kaps, et al. (GMU), 2009-present
• ASICs: ?
Benchmarking
eBACS: ECRYPT Benchmarking of Cryptographic Systems
• measurements on multiple machines (currently over 90)
• each implementation is recompiled multiple times (currently over 1600 times) with various compiler options
• time measured in clock cycles/byte for multiple input/output sizes
• median, lower quartile (25th percentile), and upper quartile (75th percentile) reported
• standardized function arguments (common API)
SUPERCOP - toolkit developed by D. Bernstein and T. Lange for measuring performance of cryptographic software
SUPERCOP Extension for Microcontrollers – XBX: 2009-present
Developers:
• Christian Wenzel-Benner, ITK Engineering AG, Germany
• Jens Gräf, LiNetCo GmbH, Heiger, Germany
Allows on-board timing measurements
Supports at least the following microcontrollers:
• 8-bit: Atmel ATmega1284P (AVR)
• 32-bit: TI AR7 (MIPS), Atmel AT91RM9200 (ARM 920T), Intel XScale IXP420 (ARM v5TE), Cortex-M3 (ARM)
Benchmarking
ATHENa – Automated Tool for Hardware EvaluatioN
Open-source benchmarking environment, written in Perl, aimed at
AUTOMATED generation of OPTIMIZED results for
MULTIPLE hardware platforms.
The most recent version 0.6.2 released in June 2011.
Full features in ATHENa 1.0 to be released in 2012.
Why Athena?
"The Greek goddess Athena was frequently called upon to settle disputes between the gods or various mortals. Athena Goddess of Wisdom was known for her superb logic and intellect. Her decisions were usually well-considered, highly ethical, and seldom motivated by self-interest."
from "Athena, Greek Goddess of Wisdom and Craftsmanship"
Basic Dataflow of ATHENa
[Figure: numbered dataflow (steps 0 to 8) between the Designer, the User, and the ATHENa Server. Labels: Interfaces; HDL + scripts + configuration files; FPGA Synthesis and Implementation; Result Summary + Database Entries; HDL + FPGA Tools; Database Entries; Download scripts and configuration files; Database query; Ranking of designs]
Three Components of the ATHENa Environment
• ATHENa Tool
• ATHENa Database of Results
• ATHENa Website
ATHENa – Database of Results
ATHENa Database
ATHENa Database – Result View
• Algorithm parameters
• Design parameters
  - Optimization target
  - Architecture type
  - Datapath width
  - I/O bus widths
  - Availability of source code
• Platform
  - Vendor, Family, Device
• Timing
  - Maximum clock frequency
  - Maximum throughput
• Resource utilization
  - Logic blocks (Slices/LEs/ALUTs)
  - Multipliers/DSP units
• Tools
  - Names & versions
  - Detailed options
• Credits
ATHENa Database – Compare Feature
Matching fields in grey
ATHENa Website
http://cryptography.gmu.edu/athena/
• Download of ATHENa Tool
• Links to related tools
SHA-3 Competition in FPGAs & ASICs
• Specifications of candidates
• Interface proposals
• RTL source codes
• Testbenches
• ATHENa database of results
ATHENa Result Replication Files
• Scripts and configuration files sufficient to easily reproduce all results (without repeating optimizations)
• Automatically created by ATHENa for all results generated using ATHENa
• Stored in the ATHENa Database
In the same spirit of Reproducible Research as:
• Patrick Vandewalle (EPFL), Jelena Kovacevic (CMU), and Martin Vetterli (EPFL), "Reproducible research in signal processing - what, why, and how," IEEE Signal Processing Magazine, May 2009. http://rr.epfl.ch/17/
• J. Claerbout (Stanford University), "Electronic documents give reproducible research a new meaning," in Proc. 62nd Ann. Int. Meeting of the Soc. of Exploration Geophysics, 1992. http://sepwww.stanford.edu/doku.php?id=sep:research:reproducible:seg92
Benchmarking Goals Facilitated by ATHENa
1. cryptographic algorithms
2. hardware architectures or implementations of the same cryptographic algorithm
3. hardware platforms from the point of view of their suitability for the implementation of a given algorithm (e.g., choice of an FPGA device or FPGA board)
4. tools and languages in terms of quality of results they generate (e.g., Verilog vs. VHDL, Synplicity Synplify Premier vs. Xilinx XST, ISE v. 13.1 vs. ISE v. 12.3)
Your Project:
Implementation and Benchmarking of Authenticated Ciphers
Features of Authenticated Ciphers
1. Confidentiality
2. Message integrity
3. Message authentication
[Figure: three diagrams with Bob, Alice, and Charlie, one per feature]
All Projects - Organization
• Projects divided into phases
• Deliverables for each phase submitted through Blackboard at selected checkpoints and evaluated by the instructor and/or TA
• Feedback provided to students on a best effort basis
• Final report and codes submitted using Blackboard
Honor Code Rules
• All students are expected to write and debug their codes individually
• Students are encouraged to help and support each other in all problems related to the
  - operation of the CAD tools
  - understanding of an investigated algorithm and existing implementations