Srini Devadas
Massachusetts Institute of Technology
What Role Does Hardware
Design Have to Play in
Agenda
• Cyber threats today
• Defensive strategies
• Role of hardware design
• Worm enters system through downloaded file.
• Payload encrypts the user’s hard drive and deletes the original files, so the user cannot decipher his/her own files
• Pay $500 in Bitcoin to get your files back!
Attacks on Individuals
Ransomware
MIT COMPUTER SCIENCE AND ARTIFICIAL INTELLIGENCE LABORATORY
https://nakedsecurity.sophos.com/2012/11/02/anonymous-ransomware/
Attacks on Services
Target in 2013
40 million: The number of credit and debit cards thieves stole.
70 million: The number of records stolen that included the name, address, email address and phone number of Target shoppers.
46%: The drop in profits in the 4th quarter of 2013, compared to 2012.
$200 million: Estimated cost for reissuing 21.8 million cards.
$53.7 million: The income that hackers likely generated from the sale of 2 million stolen cards at about $26 per card.
0: Number of customer cards with AVAILABLE hardware security technology that would have been able to stop the bad guys from stealing.
Attacks on Infrastructure
The Stuxnet Cyberphysical Attack
• A 500 Kbyte computer worm that infected the software of at least 14 industrial sites in Iran, including a nuclear facility
• Goal was to cause fast-spinning centrifuges to tear themselves apart
• Stuxnet was tracked down by Kaspersky Lab, but not before it did some damage
• Cybersecurity is a property of computer systems, similar to performance and energy
• Attackers take a holistic view by attacking any component or interface of a system
• Diverse threat models dictate different desirable security properties
– Viruses and worms: Bug-free programs
– Denial-of-Service attacks: Redundant resources
– Cyberphysical attacks: Tamper-resistant hardware
• Computer systems are so complex that it is impossible to design them without vulnerabilities.
• Therefore, the best we can do is to:
– Focus on existing computing systems and their attacks to discover flaws
– Design mechanisms into these systems to protect against these attacks
– Manage risk and administer systems well
• Unfortunately, new flaws are always discovered…
• We need to do better than this “Patch & Pray, Perimeter Protection” mindset
• A security property cannot be articulated well when isolated to a component or layer
→ need a systems-wide, architectural viewpoint
• New theoretical and practical foundations of secure computing that integrate security in the design process
→ security “by default”
→ remove program error as a source of vulnerability
• Need researchers from diverse disciplines, e.g., systems and application software designers, architects, and digital system designers, to tackle the problem
• Prevention: Increasing the difficulty of attacks
• Resilience: Allowing a system to remain functional despite attacks
• Detection and Recovery: Allowing systems to more quickly detect and recover from attacks to a fully functional state
Can implement security functionality in hardware, e.g.,
– Encryption
– Message authentication
– Network packet inspection
– Etc.
to improve performance and lower energy
The Obvious Role of Hardware in Cybersecurity
Prevention
Traditional Device Authentication
• Each IC needs to be unique
– Embed a unique secret key SK in on-chip non-volatile memory
• Use cryptography to authenticate an IC
• Cryptographic operations can address other problems such as protecting IP or secure communication
[Figure: the verifier, holding the IC’s public key, sends a random number; the IC signs the number with its secret key.]
→ Only the IC’s key can generate a valid signature
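The challenge-response exchange on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the slide describes a public-key signature, while the sketch below swaps in a symmetric HMAC from the Python standard library, so the verifier holds the same key as the chip. The names `Chip` and `authenticate` are hypothetical.

```python
import hashlib
import hmac
import secrets


class Chip:
    """Hypothetical IC holding a secret key in on-chip non-volatile memory."""

    def __init__(self, secret_key: bytes):
        self._sk = secret_key

    def respond(self, nonce: bytes) -> bytes:
        # Stand-in for "sign the number with a secret key" (HMAC, not a signature).
        return hmac.new(self._sk, nonce, hashlib.sha256).digest()


def authenticate(chip: Chip, verification_key: bytes) -> bool:
    nonce = secrets.token_bytes(16)  # fresh random number for every attempt
    expected = hmac.new(verification_key, nonce, hashlib.sha256).digest()
    return hmac.compare_digest(chip.respond(nonce), expected)


sk = secrets.token_bytes(32)
genuine = Chip(sk)
counterfeit = Chip(secrets.token_bytes(32))  # does not know sk
print(authenticate(genuine, sk), authenticate(counterfeit, sk))
```

A fresh nonce per attempt is what prevents a recorded response from being replayed.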
BUT…
• How to generate and store secret keys on ICs in a secure and inexpensive way?
– Adversaries may physically extract secret keys from non-volatile memory
– Trusted party must embed and test secret keys in a secure location
• What if cryptography is NOT available?
– Extremely resource (power) constrained systems such as passive RFIDs
– Commodity ICs such as FPGAs
Invasive probing
Non-invasive measurement
Physical Unclonable Functions (PUFs)
• Extract secrets from a complex physical system
• Because of random process variations, no two integrated circuits are identical, even with the same layout
– Variation is inherent in the fabrication process
– Hard to remove or predict
• Delay-Based Silicon PUF concept (2002)
– Generate keys from unique delay features of chips
[Figure: a challenge applied to a combinatorial circuit]
Why PUFs?
• PUF can enable secure, low-cost authentication w/o crypto
– Use PUF as a function: challenge → response
– Only an authentic IC can produce a correct response for a challenge
– Inexpensive: no special fabrication technique
• PUF can generate a unique secret key / ID
– Physically secure: volatile secrets, no need for trusted programming
– Can integrate key generation into a secure processor
An Arbiter-Based Silicon PUF
• Compare two paths with an identical delay in design
– Random process variation determines which path is faster
– An arbiter outputs a 1-bit digital response
• Multiple bits can be obtained by either duplicating the circuit or using different challenges
– Each challenge selects a unique pair of delay paths
[Figure: arbiter PUF. An n-bit challenge configures the switch stages, a rising edge races along both paths, and the arbiter (a D latch) outputs 1 if the top path is faster, else 0.]
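A software model can make the arbiter idea concrete. The sketch below assumes the commonly used additive delay model, in which each stage contributes a small random delay difference and the response is the sign of their challenge-dependent sum. The Gaussian weights, the seeds, and the function names are illustrative assumptions, not measurements from a chip.

```python
import random


def make_puf(n_stages, seed):
    """One simulated chip: n+1 delay-difference weights from 'process variation'."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n_stages + 1)]


def features(challenge):
    """phi[i] = product of (1 - 2*c[j]) for j >= i, followed by a constant 1."""
    phi, acc = [], 1
    for bit in reversed(challenge):
        acc *= 1 - 2 * bit
        phi.append(acc)
    phi.reverse()
    return phi + [1]


def response(puf, challenge):
    # The arbiter outputs 1 if the accumulated delay difference is positive.
    delay_diff = sum(w * f for w, f in zip(puf, features(challenge)))
    return 1 if delay_diff > 0 else 0


rng = random.Random(0)
challenges = [[rng.randint(0, 1) for _ in range(64)] for _ in range(256)]
chip_a, chip_b = make_puf(64, seed=1), make_puf(64, seed=2)
id_a = [response(chip_a, c) for c in challenges]
id_b = [response(chip_b, c) for c in challenges]
inter_chip_distance = sum(a != b for a, b in zip(id_a, id_b))
print("bits that differ between the two chips:", inter_chip_distance, "of 256")
```

The model has no measurement noise, so the same simulated chip always repeats its responses; real chips show a small nonzero intra-chip distance.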
Arbiter Experiments
[Figure: PUF Response: Average Code Distances. 128 (2x64) bit, RFID MUX PUF Rev.Ax1 M3 vs. Rev.Ax8 M3 @ +25°C. x-axis: code distance [bits], 0 to 128; y-axis: millions; series: intra-chip and inter-chip for Rev.Ax1 and Rev.Ax8; annotations: 64 stage, 512 stage.]
Arbiter is not a PUF (clonable!)
• Introduced in a 2003 paper, and shown in the same paper to be susceptible to a machine learning model-building attack
[Figure: Rev.A PUF Model/Data Correlation Levels. x-axis: number of challenges (= single response bits) taught, 16 to 256; y-axis: model output match level to 16,384 bits of real Rev.A PUF data (teaching set included), 0% to 100%; series: CFMin, CFAvg, CFMax.]
Need to add nonlinearity to circuit
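The model-building attack can be illustrated on a simulated arbiter PUF. The sketch assumes the additive delay model (responses are the sign of a linear function of a transformed challenge) and uses a plain perceptron as the learner. The original experiments used real chip data and their own learner, so the stage count, sample sizes, and seed here are illustrative assumptions.

```python
import random


def features(challenge):
    """Parity transform that makes the arbiter response a linear threshold function."""
    phi, acc = [], 1
    for bit in reversed(challenge):
        acc *= 1 - 2 * bit
        phi.append(acc)
    phi.reverse()
    return phi + [1]


def respond(weights, phi):
    return 1 if sum(w * f for w, f in zip(weights, phi)) > 0 else -1


rng = random.Random(42)
N_STAGES = 32
secret = [rng.gauss(0, 1) for _ in range(N_STAGES + 1)]  # the simulated chip


def sample(count):
    """Observe `count` challenge-response pairs from the simulated chip."""
    data = []
    for _ in range(count):
        phi = features([rng.randint(0, 1) for _ in range(N_STAGES)])
        data.append((phi, respond(secret, phi)))
    return data


train, test = sample(2000), sample(1000)

model = [0.0] * (N_STAGES + 1)
for _ in range(30):  # perceptron epochs
    for phi, label in train:
        if respond(model, phi) != label:  # update only on mistakes
            model = [m + label * f for m, f in zip(model, phi)]

accuracy = sum(respond(model, phi) == label for phi, label in test) / len(test)
print(f"model matches the chip on {accuracy:.1%} of unseen challenges")
```

Because the learner never sees the secret weights, a high match rate on unseen challenges is what "clonable" means here.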
XOR Arbiter PUF
• Can process and combine outputs of multiple PUFs
• Simplest version: XOR operation
[Figure: XOR of k PUFs, each with n stages; the same n-bit challenge drives every PUF circuit.]
XOR Arbiter PUF Security
• Machine learning complexity appears to grow as O(n^{k+1}) for k-way XOR over n-stage PUFs
– Size of circuit grows as O(nk)
• n = 64, k = 6 is on the edge of being broken
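To put numbers on these growth rates, the sketch below builds a simulated k-way XOR of arbiter PUFs and evaluates n^(k+1) and nk for n = 64, k = 6. The delay model and the use of a constant factor of 1 in both expressions are assumptions made only for illustration.

```python
import random
from functools import reduce


def make_arbiter(n_stages, rng):
    return [rng.gauss(0.0, 1.0) for _ in range(n_stages + 1)]


def arbiter_bit(weights, challenge):
    """Additive delay model of one arbiter PUF (an assumed model, not a chip)."""
    phi, acc = [], 1
    for bit in reversed(challenge):
        acc *= 1 - 2 * bit
        phi.append(acc)
    phi.reverse()
    total = sum(w * f for w, f in zip(weights, phi + [1]))
    return 1 if total > 0 else 0


def xor_puf_bit(pufs, challenge):
    # Every PUF sees the same challenge; their 1-bit outputs are XORed together.
    return reduce(lambda a, b: a ^ b, (arbiter_bit(p, challenge) for p in pufs))


n, k = 64, 6
rng = random.Random(7)
pufs = [make_arbiter(n, rng) for _ in range(k)]
challenge = [rng.randint(0, 1) for _ in range(n)]
bit = xor_puf_bit(pufs, challenge)

attack_cost = n ** (k + 1)  # O(n^(k+1)) with the hidden constant taken as 1
circuit_size = n * k        # O(nk) with the hidden constant taken as 1
print(bit, attack_cost, circuit_size)
```

With these values the attack estimate is 64^7 = 2^42, while the circuit has only 384 stages.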
8-way XOR experiments
[Figure: PUF Response: Average Code Distances. 128 (2x64) bit, RFID MUX PUF Rev.B vs. (synthesized) Rev.Bx2XOR @ +25°C. x-axis: code distance [bits], 0 to 128; y-axis: millions; series: intra-chip and inter-chip for Rev.B and Rev.Bx2; annotations: 8-way XOR, 4-way XOR.]
Current Limitations of PUFs
• PUF-based authentication is not cryptographically secure, i.e., not reducible to established hard problems
– Combined machine learning and side channel attacks have broken many candidates
– New candidates continually being proposed
• Key generation needs helper data
– Many more bits than key bits
• Proofs of no leakage from helper data make some untested assumptions
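One way to see why helper data outnumbers key bits is a code-offset construction with a repetition code. This is a minimal sketch of that general technique, not the scheme used in any particular PUF product; the repetition factor, the noise pattern, and the names `enroll` and `reconstruct` are assumptions.

```python
import random
import secrets

REP = 5  # repetition factor: each key bit is protected by 5 PUF bits


def enroll(puf_bits):
    """Pick a random key and publish helper = PUF bits XOR repeated key bits."""
    key = [secrets.randbelow(2) for _ in range(len(puf_bits) // REP)]
    codeword = [bit for bit in key for _ in range(REP)]
    helper = [p ^ c for p, c in zip(puf_bits, codeword)]
    return key, helper


def reconstruct(noisy_puf_bits, helper):
    """XOR the helper back in, then majority-vote each group of REP bits."""
    noisy_codeword = [p ^ h for p, h in zip(noisy_puf_bits, helper)]
    return [int(sum(noisy_codeword[i:i + REP]) > REP // 2)
            for i in range(0, len(noisy_codeword), REP)]


rng = random.Random(3)
reference = [rng.randint(0, 1) for _ in range(16 * REP)]  # enrollment reading
key, helper = enroll(reference)

# A later, noisy reading: one flipped bit in every group of five (assumed noise).
noisy = list(reference)
for start in range(0, len(noisy), REP):
    noisy[start + rng.randrange(REP)] ^= 1

recovered = reconstruct(noisy, helper)
print(recovered == key, len(helper), len(key))
```

Here 80 helper bits protect a 16-bit key. The sketch makes no claim about how much the helper data leaks, which is the open issue the slide raises.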
Resilience Under Attack
Trusted Computing Base
The trusted computing base (TCB) is the set of software and hardware components that need to be trusted by a user
In TPM-based systems, the TPM attests the veracity of several millions of lines of (buggy) OS code
The TPM does not provide real security
In the cloud, the TCB is > 20M lines of code from tens of software vendors
Application: Secure Cloud Computing
• Separating processing from access via encryption:
– I will encrypt my data before sending it to the cloud
– They will apply their processing on the encrypted data, send me processed (still encrypted) result
– I will decrypt the result and get my answer
I want to delegate processing of my data, without giving away access to it.
An Analogy: Alice’s Jewelry Store
Courtesy: C. Gentry
• Alice’s workers need to assemble raw materials into jewelry
• But Alice is worried about theft
How can the workers process the raw materials without having access to them?
An Analogy: Alice’s Jewelry Store
Courtesy: C. Gentry
• Alice puts materials in locked glove box
– For which only she has the key
• Workers assemble jewelry in the box
• Alice unlocks box to get “results”
The Analogy
• Encrypt: putting things inside the box
– Alice does this using her key
– c_i ← Enc(m_i)
• Decrypt: taking things out of the box
– Only Alice can do it, requires the key
– m* ← Dec(c*)
• Process: assembling the jewelry
– Anyone can do it, computing on ciphertext
– c* ← Process(c_1, …, c_n)
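The Enc / Process / Dec pattern can be tried out with a much weaker primitive than full homomorphic encryption. The sketch below uses a toy Paillier cryptosystem, which is only additively homomorphic: multiplying ciphertexts adds the hidden plaintexts. The primes are tiny and insecure, and everything here is an illustrative assumption rather than the scheme the analogy refers to.

```python
import math
import random

# Toy key with tiny primes: insecure, for illustration only.
p, q = 293, 433
n = p * q
n2 = n * n
lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1), kept secret
mu = pow(lam, -1, n)                               # valid because g = n + 1


def enc(m):
    """Alice: put a value in the box. Fresh randomness r on every call."""
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def dec(c):
    """Alice: take the result out of the box (needs the secret lam, mu)."""
    return ((pow(c, lam, n2) - 1) // n) * mu % n


def process(ciphertexts):
    """Server: multiply ciphertexts, which adds the plaintexts it cannot see."""
    result = 1
    for c in ciphertexts:
        result = (result * c) % n2
    return result


c_star = process([enc(m) for m in (5, 7, 30)])
print(dec(c_star))
```

The server-side `process` never touches `lam` or `mu`, which mirrors the workers never holding the key to the glove box.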
• Encrypted computation can thus be achieved using Fully Homomorphic Encryption (FHE) without trusting anything on the server side
– Server does not need to store a secret key
• Unfortunately, FHE overheads are about 10^8 to 10^9 for straight-line code, and overheads grow if there is complex control in the program
• Only usable for simple computations
What About Hardware Approaches?
Tamper Resistant Hardware
• Tamper resistant hardware
– The secure processor is trusted, shares secret key with client.
– Private information stored in the hardware is not accessible through external means.
Tamper Resistant Hardware
• Limitations
– Just trusting the tamper resistance of the chip is not enough!
– I/O channels of the secure processor can be monitored by software and leak information
– Examples: address channel, I/O timing channel
Main Memory
Leakage through Address Channel
for i = 1 to N
  if (x == 0)
    sum += A[i]
  else
    sum += A[0]
Address sequence when x == 0: 0x00, 0x01, 0x02
Address sequence otherwise: 0x00, 0x00, 0x00
• The value of x is leaked through the access pattern
• Sensitive data exposed by observing the
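The address-channel leak can be reproduced with a small simulation that records which array indices the loop touches. The sketch uses zero-based indices rather than the slide's 1..N, and the names `run` and `observer_guesses_x_is_zero` are hypothetical.

```python
def run(x, A):
    """Execute the slide's loop; return (sum, list of indices touched)."""
    total, trace = 0, []
    for i in range(len(A)):
        index = i if x == 0 else 0  # the data-dependent choice
        trace.append(index)         # what an observer of the address bus sees
        total += A[index]
    return total, trace


def observer_guesses_x_is_zero(trace):
    # The observer never sees x or the data, only the addresses.
    return len(set(trace)) > 1


A = [10, 20, 30]
_, trace_zero = run(0, A)
_, trace_nonzero = run(1, A)
print([hex(a) for a in trace_zero], [hex(a) for a in trace_nonzero])
print(observer_guesses_x_is_zero(trace_zero), observer_guesses_x_is_zero(trace_nonzero))
```

Encrypting the contents of A would not change either trace, which is why encryption alone does not close this channel.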
A Typical Computer TCB
[Figure: a typical computer’s TCB. Software, ordered by privilege: BIOS (SMM), hypervisor (Ring 0, VMX root), OS kernel (Ring 0), apps including the secure app (Ring 3). Hardware it runs on: CPU (thread, L1 $, L2 $, L3 $, DRAM controller), chipset, DRAM, network, disk, main board. Both stacks are marked TRUSTED.]
• SGX protects a small codebase
• Doesn’t trust OS
• Protected app = “Enclave”
• Provides a trusted environment:
- app integrity
- protects data
• TCB is the Intel CPU – no off-chip interfaces to secure
[Figure: the same software and hardware stacks, with only the enclave and the CPU marked TRUSTED.]
Cache Timing Attack
• (Modern x86 CPU: 4-way set associative L1, L2$, 64B lines)
• Simultaneous multithreading (hyperthreading!)
• A spy, sharing physical core with victim:
- malloc enough data to blow the $
- For each $ line, read to populate all 4 ways
- Use RDTSC to time reads, log results:
- Fast: no evictions occurred.
- Slow: eviction! Victim loaded.
- Slower: writeback! Victim stored!
[Figure: physical memory mapped onto an L1 or L2 $ with 4 ways and S sets]
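The spy's prime-and-probe loop can be simulated without real timing. The sketch below models a tiny 4-way set-associative cache with LRU replacement and treats a probe miss as a "slow" read. The cache size, the address layout, and the function names are assumptions; a real attack would time reads with RDTSC instead of asking the model for hits.

```python
from collections import OrderedDict

WAYS, SETS, LINE = 4, 8, 64  # hypothetical tiny cache: 4 ways, 8 sets, 64B lines


class Cache:
    def __init__(self):
        self.sets = [OrderedDict() for _ in range(SETS)]

    def access(self, addr):
        """Return True on a hit; on a miss, fill the line and evict the LRU way."""
        line = addr // LINE
        ways = self.sets[line % SETS]
        if line in ways:
            ways.move_to_end(line)
            return True
        if len(ways) == WAYS:
            ways.popitem(last=False)
        ways[line] = True
        return False


def spy_addresses(set_index):
    # WAYS distinct spy-owned lines that all map to the given cache set.
    return [(1000 * SETS + set_index + way * SETS) * LINE for way in range(WAYS)]


def attack(victim_addr):
    cache = Cache()
    for s in range(SETS):                 # prime: populate all 4 ways of every set
        for a in spy_addresses(s):
            cache.access(a)
    cache.access(victim_addr)             # victim runs and loads one line
    slow_sets = []
    for s in range(SETS):                 # probe: a miss stands in for a slow read
        hits = [cache.access(a) for a in spy_addresses(s)]
        if not all(hits):
            slow_sets.append(s)
    return slow_sets


secret_index = 5
print(attack(secret_index * LINE))
```

The only set that probes slow is the one the victim touched, so the spy learns part of the victim's address.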
Memory Access Pattern Leakage
• Assume a malicious OS with its own scheduling
→ Mount the same attack as before.
• We can do even better!
→ Orchestrate page mappings
→ cache partitioning!
→ Kernel data separate from attack
→ Very low noise!
[Figure: RAM partitioned into kernel data structures, an allocation for snooping, and enclave data; cache timing attack on the enclave, as before.]
• Protect against all software-based and some hardware-based attacks when running untrusted software
• An adversary cannot learn a user’s private information by observing the pin traffic of Ascend.
Main Memory
Ascend Security Goal
Ascend Processor: Two-Interaction Protocol
• Data transfer only happens twice
• Time = 0
– Input data fed into Ascend (stored in ORAM)
• Time = T
– Output data returned
[Figure: Ascend chip (CPU, L1$, L2$, AES, ORAM interface with periodic access) attached to main memory (4 GB ORAM). At Time = 0 it receives the public input (from the server) and Enc(private input) (from the user); at Time = T it returns Enc(final result).]
Oblivious RAM
• Oblivious RAM (ORAM) [1]
– ORAM allows a client to conceal its access pattern to the remote storage by continuously shuffling and re-encrypting data as they are accessed.
– Any two access sequences of the same length are computationally indistinguishable.
– ORAM does not protect the timing channel, i.e., when accesses are made can still leak information.
[1] O. Goldreich and R. Ostrovsky. Software protection and simulation on oblivious RAMs. J. ACM, 1996.
Naïve Oblivious RAM
• Naïve ORAM
– Each access touches all the N data blocks in main memory
– Blocks are read, re-encrypted using probabilistic encryption and written back
– Dummy blocks are filled to obfuscate memory footprint.
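A minimal sketch of the naïve scheme follows: every access scans all N blocks, re-encrypts each one under a fresh nonce, and writes it back. The cipher is a toy SHA-256 keystream used as a stand-in for real probabilistic encryption (the Python standard library has no AES), and the class and function names are hypothetical.

```python
import hashlib
import secrets

BLOCK = 16  # bytes per data block (illustrative)


def keystream(key, nonce):
    return hashlib.sha256(key + nonce).digest()[:BLOCK]


def encrypt(key, plaintext):
    nonce = secrets.token_bytes(16)  # fresh nonce, so equal data encrypts differently
    return nonce + bytes(a ^ b for a, b in zip(plaintext, keystream(key, nonce)))


def decrypt(key, blob):
    nonce, body = blob[:16], blob[16:]
    return bytes(a ^ b for a, b in zip(body, keystream(key, nonce)))


class NaiveORAM:
    def __init__(self, n_blocks):
        self.key = secrets.token_bytes(32)  # stays inside the trusted processor
        self.memory = [encrypt(self.key, bytes(BLOCK)) for _ in range(n_blocks)]
        self.touched = 0

    def access(self, index, new_value=None):
        result = None
        for i in range(len(self.memory)):  # scan every block, wanted or not
            value = decrypt(self.key, self.memory[i])
            if i == index:
                result = value
                if new_value is not None:
                    value = new_value.ljust(BLOCK, b"\0")
            self.memory[i] = encrypt(self.key, value)  # re-encrypt and write back
            self.touched += 1
        return result


oram = NaiveORAM(8)
oram.access(3, b"secret")
before = list(oram.memory)
value = oram.access(3)
print(value.rstrip(b"\0"), oram.touched)
```

An observer of `memory` sees all 8 blocks rewritten on every access, so reads and writes to any index look the same; the cost is N block operations per access.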
Path ORAM
• Path ORAM is organized as a binary tree.
• Each node contains Z data blocks (cache lines)
• Unoccupied nodes are filled with dummy blocks
– Dummy and real blocks are indistinguishable after encryption
[Figure: a trusted coprocessor with an on-chip ORAM interface, connected to an off-chip binary tree with L levels (levels 0 to 3 labeled).]
Path Oblivious RAM
• Path ORAM*
– Each access only touches O(log(N)) data blocks.
– The most practical ORAM scheme for hardware
* Path ORAM: An Extremely Simple Oblivious RAM Protocol, Emil Stefanov, Marten van Dijk, Elaine Shi, Christopher Fletcher, Ling Ren, Xiangyao Yu, Srinivas Devadas, CCS, 2013.
Compute-bound applications: < 2X overhead
Memory-bound applications: 5-10X overhead
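The tree-based access can be sketched compactly. This is a simplified reading of the Path ORAM idea (position map, stash, read one root-to-leaf path, remap, write the path back) that omits encryption and dummy blocks and places no bound on the stash. The class layout, the default Z = 4, and the seed are assumptions.

```python
import random


class PathORAM:
    def __init__(self, levels, Z=4, seed=0):
        self.L, self.Z = levels, Z
        self.rng = random.Random(seed)
        self.tree = [[] for _ in range(2 ** (levels + 1) - 1)]  # heap-ordered buckets
        self.position = {}  # block id -> leaf it is currently mapped to
        self.stash = {}     # block id -> data, held on-chip
        self.buckets_read = 0

    def _path(self, leaf):
        node = leaf + 2 ** self.L - 1  # heap index of the leaf bucket
        path = [node]
        while node > 0:
            node = (node - 1) // 2
            path.append(node)
        return path[::-1]  # root first

    def access(self, block_id, new_data=None):
        n_leaves = 2 ** self.L
        leaf = self.position.get(block_id, self.rng.randrange(n_leaves))
        self.position[block_id] = self.rng.randrange(n_leaves)  # remap to a new leaf
        path = self._path(leaf)
        for node in path:  # read the whole path into the stash
            self.stash.update(self.tree[node])
            self.tree[node] = []
            self.buckets_read += 1
        data = self.stash.get(block_id)
        if new_data is not None:
            self.stash[block_id] = new_data
        for node in reversed(path):  # write back, deepest bucket first
            fits = [b for b in self.stash
                    if node in self._path(self.position[b])][: self.Z]
            self.tree[node] = [(b, self.stash.pop(b)) for b in fits]
        return data


oram = PathORAM(levels=4)
for i in range(20):
    oram.access(i, f"v{i}")
values = [oram.access(i) for i in range(20)]
print(values[:3], oram.buckets_read)
```

Each access reads and rewrites exactly L + 1 = 5 buckets here, regardless of which block is requested, compared with all N blocks in the naïve scheme.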
Current Limitations of Ascend
• Batch computation model
– The channel between the user and Ascend is only used at beginning and end of computation.
• No interaction during computation
– User input/output, network, disk, etc.
Network
Disk
Oblivious RAM
Detection and Recovery
Attacks on Integrity
• Sometimes one is only concerned with obtaining correct results, not privacy leakage
• Integrity of storage (malicious errors) implies reliability of storage (random errors)
– Solution: Cryptographic hash functions
• Reliability and integrity of computation is a harder problem
– Errors can have catastrophic effects
– Many possible attacks on computation
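For the storage case, the hash-function solution can be sketched as follows: digests are kept in a small trusted table, data lives in untrusted storage, and every read is checked. The class name and the dictionary layout are assumptions; a practical design would typically use a hash tree to keep the trusted state small.

```python
import hashlib


class CheckedStore:
    """Untrusted storage plus a trusted table of SHA-256 digests (assumed layout)."""

    def __init__(self):
        self.untrusted = {}        # an attacker or a fault may change this
        self.trusted_digests = {}  # assumed to live in protected on-chip memory

    def write(self, addr, data: bytes):
        self.untrusted[addr] = data
        self.trusted_digests[addr] = hashlib.sha256(data).digest()

    def read(self, addr) -> bytes:
        data = self.untrusted[addr]
        if hashlib.sha256(data).digest() != self.trusted_digests[addr]:
            raise ValueError(f"integrity violation at address {addr}")
        return data


store = CheckedStore()
store.write(0, b"balance=100")
ok = store.read(0)

store.untrusted[0] = b"balance=999"  # malicious (or random) modification
try:
    store.read(0)
    detected = False
except ValueError:
    detected = True
print(ok, detected)
```

The same check catches a random bit flip and a deliberate edit, which is the sense in which integrity implies reliability.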
Redundant Computations
• Redundancy in the form of retries or parallel computations is key to recovery
• Challenge is to keep overheads manageable → hardware can help
• Key idea: Hardware computes confidence information for each computation
– Confidence low on data from an external source, high on data from trusted sources
Information Flow Tracking
Tracking Confidence
• Architect a processor to track the flow of information through the code
This can be done in software, albeit with greater overhead
Worked well for buffer overflow attacks
Tracking “calculus” becomes more complicated under more sophisticated attacks
Abort computation or retry when confidence falls below threshold
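A software version of confidence tracking might look like the sketch below: every value carries a confidence tag, arithmetic propagates the lowest tag of its inputs, and a sensitive use aborts below a threshold. The propagation rule, the threshold of 0.5, and all names are illustrative assumptions, not the tracking calculus of any specific processor.

```python
from dataclasses import dataclass

THRESHOLD = 0.5  # hypothetical cut-off for security-sensitive uses


class LowConfidence(Exception):
    pass


@dataclass(frozen=True)
class Tracked:
    value: int
    confidence: float  # 1.0 = trusted source, near 0.0 = external input

    def __add__(self, other):
        # Assumed rule: a result is only as trustworthy as its least trusted input.
        return Tracked(self.value + other.value,
                       min(self.confidence, other.confidence))


def use_as_jump_target(t: Tracked) -> int:
    """Abort instead of jumping when the address was derived from untrusted data."""
    if t.confidence < THRESHOLD:
        raise LowConfidence(f"confidence {t.confidence} below {THRESHOLD}")
    return t.value


trusted_base = Tracked(0x4000, 1.0)
offset_from_network = Tracked(0x10, 0.1)
target = trusted_base + offset_from_network

try:
    use_as_jump_target(target)
    aborted = False
except LowConfidence:
    aborted = True
print(hex(target.value), target.confidence, aborted)
```

This mirrors the buffer-overflow case: an address computed from network input inherits low confidence and is refused as a jump target.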
Public-Model Physical Unclonable Functions
• Concept proposed by Koushanfar, Potkonjak
• Simulating the public model takes much longer than evaluating the system
Give me response to challenge
Hardware Trojans
• Suppose the manufacturer of the chip is not trusted
• How can we protect against a malicious
Securing Interaction Under Untrusted Software
• Securing interactions with memory is easy
– Encryption, integrity verification, ORAM
• What about keyboards, displays, network, and other I/O devices?
– How to authenticate a keypress?
– How to authenticate what is being shown on a display?