www.surrey.ac.uk
Cloud Security and
Performance
Dr Lee Gillam
Department of Computing
University of Surrey
Motivation 1
Security and performance (resilience) are key Cloud issues ... and price.
See, for example, the Cloud Circle report:
Motivation 2
Deperimeterization of any kind presents risks, not least to Intellectual Property
£9.2bn for UK
OCSIA/Detica report – notion of insider assistance
– vs –
$300bn recently mooted in US study
The National Bureau of Asian Research - “literally copy patents from any country and have them filed and granted in China as a utility
Motivation 2
Deperimeterization is inherent in Supply Chains and in Cloud.
So, who is an insider? (How) Do we control them?
Structure
Part 1: When perimeters are down, information can leak.
Policing (monitoring) versus command and control
But can you search for what you don’t want to expose?
Part 2: Are all Clouds (infrastructures) made the same?
What can you expect in performance?
General Cloud Security
Protect data in transit
Protect data at rest
Design for privacy
Protect Cloud credentials (e.g. AWS keys / certificates / “login”)
Secure your application, O/S etc.
Secure the wetware?
Adapted from:
From /var/log/auth.log (ssh “open”)
Apr 22 21:10:09 ip-10-2-34-82 sshd[14681]: Failed password for invalid user harvey from 85.214.234.251 port 55948 ssh2
Apr 22 21:14:54 ip-10-2-34-82 sshd[14683]: Failed password for invalid user hasok from 85.214.234.251 port 38196 ssh2
Apr 22 21:19:41 ip-10-2-34-82 sshd[14688]: Failed password for invalid user hassan from 85.214.234.251 port 51093 ssh2
…
Apr 22 22:21:39 ip-10-2-34-82 sshd[14717]: Failed password for invalid user helen from 85.214.234.251 port 45258 ssh2
Apr 22 22:26:25 ip-10-2-34-82 sshd[14719]: Failed password for invalid user helena from 85.214.234.251 port 40891 ssh2
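A minimal sketch of how log lines like those above could be monitored, assuming the standard sshd "Failed password" message format shown on this slide. The function names and the threshold of 3 are illustrative assumptions, not part of the original talk.

```python
import re
from collections import Counter

# Matches sshd lines of the form shown above, e.g.
# "... sshd[14681]: Failed password for invalid user harvey from 85.214.234.251 port 55948 ssh2"
FAILED_LOGIN = re.compile(
    r"sshd\[\d+\]: Failed password for (?:invalid user )?(?P<user>\S+) "
    r"from (?P<ip>\S+) port \d+"
)

def failed_logins_by_ip(lines):
    """Count failed SSH password attempts per source address."""
    counts = Counter()
    for line in lines:
        match = FAILED_LOGIN.search(line)
        if match:
            counts[match.group("ip")] += 1
    return counts

def suspicious_sources(lines, threshold=3):
    """Addresses with at least `threshold` failures (the threshold is an assumption)."""
    counts = failed_logins_by_ip(lines)
    return sorted(ip for ip, n in counts.items() if n >= threshold)

# The five complete log lines quoted on the slide.
SAMPLE = [
    "Apr 22 21:10:09 ip-10-2-34-82 sshd[14681]: Failed password for invalid user harvey from 85.214.234.251 port 55948 ssh2",
    "Apr 22 21:14:54 ip-10-2-34-82 sshd[14683]: Failed password for invalid user hasok from 85.214.234.251 port 38196 ssh2",
    "Apr 22 21:19:41 ip-10-2-34-82 sshd[14688]: Failed password for invalid user hassan from 85.214.234.251 port 51093 ssh2",
    "Apr 22 22:21:39 ip-10-2-34-82 sshd[14717]: Failed password for invalid user helen from 85.214.234.251 port 45258 ssh2",
    "Apr 22 22:26:25 ip-10-2-34-82 sshd[14719]: Failed password for invalid user helena from 85.214.234.251 port 40891 ssh2",
]

if __name__ == "__main__":
    print(failed_logins_by_ip(SAMPLE))
    print(suspicious_sources(SAMPLE))
```

The alphabetical run of usernames from a single address is the signature of a dictionary attack; counting per source is one simple way to surface it.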
Cloud Security – Privacy Design
Strict separation of sensitive data from nonsensitive; encryption of sensitive.
See: Reese, p80 et seq.
And heterogeneous systems are good!
Organisation-centric consideration: -
customer data?
(77m users of the PlayStation)
Addresses without names? Credit card numbers without names / CCV?
Cloud Security - Tokenization
Systematic data substitution – generate randomly
Credit Card Number: 5434 0201 1199 4444
Tokenized Card Number: 9271 8843 2088 7143
Secure the lookup, use tokenized value everywhere.
Cannot derive data from token.
Credit Card Number: 5434 0201 1199 4444
Encrypted Card Number: Azk92l1f#*xz&#Ynv#*….
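A toy sketch of the tokenization idea above: the token is generated randomly, so it cannot be derived back to the card number without the lookup table. The `TokenVault` class and its in-memory dictionaries are hypothetical illustrations; a real lookup would need to be secured and persisted.

```python
import secrets

class TokenVault:
    """Toy in-memory token vault (hypothetical); the lookup itself must be secured."""

    def __init__(self):
        self._token_to_number = {}
        self._number_to_token = {}

    def tokenize(self, card_number):
        """Return a random digit string of the same length as the card number."""
        digits = card_number.replace(" ", "")
        if digits in self._number_to_token:
            # Reuse the existing token so the same value appears everywhere.
            return self._number_to_token[digits]
        while True:
            token = "".join(secrets.choice("0123456789") for _ in range(len(digits)))
            # Avoid collisions with existing tokens and with the real number.
            if token not in self._token_to_number and token != digits:
                break
        self._token_to_number[token] = digits
        self._number_to_token[digits] = token
        return token

    def detokenize(self, token):
        """Recover the real number; only possible with access to the vault."""
        return self._token_to_number[token]

if __name__ == "__main__":
    vault = TokenVault()
    token = vault.tokenize("5434 0201 1199 4444")
    print(token, "->", vault.detokenize(token))
```

Unlike the encrypted value, the token has no mathematical relationship to the original: there is no key to steal, only the lookup to protect.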
All well and good ...
But what about when you have to share real stuff for people to read?
IPCRESS
IP Protecting Cloud Services in Supply Chains (IPCRESS)
R&D project involving Jaguar Land Rover, GeoLang, Surrey, co-funded by TSB, Innovating in the Cloud – see 2pm, Andrew Tyrer
People need to see stuff in order to do stuff. Especially relevant in new supply chains.
Can you build parts for a car without knowing shape, etc.?
IPCRESS - Analogies
X is a secure system. Y is not. Wetware/meatware bridge: X compromised.
And we don’t want to connect Y to X to detect leaks.
The ‘superinjunction’ (??) problem:
How to find out if data in X has been exposed without suggesting the identity of the [Manchester United and Wales footballer / Downton Abbey actor]
“Private search” – without revealing (or encrypting / hashing) the query
IPCRESS – Can it work?
Research prototype evaluated for plagiarism detection (PAN*), first run on AWS
Latest precision figure (April 2013): 0.88487
‘On data that cannot be seen’
IPCRESS - How
IPCRESS’ patterns are “difficult” to reverse engineer – hash-like, but a ‘highly ambiguous’ or ‘lossy’ hash:
make patterns public ... and seriously scare the data security specialists!
Compact representation – efficient
Patents filed - US and PCT
A kind of search engine (document [passages] [passages] documents), in a federation-like supply chain interaction;
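To give a feel for what a ‘lossy’ or ‘highly ambiguous’ hash might look like, here is a toy sketch. It is NOT the IPCRESS method, which these slides do not describe: the one-bit-per-word mapping, the use of SHA-256, and both function names are assumptions made purely for illustration.

```python
import hashlib
import re

def lossy_pattern(text):
    """Map each word to a single bit, so very many different words share a value.

    Assumption for illustration only: one bit per word, taken from SHA-256.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    return "".join(
        str(hashlib.sha256(word.encode("utf-8")).digest()[0] & 1) for word in words
    )

def longest_shared_run(pattern_a, pattern_b):
    """Length of the longest run of bits common to both patterns."""
    best = 0
    previous = [0] * (len(pattern_b) + 1)
    for bit_a in pattern_a:
        current = [0] * (len(pattern_b) + 1)
        for j, bit_b in enumerate(pattern_b, start=1):
            if bit_a == bit_b:
                # Extend the run that ended at the previous position in both patterns.
                current[j] = previous[j - 1] + 1
                best = max(best, current[j])
        previous = current
    return best

if __name__ == "__main__":
    private = "the bracket uses a tapered flange with six bolt holes spaced evenly around the rim"
    public = "forum post says " + private + " according to a supplier"
    print(lossy_pattern(private))
    print(longest_shared_run(lossy_pattern(private), lossy_pattern(public)))
```

In this toy, a single bit says almost nothing about a word, yet a long run of matching bits between two documents is unlikely to arise by chance, so the patterns can be compared without exchanging the text itself.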
IPCRESS – and beyond
Applicability to other kinds of data – new research questions – video/audio?
And how fast it can run will depend on the performance of the Cloud systems it runs on ...
PART 2: CLOUD PERFORMANCE
Clouds fail…
“Everything fails, all the time. We lose whole datacenters! Those things happen.” Vogels, CTO Amazon.com
Cloud Performance
We used micro-benchmarks as a means to an end.
Performance of applications is predicated on performance of underlying (virtualized) resources
We know what the label on the box says, but what kind of present do we actually get?
Probability of good/bad performance? Scaling?
What measurements are important for whom?
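A minimal sketch of how the "probability of good/bad performance" question could be put to repeated benchmark runs. The readings below are hypothetical numbers invented for illustration, not results from this work, and the function names and 4000 MB/s threshold are likewise assumptions.

```python
import statistics

def summarize(samples):
    """Basic spread of repeated benchmark results for one machine type."""
    ordered = sorted(samples)
    return {
        "min": ordered[0],
        "median": statistics.median(ordered),
        "max": ordered[-1],
        "stdev": statistics.stdev(ordered),
    }

def probability_below(samples, threshold):
    """Empirical probability that a run falls below `threshold`."""
    return sum(1 for s in samples if s < threshold) / len(samples)

# HYPOTHETICAL memory-bandwidth readings (MB/s) from ten instances of one
# nominally identical machine type; invented purely for illustration.
HYPOTHETICAL_MBPS = [5200, 5100, 3400, 5300, 5250, 2900, 5150, 5050, 5200, 3300]

if __name__ == "__main__":
    print(summarize(HYPOTHETICAL_MBPS))
    print(probability_below(HYPOTHETICAL_MBPS, 4000))
```

Which summary matters depends on the audience: a batch user may care about the median, while an SLA author cares about the tail.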
Cloud Performance
Tested AWS (several regions), Rackspace (UK, US), IBM SmartCloud, OpenStack at Surrey.
Different machine types (flavours), and 2 Linux distros
Using Bonnie++ and IOZone for disk, LINPACK for CPU flops, STREAM for memory bandwidth, iPerf for network bandwidth, MPPTEST for MPI, and a bzip2 application benchmark
We want “simplicity so the results are understandable”, following Gray – non-optimized: “out of the box”
Gray, J. (ed.) (1993), “The Benchmark Handbook For Database and Transaction Processing Systems”. Morgan Kaufmann.
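In the "out of the box" spirit above, a bzip2-style application benchmark can be sketched with Python's standard `bz2` module. This is an illustrative stand-in, not the benchmark used in the study: the synthetic payload, its 4 MiB default size, and the function names are all assumptions.

```python
import bz2
import os
import time

def make_payload(size_bytes=4 * 1024 * 1024):
    """Synthetic data (an assumption): one random 4 KiB block repeated to size."""
    block = os.urandom(4096)
    return (block * (size_bytes // len(block) + 1))[:size_bytes]

def bzip2_throughput(payload, repeats=3):
    """Best observed compression throughput in MiB/s over `repeats` runs."""
    best_seconds = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        bz2.compress(payload)
        best_seconds = min(best_seconds, time.perf_counter() - start)
    return len(payload) / (1024 * 1024) / best_seconds

if __name__ == "__main__":
    data = make_payload()
    print(f"{bzip2_throughput(data):.1f} MiB/s")
```

Running the same script unchanged on each provider and machine type keeps the results comparable and understandable.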
Cloud Performance
Quality of Service (QoS) for Service Level Agreements (SLAs)
SLAs for Brokers (provider agreements may remain unchanged or barely worth the RoI of even reading)
A Cloud Broker has no resources of its own
Cloud Broker(s)
Opportunity value overall: Gartner, $100bn by 2014
Types [Gartner / NIST]
Aggregation: multiples of similar stuff
Arbitrage: dynamic pricing (credit scoring cloud providers?)
Customization/Intermediation: “better” view / value add for extant services
Integration: simplified view of multiples of different stuff
NIST Special Publication 500-291 – 3 types (underlined on the slide)
Cloud Broker(s)
As a means to an end?
AWS CloudWatch, alarms for various metrics.
But how do you predict what to scale up/down to? – Cost management
QoS parameters in SLAs allow for Cloud Brokers: quality (performance) as differentiator.
Fixed price not a dynamic market; performance-based … part way there?
Probability of, and penalty for, failure. And, so, a market in “insurance”?
An inspiration:
Collateralized Debt Obligations (CDOs)
• Underlying assets – CDS, a spread indicates level of risk
• Potential for default
• Default correlations important
• Lower order tranches take losses first
(Figure: Financial CDO)
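The "lower order tranches take losses first" rule can be sketched as a simple loss waterfall. The tranche names and sizes below are hypothetical, and the sketch ignores spreads and default correlations, which the bullets above flag as important.

```python
def allocate_losses(tranches, total_loss):
    """Allocate a loss across tranches, most junior (first in the list) first.

    `tranches` is a list of (name, size) pairs ordered junior to senior.
    Returns a dict mapping each tranche name to the loss it absorbs.
    """
    remaining = total_loss
    allocation = {}
    for name, size in tranches:
        absorbed = min(size, remaining)  # a tranche cannot lose more than its size
        allocation[name] = absorbed
        remaining -= absorbed
    return allocation

# HYPOTHETICAL structure, in arbitrary units, for illustration only.
HYPOTHETICAL_TRANCHES = [("equity", 10), ("mezzanine", 20), ("senior", 70)]

if __name__ == "__main__":
    print(allocate_losses(HYPOTHETICAL_TRANCHES, 25))
```

By analogy, if the underlying assets were SLAs rather than debts, the most junior tranche would absorb the first service failures, and the senior tranche would only be touched by widespread, correlated outages.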
Li, B., Gillam, L., and O'Loughlin, J. (2010), Towards Application-Specific Service Level Agreements: Experiments in Clouds and Grids, in Antonopoulos and Gillam (Eds.), Cloud Computing: Principles, Systems and Applications. Springer-Verlag.
Li, B., and Gillam, L. (2009), Towards Job-specific Service Level Agreements in the Cloud, Cloud-based Services and Applications, in 5th IEEE e-Science International Conference, Oxford, UK.
Take away messages
Part 1: When perimeters are down, information leaks.
Protect that which is truly high value
Expect it to leak, and think about how to monitor for leaks
Part 2: Not all Clouds (infrastructures) are the same.
Performance can be known after purchase
Take away messages
The Cloud-qualified are out there: 79 MSc students over 4 years on a module about Cloud Computing
Principles, definitions, etc.
Relatives in Grids, HPC, P2P, Mainframes
Management, Security, Governance
AWS, Google App Engine, MapReduce, OpenStack
Gillam, L., Li, B. and O’Loughlin, J. (2012). Teaching Clouds: Lessons Taught and Lessons Learnt. In Cloud Computing for Teaching and Learning: Strategies for Design and
Take away messages
Literature is available – JoCCASA, articles free to everyone, for ever ...
A financial brokerage model for cloud computing
A constraints-based resource discovery model for multi-provider cloud environments
A multi-level security model for partitioning workflows over federated clouds
Contact:
[email protected]
Qs?
The work presented here has been supported in part by the TSB (IPCRESS) and in the recent past by EPSRC, JISC, TSB (KTP), amongst others.