Storage in Cloud Environments
Cloud Storage – Storage Cloud
Dietmar Noll | dnoll@de.ibm.com
IBM Storage Software Development
11 March 2010
Objectives
Understand the role of storage in cloud environments
Understand the storage-related requirements / goals in a cloud, both from
the consumer and from the provider perspective
Know the different solution approaches available today to implement
storage in cloud environments
Introduction
Requirements
Implementation Options
Cloud Goals
First and foremost, clients want a lower cost option
They also want to only pay for what they use.
And they want to get it fast, when they need it.
Finally, they want it to be really easy to manage.
[Figure: IT Cloud]
Cloud Administrator: Monitor & Manage Services & Resources
Component Vendors / Software Publishers: Publish & Update Components, Service Templates
Service Consumers: Access Services
IT Cloud: Service Catalog, Component Library; Datacenter Infrastructure
Why is storage so relevant in the cloud environment?
Data Growth is Exponential
The world's total data per person:
2003: 0.8 GB/person
2006: 24 GB/person
2010: 128 GB/person
Digital information created, captured, replicated worldwide:
2006: 180 exabytes
2007: 280 exabytes
...
2011: 1800 exabytes (1800 billion gigabytes)
Expected compound annual growth rate is almost 60%
● Variety of Information
● Information Technology holds the promise of bringing a variety of new types of information to the people who need it
● Volume of Data
● Data is growing exponentially
● Velocity of Change
● Acquisitions
● Mergers
● Consolidations
● ILM, Data Retention initiatives
Sources:
IDC, Worldwide Disk Storage Systems 2007-2011 Forecast Update, Doc #209490
IDC Whitepaper: The Diverse and Exploding Digital Universe, March 2008
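The "almost 60%" growth rate can be checked against the 2006 and 2011 figures on this slide. A minimal sketch, assuming the rate is a plain compound annual growth rate over the five years; the function name is illustrative only.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values `years` apart."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures from the slide: 180 exabytes in 2006, 1800 exabytes in 2011.
rate = cagr(180, 1800, 2011 - 2006)
print(f"CAGR 2006-2011: {rate:.1%}")  # prints "CAGR 2006-2011: 58.5%"
```

A tenfold increase over five years works out to roughly 58.5% per year, which matches the "almost 60%" quoted.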
Why is storage so relevant in the cloud environment?
The Information Tidal Wave Continues...
External disk shipments & price (History & Forecast)
Year   Shipped (PB)   Price ($/GB)
2000        319          96.93
2001        382          58.27
2002        600          34.90
2003        870          24.46
2004       1403          15.93
2005       2217          11.11
2006       3402           7.71
2007       5341           5.22
2008       8360           3.48
2009      13184           2.30
2010      20786           1.51
2011      32838           0.99
■ Hard Drives are CHEAP!
■ Yes, it’s a good thing that the price of disk drives keeps coming down because we’re going to need a lot more of them to hold all that data.
■ Buy more disk?
■ More floor space?
■ More power requirements?
■ More people?
■ More skills?
■ More technologies?
■ What’s our strategy?
■ Leverage an integrated management approach, providing smart support to better utilize the storage environment
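The shipment and price series from the chart can be combined to see why falling prices alone do not settle the question. This is a rough sketch, assuming the values pair with the years 2000 to 2011 by position, that 1 PB = 10^6 GB, and that the quoted $/GB applies to all capacity shipped in that year; the names are illustrative.

```python
# Values transcribed from the chart; pairing with years by position is an assumption.
YEARS = list(range(2000, 2012))
SHIPPED_PB = [319, 382, 600, 870, 1403, 2217, 3402, 5341, 8360, 13184, 20786, 32838]
PRICE_USD_PER_GB = [96.93, 58.27, 34.90, 24.46, 15.93, 11.11,
                    7.71, 5.22, 3.48, 2.30, 1.51, 0.99]

GB_PER_PB = 1_000_000  # assumption: decimal units (1 PB = 10^6 GB)


def implied_spend_billion_usd(shipped_pb: float, usd_per_gb: float) -> float:
    """Shipped capacity times unit price, in billions of US dollars."""
    return shipped_pb * GB_PER_PB * usd_per_gb / 1e9


spend = {
    year: implied_spend_billion_usd(pb, price)
    for year, pb, price in zip(YEARS, SHIPPED_PB, PRICE_USD_PER_GB)
}

capacity_factor = SHIPPED_PB[-1] / SHIPPED_PB[0]
print(f"Capacity shipped grew about {capacity_factor:.0f}x from 2000 to 2011")
for year in (2000, 2011):
    print(f"{year}: about ${spend[year]:.1f} billion implied spend")
```

Under these assumptions the implied hardware spend is in the low $30 billion range at both ends of the period, while the capacity to be housed, powered and managed grows about a hundredfold.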
Source: IDC 2008 DC, 2008
Globally, storage requirement is 80% file-based unstructured data, and growing
Explosion of data, transactions, and digitally-aware devices strains IT infrastructure and operations. Storage capacity is doubling every 18 months.
The majority of this data is unstructured and file-based, such as user files, medical images, web and rich media content, growing at 63%
Block storage, while still well suited for existing OLTP/database workloads, is not where the majority of strategic analytics-based applications and strategic storage initiatives are being deployed
Source: IDC, State of File-Based Storage Use in Organizations: Results from IDC's 2009 Trends in File-Based Storage Survey: Dec 2009: Doc # 221138
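The two growth statements on this slide ("doubling every 18 months" and "growing at 63%") can be put on the same scale. A small sketch, assuming both describe steady compound growth and that 63% is an annual rate; the function names are illustrative.

```python
import math


def annual_growth_from_doubling(months_to_double: float) -> float:
    """Annual growth rate implied by a fixed doubling time."""
    return 2 ** (12 / months_to_double) - 1


def doubling_months_from_annual_growth(annual_rate: float) -> float:
    """Doubling time in months implied by a fixed annual growth rate."""
    return 12 * math.log(2) / math.log(1 + annual_rate)


print(f"Doubling every 18 months = {annual_growth_from_doubling(18):.1%} per year")
print(f"63% per year = doubling every {doubling_months_from_annual_growth(0.63):.1f} months")
```

Doubling every 18 months corresponds to about 58.7% per year, so the capacity trend and the file-based growth figure are of the same order.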
Worldwide Storage Capacity Shipped by
Segment, 2008–2013
Primary Workloads for Storage Clouds
Web Content Store
– Hyper-scalable storage for large Web 2.0 stores and for other vendors looking to build their own Cloud/SaaS applications
Collaboration Data & General File Storage
– General purpose file storage environments where clients are challenged with the manageability of current NAS systems
Energy & Geo-Sciences
– Energy exploration and geo-sciences require huge addressable namespaces and very high performance.
High Performance Analytics
– Business applications such as financial services interested in cloud deployments with single namespace
Digital Media
– High performance, simplified management for widely varying use cases in digital media environments.
CAE
– Auto / Aero / Electronics design processes experiencing rapid file-centric storage growth as simulation expands.
Retail Banking & Financial Markets
Chemical & Petroleum
Cloud Computing Reference Architecture Overview
Cloud Storage Focus
Cloud Service Developer
Cloud Service Provider
Security & Resiliency
Service Development
Tools
Common Cloud Management Platform
OSS – Operational Support Services
Operational-level functionality for management of Cloud Services
BSS – Business Support Services
Business-level functionality for management of Cloud Services
Cloud Services
IT capability provided to Cloud Service Consumer
(Virtualized) Infrastructure – Server, Storage, Network, Facilities
Infrastructure for hosting Cloud Services and Common Cloud Management Platform
Cloud Service Consumer
Partner Clouds
Consumer In-house IT
• File Storage Systems
• Block Storage Systems
• Backup Systems
• Archiving Systems
• SAN Connectivity
• Storage Provisioning
• Storage Monitoring and Event Management
• Capacity and Performance Management
• Storage Virtualization Management
Storage-related Requirements
Consumer Perspective
“Pay as you go”
“Elasticity”
“Simplicity”
Reduce Costs
– Reduce CAPEX and OPEX
– Increase Efficiency through
virtualization and optimization
– Flexible sourcing
Manage Risk
– Security
– Resiliency
– Compliance
Improve Service
– Higher Quality
– Higher Availability
– Higher Flexibility
Storage-related Requirements
Consumer Perspective
Reduce Information Management Costs
– Reduce/avoid investments in storage HW and SW
– Leverage latest virtualization technology and
efficient processes
– Avoid vendor dependencies
Manage Storage-related Risks
– Ensure physical and logical security
– Leverage backup and disaster recovery skills and
technologies
– Ensure information retention periods and
accessibility
Improve Storage Services
– Provide service based on application requirements
– Leverage storage availability (HA) skills and
technologies
– Quickly react to requests without dependency on
own HW/SW.
Storage-related Requirements
Provider Perspective
Multi-Tenancy
– Provide appropriate management separation for the storage
environment while still allowing integrated management of all storage aspects
Security
– Prevent (logically and physically) unauthorized access or even modification to
data stored in the environment
High Availability
– Ensure the data can be accessed according to the needs at any time
Utilization
– Optimally leverage the assets available in the infrastructure to avoid
unnecessary investments
Monitoring
– Achieve situational awareness of the health of the environment and react to
events in the environment
Storage-related Requirements
Provider Perspective - continued
Metering
– Capture information about the environment's capabilities (storage
capacity, performance, etc.) and usage and keep historical records of it
Reporting
– Allow visualization and analysis of the data collected about the storage
environment to be used for regular reports, problem determination, etc.
Planning
– Combine the information about the current and historic environment as
well as other sources to plan future changes to the environment to
optimize ROI
Automation
– Achieve a high level of automation of configuration tasks to minimize
OPEX.
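Metering with historical records is what makes "pay as you go" charging possible. The following is a minimal sketch, assuming one usage sample per tenant per day and a flat price per GB-day; the record layout, tenant names and price are hypothetical and not taken from any product.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageSample:
    """One metering record: capacity used by a tenant on a given day."""
    tenant: str
    day: int          # day index within the billing period
    used_gb: float


def gb_days_per_tenant(samples):
    """Sum daily usage into GB-days, keeping tenants separate."""
    totals = defaultdict(float)
    for sample in samples:
        totals[sample.tenant] += sample.used_gb
    return dict(totals)


def charges(samples, price_per_gb_day: float):
    """Usage-based charge per tenant for the metered period."""
    return {
        tenant: round(gb_days * price_per_gb_day, 2)
        for tenant, gb_days in gb_days_per_tenant(samples).items()
    }


samples = [
    UsageSample("tenant-a", 1, 100.0),
    UsageSample("tenant-a", 2, 150.0),
    UsageSample("tenant-b", 1, 40.0),
]
bill = charges(samples, price_per_gb_day=0.01)
print(bill)
```

Keeping the raw samples rather than only the totals is what allows the same data to feed reporting and planning later.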
Defining Requirements for Storage Services
Service Level Categories – Service Level Objectives
Accessibility
– Initial Access Time
– Data Sharing
– Requires Access Transparency
– Max. Out Of Space Duration
Availability
– Availability Period
– Planned Downtime
– Max. Unplanned Downtime Aggregate
– Max Unplanned Downtime Per Instance
– Recovery Point Objective (RPO)
– Recovery Time Objective (RTO)
– Consistency
– Number of Copies
– Number of Versions
– Retain Deleted
Performance
– Avg. I/O Rate
– Avg. Data Throughput
Retention / Compliance
– Immutability
– Disposal
– Durability
– Retention Period
Security
– Accountability
– Integrity
– Authenticity
– Confidentiality
– Physical Security
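An availability objective becomes easier to discuss once it is translated into a downtime budget. A short sketch, assuming a 24x7 availability period over a 365-day year; the targets shown are examples, not values from the slides.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: 24x7 availability period, non-leap year


def max_downtime_hours(availability_percent: float,
                       period_hours: float = HOURS_PER_YEAR) -> float:
    """Downtime allowed per period for a given availability target."""
    return period_hours * (1 - availability_percent / 100)


for target in (99.0, 99.9, 99.99):
    print(f"{target}% availability allows {max_downtime_hours(target):.2f} hours of downtime per year")
```

For example, 99.9% leaves about 8.76 hours per year, which then has to be split between planned downtime and the unplanned downtime objectives.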
Defining Requirements for Storage Services
Service classes / SLA templates to allow simple classification
Accessibility
Availability
Performance
Consistency
Retention / Compliance
Security
Service Class “Platinum”
Service Class “Gold”
Service Class “Silver”
Service Class “Bronze”
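Service classes work as SLA templates: a request states its objectives and is mapped to the least demanding class that still satisfies them. The sketch below illustrates the idea; the attribute set and all numeric values are hypothetical and only the four class names come from the slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ServiceClass:
    name: str
    availability_percent: float
    rpo_minutes: int   # Recovery Point Objective
    rto_minutes: int   # Recovery Time Objective
    copies: int


# Ordered from least to most demanding; all values are illustrative assumptions.
SERVICE_CLASSES = [
    ServiceClass("Bronze", 99.0, 1440, 2880, 1),
    ServiceClass("Silver", 99.5, 240, 480, 2),
    ServiceClass("Gold", 99.9, 60, 120, 2),
    ServiceClass("Platinum", 99.99, 0, 15, 3),
]


def classify(min_availability: float, max_rpo_minutes: int,
             max_rto_minutes: int, min_copies: int = 1) -> Optional[ServiceClass]:
    """Return the first (least demanding) class meeting all objectives, or None."""
    for service_class in SERVICE_CLASSES:
        if (service_class.availability_percent >= min_availability
                and service_class.rpo_minutes <= max_rpo_minutes
                and service_class.rto_minutes <= max_rto_minutes
                and service_class.copies >= min_copies):
            return service_class
    return None


match = classify(min_availability=99.5, max_rpo_minutes=120, max_rto_minutes=240)
print(match.name if match else "no class satisfies the request")
```

Picking the least demanding matching class keeps the consumer from paying for objectives the application does not need.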
Storage Approaches
Block Storage – File Storage | DAS – SAN – NAS
FC Network
LAN
Block Storage
File Storage
DAS
SAN
NAS
ATA SATA SAS SCSI FCP FCoE iSCSI NFS SMB/CIFS FTP SCP
Traditional Storage Overview
Columns: Type/Technology, Benefits, Constraints, Application
Media – CD/DVD/Diskettes…
– Benefits: cheap, convenient, portable, compatible, long life, …
– Constraints: low capacity, moderate speed, …
– Application: archiving, distribution, migration, local sharing, …
HD - Internal/External
– Benefits: most common form of storage, high speed
– Constraints: limited capacity, local environment
– Application: storage in single computer
Tape
– Benefits: low cost, portability, unlimited capacity
– Constraints: slow/uneasy recovery of individual files/groups of files
– Application: data archiving, low-budget businesses, offsite storage
DAS - Direct-Attached Storage
– Benefits: simplicity, low initial cost, ease of management
– Constraints: individual server, admin, and data transfer in network env.
– Application: data sharing, backup, archiving, app sharing
Disk Library
– Benefits: high speed, capacity, availability
– Constraints: not as quickly accessible as DAS
– Application: "write once, read rarely" data
RAID - Redundant Array of Independent Disks
– Benefits: high speed, capacity, availability, reliability, security & fault tolerance
– Constraints: recovery could be difficult, high cost for system optimization
– Application: swap files, Internet Service Providers, redundant storage
SAN - Storage Area Network
– Benefits: large block data storage, reliability, availability, fault tolerance, scalability
– Constraints: high cost, lack of standardization, management complexity
– Application: large databases, applications that need bandwidth, mission-critical applications
NAS – Network-Attached Storage
– Benefits: multi-client fast file access, easy file sharing, replication, redundancy, consolidation
– Constraints: less convenient than SAN for moving large blocks of data
– Application: data backup, data archiving, redundant storage
Fibre Channel for data transmission in SAN
– Benefits: gigabit speed, large data xfer, flexible distance btw devices
– Constraints: high cost, management complexity
– Application: see SAN
iSCSI for IP-based data transmission in SAN
– Benefits: IP protocols, longer distance than Fibre
– Constraints: slower than Fibre, management complexity
Storage in the Cloud
Implementation Approaches and Management
[Figure labels: Computer Management System, OSS, Virtualized Computer (Applications, Operating System), Infrastructure, Block Storage, File Storage, Data Services, Data Path, Management Path, Storage Management System]
Different interfaces used for application-level / infrastructure-level data access
Integration between different management domains to coordinate operations
Additional management path needed to give applications/operating systems control over certain aspects of storage management
Data Management in Cloud Storage
Cloud Storage
Online Storage
Mirror
Copies
Backup
Copies
Archive
Copies
Meta Data
&
Policies
Storage
Management
System
Primary copy of data managed, which the application is using during its normal
operation.
Supplemental data about the data managed as well as
policies related to the management of this data
Multiple copies managed by the
cloud to fulfill SLAs
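The copies the cloud keeps to fulfill an SLA have a direct capacity cost. A rough sketch, assuming every mirror, backup and archive copy is a full copy of the primary data (no deduplication, compression or incremental backups); the policy fields and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CopyPolicy:
    """Number of additional copies kept per primary copy (illustrative)."""
    mirror_copies: int
    backup_copies: int
    archive_copies: int


def raw_capacity_gb(primary_gb: float, policy: CopyPolicy) -> float:
    """Capacity consumed by the primary copy plus all policy-driven copies."""
    total_copies = 1 + policy.mirror_copies + policy.backup_copies + policy.archive_copies
    return primary_gb * total_copies


policy = CopyPolicy(mirror_copies=1, backup_copies=2, archive_copies=1)
print(raw_capacity_gb(500, policy))  # prints 2500
```

This is why the policies attached to the data, and not only the primary volume size, drive what the provider has to provision.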
Storage Technologies in Cloud Storage
Storage Virtualization
Storage Technologies in Cloud Storage
Block Storage Virtualization
Virtualized
Block Storage
Large number of direct connections make management complex and
inefficient
Distinct storage systems mean separation of resources, which leads to
imbalanced utilization
Changes to the storage systems have direct impact
on the storage consumers, potentially causing downtime
Virtualization hides complexity of the environment to ease management from a consumer perspective, increase utilization and decrease dependency
Virtualizer needs to have appropriate scalability and redundancy characteristics
Virtualized environment reduces dependencies and increases performance and availability
Block Storage Virtualization – IBM SAN Volume Controller
How does it work?
Backend Volumes
Unmanaged Disks
MDisks Groups
Virtual Disks
Logical Volumes
Managed Disks
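The layering named above (backend volumes become managed disks, managed disks are pooled into groups, virtual disks are carved from a group) can be illustrated with a simple extent map. This is a simplified sketch of extent-based block virtualization in general, not the SAN Volume Controller implementation; the extent size, class names and round-robin allocation are assumptions made for illustration.

```python
from dataclasses import dataclass

EXTENT_MB = 16  # hypothetical extent size, chosen for illustration


@dataclass
class ManagedDisk:
    """A backend volume that has been taken under management."""
    name: str
    capacity_mb: int
    used_extents: int = 0

    def free_extents(self) -> int:
        return self.capacity_mb // EXTENT_MB - self.used_extents

    def allocate_extent(self) -> int:
        index = self.used_extents
        self.used_extents += 1
        return index


@dataclass
class VirtualDisk:
    """What the host sees; maps each virtual extent to (mdisk name, extent index)."""
    extent_map: list

    def locate(self, offset_mb: int) -> tuple:
        """Translate a virtual offset into (mdisk name, offset on that mdisk)."""
        mdisk_name, extent_index = self.extent_map[offset_mb // EXTENT_MB]
        return mdisk_name, extent_index * EXTENT_MB + offset_mb % EXTENT_MB


@dataclass
class MDiskGroup:
    """A pool of managed disks from which virtual disks are allocated."""
    mdisks: list

    def create_vdisk(self, size_mb: int) -> VirtualDisk:
        needed = -(-size_mb // EXTENT_MB)  # ceiling division
        if needed > sum(m.free_extents() for m in self.mdisks):
            raise ValueError("not enough free extents in the group")
        extent_map = []
        i = 0
        while len(extent_map) < needed:
            # Round-robin over the managed disks so the volume is striped.
            mdisk = self.mdisks[i % len(self.mdisks)]
            if mdisk.free_extents() > 0:
                extent_map.append((mdisk.name, mdisk.allocate_extent()))
            i += 1
        return VirtualDisk(extent_map)


group = MDiskGroup([ManagedDisk("mdisk0", 64), ManagedDisk("mdisk1", 64)])
vdisk = group.create_vdisk(48)
print(vdisk.extent_map)
print(vdisk.locate(40))
```

Because hosts only see the virtual disk, the extent map can be changed behind it, which is what lets backend storage be replaced or rebalanced without touching the consumer.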
Block Storage Virtualization – IBM SAN Volume Controller
Dynamic Infrastructure Support
Dynamically scale performance
– For performance sensitive applications, dynamically add more performance to your existing capacity by adding controller pairs
– …or mix with additional capacity…
Dynamically scale capacity
– For high capacity applications such as archive, dynamically add capacity by adding disk enclosures
– …or mix with additional performance…
Scale capacity in tiers
– Implement a tiered storage infrastructure…
– …with common management interfaces and software functions
Storage Technologies in Cloud Storage
File Storage Virtualization
Virtualized
File Storage
Same problems as seen for block storage: high dependency/complexity, reduced utilization/availability
Additional complexity due to high number of entities, identity management and high dynamics
Equivalent approach to achieve the same goals as seen for block storage: increased utilization, reduced dependencies, improved availability
File Storage Virtualization – IBM SoNAS
How does it work?
[Figure: IBM Scale Out NAS]
Logical view: Global Namespace with Policy Engine
/home/appl/data/web/important_big_spreadsheet.xls
/home/appl/data/web/big_architecture_drawing.ppt
/home/appl/data/web/unstructured_big_video.mpg
Note: all three files, in same directory, but each allocated to different physical storage pool
Physical view: interface nodes and storage nodes, each scaling out
Tier 1: SAS drives
Tier 2: 1TB SATA drives
Tier 3: 2TB SATA drives
Data striped across all disks in storage pool. High performance, tuning, auto-load balancing
File Storage Virtualization – IBM SoNAS
Dynamic Infrastructure Support
Dynamically scale performance
– For performance sensitive applications, dynamically add more performance to your existing capacity by adding controller pairs
– …or mix with additional capacity…
Dynamically scale capacity
– For high capacity applications such as archive, dynamically add capacity by adding disk enclosures
– …or mix with additional performance…
Scale capacity in tiers
– Implement a tiered storage infrastructure…
– …with common management interfaces and software functions
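The SoNAS "How does it work?" slide notes that files in the same directory of the global namespace can each land in a different physical storage pool. A minimal sketch of such policy-based placement follows; the rule format, pool names and the assignment of file types to tiers are assumptions for illustration and do not reproduce the actual SoNAS policy language.

```python
from fnmatch import fnmatchcase

# Hypothetical placement rules: first matching pattern wins.
PLACEMENT_RULES = [
    ("*.xls", "tier1_sas"),
    ("*.ppt", "tier2_sata_1tb"),
    ("*.mpg", "tier3_sata_2tb"),
]
DEFAULT_POOL = "tier2_sata_1tb"  # assumption: fallback pool when no rule matches


def place(path: str) -> str:
    """Return the storage pool for a file; the path seen by users is unchanged."""
    for pattern, pool in PLACEMENT_RULES:
        if fnmatchcase(path, pattern):
            return pool
    return DEFAULT_POOL


for path in (
    "/home/appl/data/web/important_big_spreadsheet.xls",
    "/home/appl/data/web/big_architecture_drawing.ppt",
    "/home/appl/data/web/unstructured_big_video.mpg",
):
    print(f"{path} -> {place(path)}")
```

The point of the global namespace is that the directory path stays the same regardless of which pool the policy engine selects.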
Storage Cloud Offering Examples
Name – Description
Google Docs – Allows users to upload documents, spreadsheets and presentations to Google's data servers. Users can edit files using a Google application. Users can also publish documents so that other people can read them or even make edits.
Web e-mail providers – Providers like Gmail, Hotmail and Yahoo! Mail store e-mail messages on their own servers. Users can access their e-mail from computers and other devices connected to the Internet.
Web Digital Image sites – Sites like Flickr and Picasa host millions of digital photographs. Their users create online photo albums by uploading pictures directly to the services' servers.
Facebook – Social networking sites like Facebook and MySpace allow members to post pictures and other content. All of that content is stored on the respective site's servers.
Amazon S3 – A storage service that lets low to high end subscribers store, retrieve, and share data objects that are application agnostic. Web Service API access. Customer examples include saving personal data, Web startups offering online services, backing up databases, and storing archival data.
Nirvanix SDN – Managed cloud storage for the enterprise. Stores, delivers, and processes storage requests in a system-selected location within a network of storage nodes. CloudNAS application and Web Service API access. Customer examples include off-site data protection, Tier-N storage, distributed content and collaboration, and embedded storage.
VMWare & Dell – Integrated compute and storage infrastructures for cloud services. Virtualize storage within cloud computing infrastructures using VMware vSphere 4.0 and Dell EqualLogic iSCSI SANs.
IBM Smart Business Storage Cloud – Scalable NAS solution provided to a client and its partner network. Customers store, retrieve, and share information in a private environment that drives efficiency, standardization and best practices while retaining greater customization and control.
Integration of Storage Management into Service Management
Visibility of all elements and services in the storage infrastructure
Control of changes to the storage infrastructure
Automation of common actions in the storage infrastructure to reduce administration costs
Storage Infrastructure
Topology Visualization
Storage Infrastructure
Reporting
Storage Infrastructure
Health / Event
Monitoring
Consistent execution of
provisioning operations
Raise alerts based on
events and conditions in
the infrastructure
Continuous
configuration checking
Smart selection of target
entity of control
provisioning operations
Continuous optimization
of storage infrastructure
Problem Determination
(Root Cause Suspect
Identification)
Storage Management in the Cloud
Functional Areas
Discovery & Categorization
– Identify entities in storage
environment and collect
information about them
– Identify capabilities based on data
collected
– Change Management
Reporting & Metering
– Provide access to information
about storage infrastructure
entities and items
– Correlate data to allow metering
at various levels, like per
application, per user, etc.
– Keep historical data to allow
usage based charging
Monitoring & Problem Determination
– Receive events from storage infrastructure
and evaluate for relevance / importance
– Constantly check monitored entities for
abnormal conditions and SLA violations
– Support identification of problem root
causes and corrective actions
Provisioning & Optimization
– Support autonomic provisioning of storage
resources
– Provide “intelligence” to de-couple service
request from storage infrastructure, e.g.
TPC planners
– Continuously optimize storage infrastructure
for efficient SLA-compliant resource usage
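De-coupling the service request from the storage infrastructure means the requester names a service class and a size, and the management layer picks the target. The following is a simplified sketch of such a selection step; the pool attributes, the utilization ceiling and the "least utilized pool wins" rule are assumptions for illustration and do not describe how the TPC planners work.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pool:
    name: str
    service_class: str
    capacity_gb: float
    used_gb: float

    @property
    def utilization(self) -> float:
        return self.used_gb / self.capacity_gb


def select_pool(pools, service_class: str, size_gb: float,
                max_utilization: float = 0.8) -> Optional[Pool]:
    """Pick the least utilized pool of the requested class that stays under the ceiling."""
    candidates = [
        pool for pool in pools
        if pool.service_class == service_class
        and (pool.used_gb + size_gb) / pool.capacity_gb <= max_utilization
    ]
    if not candidates:
        return None  # signal that capacity planning is needed
    return min(candidates, key=lambda pool: pool.utilization)


pools = [
    Pool("pool-a", "gold", 1000, 700),
    Pool("pool-b", "gold", 1000, 300),
    Pool("pool-c", "silver", 1000, 100),
]
target = select_pool(pools, "gold", 200)
print(target.name if target else "no suitable pool")
```

Choosing the least utilized candidate is one simple way to keep utilization balanced; a real planner would also weigh performance data and SLA headroom.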
Storage Management in the Cloud
Direct Functional Coverage
[Figure: Cloud Computing Reference Architecture with storage-related components highlighted]
Cloud Service Consumer: Consumer End User, Consumer Administrator, Consumer Business Manager; Partner Clouds; Customer In-house IT; User Interface; API
Cloud Service Developer: Developer; Service Development Tools, Service Definition Tools, Image Creation Tools; Service Development Portal; API
Cloud Service Provider: Service Business Manager, Service Operations Manager, Service Transition Manager, Service Security Manager; Service Provider Portal; Service Delivery Portal
Cloud Services: Software-as-a-Service, Platform-as-a-Service, Infrastructure-as-a-Service, Business-Process-as-a-Service
Common Cloud Management Platform
– BSS (Business Support System): Offering Mgmt, Order Mgmt, Accounting & Billing, Customer Mgmt, Entitlements, Contract Mgmt, SLA Reporting, Pricing & Rating, Peering & Settlement, Subscriber Mgmt, Service Offering Catalog, Invoicing, Metering, Analytics & Reporting
– OSS (Operational Support System): Service Delivery Catalog, Service Templates, Service Automation Management, Virtualization Mgmt, Provisioning, Monitoring & Event Management, IT Asset & License Management, Service Request Management, IT Service Level Management, Image Lifecycle Management, Capacity & Performance Management, Incident, Problem & Change Management, Configuration Mgmt
Virtualized Infrastructure – Server, Storage, Network, Facilities
Security & Resiliency
Components with red frame are directly related to functional coverage for storage
For the other components, the storage management contributes to some extent, but there