SCALABLE FILE SHARING AND DATA MANAGEMENT FOR INTERNET OF THINGS
Sean Lee
Solution Architect, SDI, IBM Systems
Agenda
• Converging Technology Forces
• New Generation Applications
• Data Management Challenges for IoT
• IBM Spectrum Scale and Software Defined Infrastructure Solutions
• Q&A
Converging Technology Forces
• Data: the basis for competitive advantage
• Cloud: the growth engine for business
• Engagement: changing expectations, fueled by mobile and social
New Generation of Applications
Traditional Applications: Systems of Record
• Transactions and processes
• Controlled data growth
• Efficiency through virtualization
New Generation Applications: Systems of Engagement
• Insights and engagement
• Rapid pace and massive data scale
• Global elasticity
[Diagram labels: Storage Virtualization (V1, V2, ... Vn); Spectrum Scale; Integrated Storage Management and Data Protection]
Top Data Management Issues from Customer Perspective
Source: InfoPro Survey
[Bar chart, axis from 0% to 70%. Issues shown:]
• Storage Provisioning
• Archiving / Archive Mgmt
• Performance Problems
• Managing Complexity
• Backup Administration
• Managing Costs
• Forecasting / Reporting
• Managing Storage Growth
Data Processes in IoT
• Explore and Store
– Ingest - handle massive amounts of incoming data
– Process - provide required performance to analytics apps
– Store - save valuable data, discard useless data
• Manage and Secure
– Manage - make data available at the right location, to the right person, at the right time
– Protect - encrypt, securely delete, backup valuable data
Manage & Protect
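The Explore and Store steps above can be sketched as a small pipeline. This is only an illustration: the `Reading` record, the three function names and the "valuable" rule (a sample is useless when it has no value) are assumptions made for the example, not part of any IBM product.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List, Optional, Tuple


@dataclass
class Reading:
    sensor_id: str
    value: Optional[float]  # None models an empty or corrupt sample (assumption)


def ingest(raw: Iterable[Tuple[str, Optional[float]]]) -> Iterator[Reading]:
    """Ingest: turn incoming (sensor_id, value) tuples into records."""
    for sensor_id, value in raw:
        yield Reading(sensor_id, value)


def process(readings: Iterable[Reading]) -> Iterator[Reading]:
    """Process: stand-in for analytics; here it just rounds usable values."""
    for r in readings:
        if r.value is not None:
            r = Reading(r.sensor_id, round(r.value, 1))
        yield r


def store(readings: Iterable[Reading],
          is_valuable: Callable[[Reading], bool]) -> List[Reading]:
    """Store: save valuable data, discard useless data."""
    return [r for r in readings if is_valuable(r)]


raw = [("s1", 21.52), ("s2", None), ("s1", 22.04)]
kept = store(process(ingest(raw)), is_valuable=lambda r: r.value is not None)
print(len(kept), "of", len(raw), "readings kept")
```

Generators keep the ingest and process steps streaming, so a larger input would not have to fit in memory before the store step decides what to keep.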
Meet New "Data Silo": Data Pool
[Diagram: four separate silos (Big Data, HPC, Cloud, File Sharing), each with its own Compute, Network and Storage]
• Rigid and manual assignment of redundant IT resources
• Low utilization
• Needless data movement and copying
Traditional File Storage
[Diagram: Filer 1, Filer 2]
More traditional file storage
[Diagram: Filer 1 through Filer 8]
Solution: IBM Spectrum Scale
• Parallel, Scale-Out File Sharing and Data Management Solution
• Global Namespace with automated storage tiering
[Diagram: Filer 1 through Filer 4, Global Namespace with automated storage tiering]
IBM Spectrum Scale Helps You Handle Data Lakes
Designed to scale
• Spectrum Scale has a maximum file system size of one million yottabytes
• 2^63 (~9 quintillion) files per file system
• 1 to 16,384 nodes in a single cluster
• IPv6 support
FUTURE PROOF
Multiples of bytes (metric): YB yottabyte, ZB zettabyte, EB exabyte, PB petabyte, TB terabyte, GB gigabyte, MB megabyte
[Chart annotation] Here’s where an admin usually hits “the data management wall”
Spectrum Scale enables virtually limitless performance and capacity
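The scale figures on this slide can be checked with a few lines of arithmetic. The sketch below assumes the decimal (metric) byte multiples listed on the slide, so 1 YB = 10^24 bytes; the variable names are illustrative only.

```python
# Decimal (metric) byte multiples, as listed on the slide.
UNITS = {
    "MB": 10**6, "GB": 10**9, "TB": 10**12, "PB": 10**15,
    "EB": 10**18, "ZB": 10**21, "YB": 10**24,
}

max_files = 2**63                     # files per file system
max_fs_bytes = 10**6 * UNITS["YB"]    # "one million yottabytes"

print(f"{max_files:,} files")                       # 9,223,372,036,854,775,808 files
print(f"{max_files / 10**18:.2f} quintillion")      # 9.22 quintillion
print(max_fs_bytes == 10**30)                       # True
```

Under that assumption, one million yottabytes is 10^30 bytes, and 2^63 is a little over 9.2 quintillion (9.2 x 10^18).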
IBM Spectrum Scale from IBM Spectrum Storage Family
What is IBM Spectrum Scale?
• Software Defined Storage solution for traditional and new era applications
– All delivered through software, based upon IBM General Parallel File System
– Delivered as software, ready-to-use appliances and cloud-based SaaS
IBM SPECTRUM SCALE
• Spectrum Scale (software)
• Elastic Storage (SaaS)
• Elastic Storage Server (appliance)
Software Defined Storage for Internet of Things
[Diagram labels: High Volume Devices; High Output Instruments; Sensors; Instruments; Images; Videos; IBM Spectrum Scale Data Ingest; GPFS compute nodes; Computation & Analytics; Local Data Access; Low Latency Global Data Access; Single Name Space; Auto-tiering and Migration; Tape Library]
• Right Data
• Right Place
• Right Time
• Right Performance
• 400 GB/s
• 20 PB
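To put the two figures above in proportion, the arithmetic below estimates how long one full pass over 20 PB would take at 400 GB/s. It assumes that both numbers describe the same system, that the units are decimal, and that the rate is sustained for the whole pass; none of this is stated on the slide.

```python
PB = 10**15  # decimal petabyte, in bytes
GB = 10**9   # decimal gigabyte, in bytes

capacity_bytes = 20 * PB
throughput_bytes_per_s = 400 * GB

# Time for one complete read (or write) of the full capacity.
seconds = capacity_bytes // throughput_bytes_per_s
hours = seconds / 3600

print(seconds, "seconds")       # 50000 seconds
print(f"{hours:.1f} hours")     # 13.9 hours
```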
Deployment with Couplets, Tiers and Tape Archive
• Spectrum Scale Clients (GPFS, NFS, CIFS, HTTP)
• Client-side network and storage-side network (Ethernet/InfiniBand)
• Spectrum Scale Servers and Storage: Building Block #1, Building Block #2, Building Block #3, Building Block #4
• Storage tiers: SSD, SAS, NL-SAS, TAPE
• Single Global Namespace (One File System), up to hundreds of PBs
• Multiple PBs in a single name space:
– G:\ (Windows)
– /mnt/data (Linux, UNIX)
How Spectrum Scale Helps with Data Management at Scale
• Commodity Hardware
• Non-disruptive data migration
• Flash Acceleration
• Network performance monitoring
• File Placement Optimization
• Native Encryption and Secure Erase
• Global Active File Management
• Advanced Mirroring and Caching Services
• Common Management
• Cloud Ready
• High speed scanning engine
• Policy based data migration
• Drop in replacement for Hadoop File System
[Diagram label: Other cloud]
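The idea behind policy based data migration can be sketched in a few lines. This is a conceptual illustration only, not Spectrum Scale's policy language or engine: the age thresholds, the `FileInfo` record and the file names are assumptions, while the tier names (SSD, SAS, NL-SAS, TAPE) and the /mnt/data mount point come from the deployment slide.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical rules: (maximum days since last access, tier), fastest tier first.
TIER_RULES: List[Tuple[int, str]] = [(7, "SSD"), (30, "SAS"), (365, "NL-SAS")]
ARCHIVE_TIER = "TAPE"  # anything older than the last rule goes to tape


@dataclass
class FileInfo:
    path: str
    days_since_access: int
    tier: str  # tier the file currently lives on


def target_tier(days_since_access: int) -> str:
    """Return the first tier whose age limit covers the file."""
    for max_age, tier in TIER_RULES:
        if days_since_access <= max_age:
            return tier
    return ARCHIVE_TIER


def plan_migrations(files: List[FileInfo]) -> List[Tuple[str, str, str]]:
    """List (path, current tier, target tier) for files that should move."""
    plan = []
    for f in files:
        wanted = target_tier(f.days_since_access)
        if wanted != f.tier:
            plan.append((f.path, f.tier, wanted))
    return plan


files = [
    FileInfo("/mnt/data/video/cam1.mp4", 2, "SSD"),       # hypothetical files
    FileInfo("/mnt/data/sensors/2014.csv", 400, "NL-SAS"),
    FileInfo("/mnt/data/images/scan.tif", 20, "SSD"),
]
plan = plan_migrations(files)
for path, src, dst in plan:
    print(f"{path}: {src} -> {dst}")
```

Because the files stay under one namespace, a move between tiers in this model changes only the `tier` field, not the path that applications use.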
Complementary Solutions: Aspera High Speed File Transfer
• Software solution
• Moves your data to/from Spectrum Scale at maximum speed, regardless of file size, transfer distance and network conditions
• FASP™ - highly efficient bulk data transport technology
“Aspera is the industry standard for the transport and management of large data files produced by life sciences.”
Sifei He, Cloud Product Director, BGI*
Complementary Solutions: Big Storage Active Archive
• What is it?
– Enterprise-class cloud-based storage archive service for clients who need to store large amounts of data (TBs) and easily retrieve it on demand
– Low cost, long term tape-based archive for massive amounts of data
• Who should use it?
– Long term archives from TB+ to PB+ size
• How to use it?
– Supports both File and Object storage via standard POSIX interface and standard Swift API (HTTP(S))
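The two access styles named above, file access through a POSIX interface and object access through the Swift API, can be contrasted in a short sketch. The storage URL, account, container, object name and token are all hypothetical placeholders, a temporary directory stands in for a mounted file system, and the Swift request is only built, never sent. The request shape (PUT to `<storage URL>/<container>/<object>` with an `X-Auth-Token` header) follows the OpenStack Swift object API; how the archive service itself exposes these endpoints is not described in the slides.

```python
import tempfile
import urllib.request
from pathlib import Path


def write_via_posix(mount_point: Path, name: str, data: bytes) -> Path:
    """File access: an ordinary write below a mounted directory."""
    target = mount_point / name
    target.write_bytes(data)
    return target


def build_swift_put(storage_url: str, container: str, obj: str,
                    token: str, data: bytes) -> urllib.request.Request:
    """Object access: build (but do not send) a Swift-style PUT request."""
    url = f"{storage_url.rstrip('/')}/{container}/{obj}"
    return urllib.request.Request(
        url, data=data, method="PUT", headers={"X-Auth-Token": token}
    )


payload = b"archive me"

# A temporary directory stands in for the mounted archive file system.
with tempfile.TemporaryDirectory() as tmp:
    path = write_via_posix(Path(tmp), "sample.bin", payload)
    posix_roundtrip_ok = path.read_bytes() == payload

# Placeholder endpoint and token, for illustration only.
req = build_swift_put(
    "https://archive.example.com/v1/AUTH_demo", "backups", "sample.bin",
    token="hypothetical-token", data=payload,
)
print(req.get_method(), req.full_url)
```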
Integrates with local and cloud storage
[Diagram labels: Other service providers; Private Cloud; Big Storage Active Archive; Spectrum Scale]
Success Story: Spectrum Scale in Telecommunications
• Major communications service provider (CSP) in China
• Challenge: CDR/XDR ingest & analytics
– Quick and reliable data loading
– Fast query/analysis
– Platform scalability for future growth
• Solution
– IBM Spectrum Scale, Platform Symphony (for MapReduce ("Hadoop") workload management) and IBM Linux on Power
• Results
– IBM's solution was faster than the competitor's while using 50% fewer servers
– Data loading: 400% faster
– Scalability and query execution speed: 300% more users and >200% faster