Overcoming Big Data Challenges
Overview
Driven by mobility, the Internet of Things (IoT), social networking, and video and cloud services, data is growing at exponential rates ‒ toward 40 zettabytes (1 ZB = 10²¹ bytes) by 2020. This astronomical expansion creates major challenges for today's enterprises, which must deal with the Big Data resulting from this unprecedented growth.
Emerging Challenges
• Data Capacity ‒ Simply consider the amount of retail data collected every hour by online giant Amazon®. YouTube™ reports that 100 hours of video are uploaded to its site every hour. Imagine the volume of data stored from daily stock trading ‒ 1 TB of trade information during each trading session. Increases in video resolution from standard MPEG to HD, 4K, and now 8K/3D formats also dramatically increase the size of stored content and the required effective data rates. Today's inefficient WAN infrastructure does not scale, in capacity or cost, to support this unprecedented data growth over any real distance.
• Data Reach ‒ Data has become unwieldy and dispersed across geo-diverse locations. Applications such as IoT networks, utility grids, and oil and gas operations rely on geo-diverse sensors (at remote and sometimes inaccessible locations) that send data to a central processing facility on a regular basis. This sensor data must arrive at the central processing center, which manages and controls the utility, within very tight timeframes.
• Data Movement ‒ Large volumes of data bog down today's enterprise. Staff sit idle waiting for data for hours, perhaps even days or weeks. For data-intensive applications such as life sciences simulation, hundreds of TBs of data must be replicated at geo-diverse sites to enable remote collaboration. This task can take days, costing the enterprise both money and time.
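The cost of moving data at this scale is easy to quantify. The sketch below is a back-of-envelope calculation, not a figure from the text; the dataset size, link speed, and efficiency values are illustrative assumptions:

```python
# Back-of-envelope: time to replicate a dataset over a WAN link.
# All numbers below are illustrative assumptions.

def transfer_days(dataset_tb, link_gbps, efficiency):
    """Days to move dataset_tb terabytes over a link_gbps link
    running at the given fractional link efficiency."""
    bits = dataset_tb * 1e12 * 8                     # dataset size in bits
    seconds = bits / (link_gbps * 1e9 * efficiency)  # effective transfer time
    return seconds / 86400

# 200 TB over a 10 Gb/s WAN at 25% TCP efficiency:
print(f"{transfer_days(200, 10, 0.25):.1f} days")   # ≈ 7.4 days
# The same link driven at near wire rate (98%):
print(f"{transfer_days(200, 10, 0.98):.1f} days")   # ≈ 1.9 days
```

Even on a fully provisioned 10 Gb/s link, poor protocol efficiency turns a two-day replication into a week-long one.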
• Data Access ‒ As geo-diverse enterprises and cloud service providers serve distributed customer bases, datacenters with mass storage need to be located in more economical, remote locations. Hence, latency-sensitive data is stored at longer distances and must be accessed from the locations where applications are running. Also, sovereign data needs to stay in place, given movement restrictions imposed for political, legal, and security reasons. This inherently remote data needs to be accessed as if it were local to the end application.
• Data Timeliness ‒ Some businesses need information to truly be available on the fly. Market data delayed by tens of milliseconds may not be usable for trading purposes, and delays in closing transactions may result in substantial losses to financial firms as well as their customers. Timeliness of data is critical to improved decision making for such applications.
To address these emerging challenges, enterprise datacenter architectures require new machinery ‒ structure that exists today in High-Performance Computing (HPC) environments, affording low-latency interconnects, lossless protocols, parallel file systems, and CPU offload. Extending HPC-like structures beyond national laboratories and research institutes into the mainstream will empower enterprises of all sizes as they seek to analyze Big Data to enhance internal operations. Getting the maximum performance from data-intensive applications requires fast and scalable compute, storage, software, and networks. Simply put, today's enterprises are dealing with HPC workloads and require storage infrastructure that scales endlessly and delivers unmatched I/O capacity and reach.
Enter Global Storage Fabric
New storage fabric technology makes it possible to combine and extend core HPC and datacenter attributes to enterprises, creating a solutions-oriented approach to the challenges of Big Data on a global scale. By extending the storage fabric beyond the four walls of the datacenter, a Global Storage Fabric (see Figure 1) is enabled that allows you to:
• Access all of your data from anywhere on the globe as though it were local
• Synchronize all datacenter storage locations within minutes
• Establish location-independent datacenters where power and real estate are inexpensive
• Optimize production and analytics workflows without regard to locality
• Move TBs of data at speed-of-light rates within minutes (versus hours or days) and without physically moving storage
• Achieve daily Business Continuity needs independent of capacity and reach
• Achieve real-time, in-place Big Data analytics processing
• Distribute real-time, multi-channel, high-definition video without boundaries
• Deploy Virtual Private Fabrics (VPFs) to enable new cloud models
Figure 1. Global Storage Fabric - National Diagram
Key Aspects of a Global Storage Fabric
Global Storage Fabric allows data to be truly location-independent and provides a scalable architecture to handle both data-at-rest and data-in-motion. Geo-diversity and data volume need to be addressed for both types of data. But with data-in-motion, latency becomes a critical requirement, as the data needs to be processed in real time for it to be meaningful. The key technologies required to support this scalable storage architecture are parallel file systems, CPU offload technologies, and native, layer-two fabric extension.
CPU offload technologies and parallel file systems are native to HPC environments but not to traditional enterprise NAS or SAN environments. Hence, to extend the performance benefits of CPU offload and parallel file systems to non-HPC environments, it is critical to support conventional file system connections so that existing end-user applications can run seamlessly on this new Global Storage Fabric. Conventional distributed file system interfaces such as Network File System (NFS) or the SMB/Common Internet File System (CIFS) interface for Microsoft® Windows® are required as much as the HPC components to deliver unprecedented performance to all applications, including new enterprise and cloud applications.
Parallel File System
A critical piece of this scalable storage fabric is a parallel file system, and it needs to be tuned to handle the global reach of Big Data. A number of these advanced file systems are available today, including Lustre®, IBM® General Parallel File System (GPFS™), Ceph, and Gluster. Each can be deployed per its respective merits in different applications. A parallel file system generally sits on top of scalable object storage, and its advantages are:
• wide scalability, in terms of both performance and storage capacity
• a global namespace
• the ability to distribute very large files across many nodes
Because large files are shared across many nodes in a typical cluster environment, a parallel file system is ideal for high-end HPC cluster I/O systems ‒ and now for most enterprise environments.
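The distribution of large files across nodes can be sketched as Lustre-style round-robin striping: a file is cut into fixed-size stripes laid out across storage targets, so many nodes can read and write it concurrently. The stripe size and target count below are illustrative assumptions, not values from the text:

```python
# Minimal sketch of parallel-file-system striping (Lustre-style):
# stripes are assigned round-robin across storage targets.

STRIPE_SIZE = 1 << 20   # 1 MiB stripes (assumed)
NUM_TARGETS = 4         # number of storage targets, e.g. Lustre OSTs (assumed)

def locate(offset):
    """Map a byte offset in the file to (target index, offset within target)."""
    stripe = offset // STRIPE_SIZE
    target = stripe % NUM_TARGETS
    local = (stripe // NUM_TARGETS) * STRIPE_SIZE + offset % STRIPE_SIZE
    return target, local

# Byte 0 lands on target 0; the 5th stripe wraps back to target 0:
print(locate(0))                 # (0, 0)
print(locate(4 * STRIPE_SIZE))   # (0, 1048576)
```

Because consecutive stripes live on different targets, a sequential read of one large file becomes parallel I/O across all four targets, which is the source of the scalability listed above.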
CPU Offload Technologies
Augmenting current implementations with CPU offload technologies dramatically reduces access latency across network and storage resources ‒ from milliseconds toward nanoseconds ‒ and delivers a measurable throughput increase. A primary benefit of Remote Direct Memory Access (RDMA) is a dramatic increase in IOPS: from roughly 180 for a typical hard drive accessed over TCP/IP to over 20,000 with RDMA.
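At queue depth 1, IOPS is simply the reciprocal of per-operation latency, which is why shaving latency translates directly into the IOPS gains quoted above. The latency figures below are illustrative assumptions chosen to mirror those orders of magnitude, not measurements from the text:

```python
# Illustrative relation between per-operation latency and IOPS at
# queue depth 1: IOPS = 1 / latency. Latency values are assumed.

def iops(latency_seconds):
    """I/O operations per second at queue depth 1."""
    return 1.0 / latency_seconds

print(f"{iops(5.5e-3):,.0f} IOPS")  # ~5.5 ms per op (seek + TCP/IP stack) -> ~182 IOPS
print(f"{iops(50e-6):,.0f} IOPS")   # ~50 us per op (RDMA path, assumed)   -> 20,000 IOPS
```

The hundredfold IOPS gap comes entirely from the latency gap; no extra bandwidth is involved.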
Global Fabric Extender
To take full advantage of reduced compute overhead and faster access to data, datacenters need to employ high-speed fabric technologies. Within the datacenter, many options are available to interconnect storage systems natively ‒ at latencies low enough to make them efficient.
To maximize I/O performance and minimize latency, most high-end storage solutions utilize switched, lossless fabrics such as InfiniBand or Lossless Ethernet. These fabrics work well within the confines of the datacenter. However, as a result of today's Big Data challenges, there is a need to extend these fabrics outside the datacenter and around the globe. A typical WAN connection using TCP/IP over Ethernet achieves poor bandwidth utilization. For global distances (≥ 100 ms link latency), TCP/IP achieves only 25% efficiency even when multiple flows are used to pack the WAN link [1]. TCP/IP performance also deteriorates rapidly at shorter distances in environments with high packet loss [2].
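The root cause is the bandwidth-delay product: a single TCP flow can never exceed its window divided by the round-trip time, so keeping a long fat pipe full requires an enormous amount of data in flight. The link speed and window size below are illustrative assumptions:

```python
# Why stock TCP struggles at global distances: single-flow throughput
# is capped at window / RTT. Numbers are illustrative assumptions.

def bdp_bytes(link_gbps, rtt_s):
    """Bandwidth-delay product: bytes in flight needed to fill the link."""
    return link_gbps * 1e9 / 8 * rtt_s

def max_throughput_gbps(window_bytes, rtt_s):
    """Single-flow throughput ceiling: window / RTT."""
    return window_bytes * 8 / rtt_s / 1e9

rtt = 0.100  # 100 ms round trip (global distance)
print(f"BDP: {bdp_bytes(10, rtt)/1e6:.0f} MB")            # 125 MB in flight for 10 Gb/s
print(f"{max_throughput_gbps(4 * 2**20, rtt):.2f} Gb/s")  # 4 MiB window -> ~0.34 Gb/s
```

With a commonly configured 4 MiB window, a single flow fills barely 3% of a 10 Gb/s link at 100 ms RTT; packet loss shrinks the effective window further, which is why efficiency collapses on lossy paths.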
As a result, the performance of the interconnect between storage repositories is diminished by its least efficient element (i.e., the WAN connection). Applications managing data synchronization, business continuity, and other key data functions are heavily reliant on compute resources, demanding significant capacity at each location and further increasing application costs. The result is a cost-prohibitive solution for real-time or near-real-time data access or movement, even at limited distances.
Addressing this problem requires extending the lossless, deterministic fabric that exists within the datacenter across global distances without diminishing performance. A deterministic fabric provides consistent, repeatable performance (in terms of elapsed file transfer time or end-to-end throughput) across global distances, with different data sets and under varying traffic conditions.
Figure 3 shows how Global Storage Fabric is enabled with Fabric Extension.
Today's advanced Global Fabric Extension technology can support layer-2, concurrent, lossless, multi-fabric extension over a single connection, at distances starting from 100 km and exceeding 20,000 km for 10G clients ‒ or even 50,000 km for smaller 1G clients ‒ across cities, states, countries, and around the world. Global Fabric Extension assures ultra-low latency and bandwidth utilization approaching 98 percent.
Summary
A Global Storage Fabric is enabled by the combination of Global Fabric Extenders and parallel file systems, resulting in a robust, distributed global file system architecture. Bay Microsystems' Global Fabric Extenders allow optimized storage performance to be offered beyond the four walls of the datacenter. By extending CPU offload via Remote Direct Memory Access (RDMA) over lossless, congestion-free WAN tunnels, Fabric Extenders dramatically reduce the impact of the bandwidth-delay product on performance over distance. RDMA not only achieves sustained wire-rate bandwidth, but also frees compute resources for computing rather than processing network protocols. Native support for all commonly used HPC fabrics, applications, and ecosystems enables a flexible and scalable infrastructure that resolves commercial Big Data challenges while preserving the way the enterprise accesses its data.
For more information about Bay Microsystems' family of Global Fabric Extenders, please visit www.baymicrosystems.com.
Contacts
For additional information or sales inquiries, please contact: sales@baymicrosystems.com
Some features listed in the specifications may be under development.
[1] Source: https://www.openfabrics.org/images/docs/2013_Dev_Workshop/Tues_0423/2013_Workshop_Tues_1600_LindenMercerOA2013_NRL.pdf