Delphix Agile Data Platform
Delphix Agile Data Platform Overview Revision: 1 July 2014
You can find the most up-to-date technical documentation at:
http://www.delphix.com/support
The Delphix Web site also provides the latest product updates.
If you have comments about this documentation, submit your feedback to:
help@delphix.com
© 2012 Delphix Corp. All rights reserved.
The Delphix logo and design are registered trademarks of Delphix Corp. in the United States and/or other jurisdictions.
All other marks and names mentioned herein may be trademarks of their respective companies.
Delphix Corp.
275 Middlefield Road, Suite 50 Menlo Park, CA 94025
Table of Contents
Executive Summary
Delphix Agile Data Platform Overview
Tier 1: DxFS (Delphix File System) - Data Virtualization Layer
Tier 2: DataVisor - Data Orchestration Layer
Tier 3: Self-Service Management Layer
Summary
Executive Summary
From financials and order management to eCommerce and customer support, nearly every business process in the modern enterprise is powered by application software. Maximizing the agility and availability of application environments while minimizing associated costs has emerged as a top priority for CIOs across the globe. This goal, however, remains elusive, and organizations continue to struggle with improving or even maintaining data management and data protection SLAs.
Application lifecycles introduce a constant flow of projects tied to new deployments, upgrades, customizations, expansions, etc. Every project requires frequent movement of data across production and pre-production environments. For IT teams, this translates into regular data management tasks such as provisioning, refresh, integration, and recovery requests. Unfortunately, with coordination and approvals, each request balloons into a multi-team, multi-week exercise. This cripples data management SLAs, forces application project teams to work with stale data sets, and puts project quality and timelines at risk.
Organizations also face growing business downtime risk despite considerable investments in data protection. Simply put, the scale of modern businesses makes even edge case failures like database corruption a tangible risk that must be mitigated. Downtime costs for major enterprise applications range from $10,000 to over $70,000 per minute. At these levels, even a day or two of recovery time from disk or tape can add up to tens of millions of dollars in downtime costs. Traditional backup and recovery or other data protection technologies do not allow organizations to deliver data protection SLAs that reflect what business lines actually need.
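The downtime arithmetic above can be checked with a short calculation. This is only a back-of-the-envelope sketch: it uses the per-minute range quoted in the text and assumes a flat rate for the whole outage, which real incidents may not follow. The function name is illustrative.

```python
# Back-of-the-envelope downtime cost, using only the per-minute range quoted above.
MINUTES_PER_DAY = 24 * 60  # 1,440 minutes

def downtime_cost(cost_per_minute: int, days: float) -> float:
    """Total cost of an outage lasting `days`, assuming a flat per-minute rate."""
    return cost_per_minute * MINUTES_PER_DAY * days

for rate in (10_000, 70_000):
    for days in (1, 2):
        total = downtime_cost(rate, days)
        print(f"${rate:,}/min for {days} day(s): ${total:,.0f}")
```

At the low end of the range, a single day of downtime already comes to $14.4 million.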
Data management and data protection SLAs are further challenged by cost inefficiencies that stem from the long tail of redundant data in application lifecycles. For critical applications, organizations often create up to 10 copies of each production application environment. While these copies are essential for a robust application lifecycle, they also impose a large and growing infrastructure cost penalty. Continued double-digit growth of application data and the added redundancy of data backups only compound infrastructure costs.
Delphix addresses these widespread challenges through patent-pending data virtualization software. Enterprises of all sizes, across the globe and across industries, are leveraging Delphix to realize:
• Greater agility: superior data management SLAs drive 2-5X greater project output.
• Lower risk: superior data protection enables 98% faster recovery.
• Reduced costs: 20X consolidation and reduction in infrastructure costs.
Delphix Agile Data Platform Overview
Much like a hypervisor abstracts compute resources to create virtual machines, the Delphix Agile Data Platform abstracts data from hardware and storage to create virtual app instances (VAIs), virtual file systems (VFSs) and virtual databases (VDBs). The three tiers of the Delphix technology stack, defined below, enable these benefits.
• DxFS: the Delphix file system is a purpose-built file system optimized to store and manage application data. This tier is responsible for storage and performance optimization.
• DataVisor: responsible for all data orchestration tasks including synchronization, synthesis and recording of changes, data movement across copies, and replication.
• Self-service management: wraps the functionality of DxFS and DataVisor with policy-driven automation and interfaces that enable integration into any organization’s processes.
The capabilities of the Delphix Agile Data software stack are supported across a large number of data sources and delivered as a software virtual appliance referred to as the Delphix Engine. The virtual appliance form factor of the Delphix Engine maximizes deployment flexibility, scale, availability and cloud readiness.
Deployment flexibility: The Delphix Engine works with any hardware and storage. This ensures organizations can leverage existing infrastructure, avoids delays tied to procurement cycles, and fundamentally reduces vendor lock-in at all tiers.
On-demand scale and availability: The Delphix Engine can be deployed in minutes in any location to address a specific project need. If the scope of a project expands, Delphix can be scaled horizontally or vertically. For a broader rollout, Delphix fully supports hypervisor templates (for example, on VMware vSphere). Delphix also natively supports bandwidth-efficient replication to a secondary Delphix appliance for high availability and disaster recovery.
Cloud readiness: The deployment flexibility, scale, and availability benefits of the Delphix Engine are also fundamental requirements for private or public cloud adoption. Delphix therefore makes it easy to deliver “Data as a Service” in the cloud. Without Delphix, other tiers that have already been virtualized are cloud-ready, but the data tier continues to anchor and bottleneck cloud adoption.
[Figure: the Delphix Agile Data Platform stack. DxFS tier: block mapping, filtering, compression, caching. DataVisor tier: sync, record, play, move, replicate. Management tier: policies, automation, web UI, web services, CLI. Data source labels: Standard, Enterprise, Cluster, Exadata, Data Guard, RAC, BI/DWH; versions 9, 10, 11.]
Tier 1: DxFS (Delphix File System) - Data Virtualization Layer
DxFS, the Delphix file system, is a purpose-built file system optimized to efficiently store and manage application data. DxFS includes mature data mapping and file-system snapshot technology to efficiently manage and present storage. Specifically, DxFS uses several block-aware techniques that collectively eliminate over 90% of the storage required by full physical copies while also optimizing performance of the shared storage managed by Delphix.
DxFS develops a map of unique data blocks across production and the virtual copies created by Delphix. DxFS identifies database block boundaries and compresses databases along those database block boundaries, which is critical to maintaining performance of virtual databases. Additionally, Delphix filters incoming data streams—eliminating temporary, empty, or scratch blocks—driving even more data reduction.
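The combination of block mapping, block-aligned compression, and filtering described above can be illustrated with a small toy model. This is a sketch of the general technique only, not the DxFS implementation: the `BlockStore` class, the 8 KB block size, and the use of SHA-256 and zlib are all assumptions made for illustration.

```python
import hashlib
import zlib

BLOCK_SIZE = 8192  # assumed database block size, for illustration only

class BlockStore:
    """Toy content-addressed store: each unique block is kept once, compressed."""

    def __init__(self):
        self.blocks = {}  # digest -> compressed block bytes
        self.copies = {}  # copy name -> list of (digest or None, block length)

    def ingest(self, name: str, data: bytes) -> None:
        block_map = []
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            if not block.strip(b"\x00"):
                # Filtering: an all-zero block is recorded in the map but never stored.
                block_map.append((None, len(block)))
                continue
            digest = hashlib.sha256(block).hexdigest()
            if digest not in self.blocks:
                # Compress each block on its own, so block boundaries are preserved.
                self.blocks[digest] = zlib.compress(block)
            block_map.append((digest, len(block)))
        self.copies[name] = block_map

    def read(self, name: str) -> bytes:
        out = []
        for digest, length in self.copies[name]:
            if digest is None:
                out.append(b"\x00" * length)
            else:
                out.append(zlib.decompress(self.blocks[digest]))
        return b"".join(out)

store = BlockStore()
prod = b"A" * BLOCK_SIZE + b"\x00" * BLOCK_SIZE + b"B" * BLOCK_SIZE
dev = b"A" * BLOCK_SIZE + b"\x00" * BLOCK_SIZE + b"C" * BLOCK_SIZE
store.ingest("production", prod)
store.ingest("dev_copy", dev)
print(len(store.blocks), "unique blocks stored for 6 logical blocks")
```

In this toy example, the two copies share one block and both contain an empty block, so only three blocks are physically stored while each copy still reads back in full.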
Storage arrays lack such sophisticated application awareness and generally cannot refresh data on top of an existing volume without requiring redundant storage at the point of refresh. With Delphix, the DxFS and DataVisor tiers ensure Delphix can stay synchronized and enable data refreshes, while maintaining storage efficiency.
Businesses architect applications for performance, so virtualization of application environments must address performance concerns. DxFS acts as a caching tier to augment the I/O performance of the storage subsystem assigned to Delphix. For many workloads, the Delphix cache services over 75% of all data requests.
Solid-state disks (SSDs) can be added to a Delphix virtual appliance to further boost the shared performance cache. Not only does Delphix caching technology service shared read requests, it also logs data to quickly commit writes and minimize inefficient disk spindle movements, while preserving data consistency in failure cases. Along with on-the-fly compression and decompression, this reduces spindle movements on the underlying storage, maximizing performance for concurrent, consolidated workloads. Performance levels can be further fine-tuned and managed through intuitive quality of service settings.
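A shared read cache of this kind can be sketched as follows. The source does not say which eviction policy Delphix uses, so the least-recently-used (LRU) policy here is an assumption, as are the `ReadCache` class and the dictionary standing in for backing storage.

```python
from collections import OrderedDict

class ReadCache:
    """Toy shared read cache with LRU eviction (an assumed policy)."""

    def __init__(self, backing: dict, capacity: int):
        self.backing = backing      # stands in for the slower storage subsystem
        self.capacity = capacity    # maximum number of cached blocks
        self.cache = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, block_id):
        if block_id in self.cache:
            self.cache.move_to_end(block_id)  # mark as most recently used
            self.hits += 1
            return self.cache[block_id]
        self.misses += 1
        data = self.backing[block_id]
        self.cache[block_id] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)    # evict the least recently used block
        return data

    def hit_ratio(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

backing = {i: f"block-{i}" for i in range(100)}
cache = ReadCache(backing, capacity=10)
# Three virtual copies read the same ten shared blocks.
for _copy in range(3):
    for block_id in range(10):
        cache.read(block_id)
print(f"hits={cache.hits} misses={cache.misses}")
```

Because the virtual copies share underlying blocks, only the first copy's reads go to storage in this example; later reads of the same blocks are served from the cache.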
FEATURE / BENEFIT
• BLOCK MAPPING: Patented, flat metadata design scales to unlimited virtual copies. Benefit: 10-50x storage consolidation; add parallel environments at no cost.
• COMPRESSION: Block-aware compression adds 2-4x data reduction. Benefit: 2-4x reduction across virtual copies and backups.
• FILTERING: Intelligent filtering eliminates temporary or empty blocks. Benefit: block awareness drives 10-20% greater efficiency.
Tier 2: DataVisor - Data Orchestration Layer
The biggest bottleneck to application development agility comes from process and operational overhead; for most organizations, data refresh, provisioning, and recovery tasks can take days to weeks of coordinated effort across multiple IT teams with competing priorities and mandates. This cripples application project lifecycles. The DataVisor tier overcomes these challenges by orchestrating the flow of data into DxFS as well as the flow of data out of DxFS to virtual app instances (VAIs), virtual file systems (VFSs) and virtual databases (VDBs).
DataVisor ensures automated, continuous and near real time synchronization across multiple, heterogeneous databases through integrated log shipping. While access to the latest data will benefit any project, fresh data is paramount in some environments (for example: databases that support analysis of transactions from the last 24 hours).
DataVisor also enables a highly efficient synchronization process. Most organizations rely on scripted data dumps or backups to create a copy of production application environments. This process consumes I/O resources at multiple points: as the data is read from storage, as it is processed by CPUs on the DB server, and as it is written to the target storage, often a local data dump. If the data needs to be moved to another location, more resources are consumed as the data is read from storage, copied across the network, and written to the destination.
DataVisor dramatically reduces this load by following a database-aware, incremental-forever synchronization process that only requests changed data blocks. DataVisor also eliminates the need for DBAs or application teams to create and manage associated custom scripts. By eliminating load from multiple points along a datacenter architecture, Delphix improves performance for both production and non-production systems.
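The idea of requesting only changed blocks can be sketched in a few lines. This is a simplified illustration, not DataVisor's actual protocol: it assumes each source block carries a monotonically increasing change number, and the function names and data layout are hypothetical.

```python
def changed_blocks(source: dict, last_synced: int) -> dict:
    """Return only the blocks whose change number is newer than the last sync."""
    return {
        block_id: (version, data)
        for block_id, (version, data) in source.items()
        if version > last_synced
    }

def sync(source: dict, target: dict, last_synced: int):
    """Copy changed blocks to the target; return the new high-water mark and count."""
    delta = changed_blocks(source, last_synced)
    target.update(delta)
    new_mark = max((version for version, _ in delta.values()), default=last_synced)
    return new_mark, len(delta)

# Each block maps to (change number, contents); both are illustrative values.
source = {0: (1, b"a"), 1: (1, b"b"), 2: (1, b"c")}
target = {}

mark, first_count = sync(source, target, last_synced=0)       # initial load
source[1] = (2, b"B")                                         # one block changes
mark, second_count = sync(source, target, last_synced=mark)   # incremental pass
print(first_count, second_count)
```

After the initial load, the second pass transfers a single block rather than the full data set, which is where the reduction in read, CPU, and network load comes from.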
FEATURE / BENEFIT
• SYNC: Efficiently sync heterogeneous sources in near real time. Benefit: deliver the right data to the right team at the right time.
• RECORD: Synthesize and record all changes into a continuous TimeFlow. Benefit: superior Recovery Point Objective (RPO).
• PLAY: Fast provisioning, refresh, rollback, data integration. Benefit: reduce time from 10 days to 10 minutes, from 4 teams to 1 team.
• MOVE: Promote, demote, consolidate, and recover virtual application copies. Benefit: quickly move data through application development lifecycle stages.
• REPLICATE: Efficient replication to a secondary Delphix Engine. Benefit: high availability, disaster recovery, backup.
As part of its synchronization process, DataVisor also synthesizes and records all changes. This information is used to build a time window or TimeFlow of changes that can be used to provision or refresh virtual copies from any point in time (down to the second or transaction ID). TimeFlow is entirely policy driven so organizations can choose to retain changes for a few weeks or several months per source and target virtual environment.
TimeFlow uses incremental-forever synchronization to process change blocks into immediately available points in time. With the added benefit of block-aware compression and filtering, organizations can store 50 days of continuous recovery points in the space of 1 full copy. Data protection SLAs improve not just in terms of RPO but also RTO because Delphix VAIs, VFSs and VDBs can be made available in minutes.
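Provisioning from an arbitrary point in time can be pictured as picking the latest snapshot at or before the requested time and replaying recorded changes up to that time. The sketch below illustrates that general approach under simplifying assumptions; the `TimeFlow` class here is a hypothetical stand-in, not the product's data structure, and times are plain integers.

```python
import bisect

class TimeFlow:
    """Toy timeline: periodic snapshots plus a log of block changes."""

    def __init__(self):
        self.snapshot_times = []  # kept sorted
        self.snapshots = {}       # time -> {block_id: data}
        self.changes = []         # (time, block_id, data), assumed recorded in time order

    def add_snapshot(self, time: int, state: dict) -> None:
        bisect.insort(self.snapshot_times, time)
        self.snapshots[time] = dict(state)

    def record(self, time: int, block_id: str, data: str) -> None:
        self.changes.append((time, block_id, data))

    def provision(self, time: int) -> dict:
        """Rebuild the block state as of `time`."""
        index = bisect.bisect_right(self.snapshot_times, time)
        if index == 0:
            raise ValueError("no snapshot at or before the requested time")
        base = self.snapshot_times[index - 1]
        state = dict(self.snapshots[base])
        for change_time, block_id, data in self.changes:
            if base < change_time <= time:
                state[block_id] = data  # replay changes after the snapshot
        return state

flow = TimeFlow()
flow.add_snapshot(100, {"b1": "v0", "b2": "v0"})
flow.record(110, "b1", "v1")
flow.record(120, "b2", "v2")
print(flow.provision(115))
```

Requesting time 115 in this example yields the snapshot state with only the first change applied, while requesting time 120 would include both.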
Delphix virtual copies are fully functional copies but have clear advantages in terms of efficiency, agility and availability. Data management tasks (provisioning, refresh, rollback, recovery, and roll forward) can be executed in minutes and via self-service interfaces. As a result, data management SLAs will improve significantly and in turn drive application project quality.
For composite applications (e.g., SAP) and certain projects (reporting, business intelligence, master data management) data management tasks have to be synchronized across sources. In practice, data integration requests are often challenging because different business lines own and manage the data sources. Even if all owners provide concurrent access, extracting the same time slice of data or refreshing to a new point in time across sources is challenging. Normal refreshes can take days, and federated refreshes only add complexity. Invariably, wait times for data access and technical challenges with synchronizing data extracts simply reduce consolidated data quality and eat into test time which in turn increases the risk of project failures.
Delphix can connect to multiple sources and enable integrated delivery on-demand and in a self-service model. Independent access to production data (databases, config files, binaries) decouples reliance on production teams, which are often busy with other priorities. The elimination of organizational dependencies, process overhead, and hours of labor around these data integration projects can accelerate overall project schedules by returning days of wait time, increasing time for testing, and improving data quality, thereby increasing project success rates.
[Figure: integrated provisioning, integrated refresh, and integrated rollback of virtual copies across multiple sources, with example points in time Aug 5, 7:28:29; Aug 6, 3:17:51; and Aug 1, 11:50:33.]
As applications grow and change in production, downstream copies have to be refreshed in development and then moved to testing, QA, staging, and back to production. Normally, promoting or demoting environments along the development lifecycle involves days to weeks of coordinated effort. Delphix virtual copies can be easily provisioned and refreshed with a few clicks and in minutes. DataVisor supports provisioning 2nd generation virtual instances to promote or demote an environment from one stage of development to another.
DataVisor also provides for conversion of virtual data back to a physical state (virtual-to-physical, or V2P) for final promotion to UAT or production. With the flexibility to move data across systems, across sites, and to different users, Delphix ensures that every stage of development has access to the right data at the right time.
The V2P capability is also very useful in the event of a production failure and downtime. In such a case, production loads can be moved over to an existing or new virtual Delphix environment. After troubleshooting is completed on other firefighting virtual copies, the V2P process can be initiated to recover a working physical production copy. Interim production workloads can then be replayed on the new physical instance, minimizing downtime to users.
Replicate Locally or Remotely
Delphix supports replication from one Delphix Engine to another, either locally or across the WAN, with minimal bandwidth requirements. If a Delphix Engine fails, new virtual copies can be easily re-provisioned from the replica Delphix Engine.
Delphix also supports an active-active multicast deployment scenario. In this case, production sources are configured to actively synchronize with multiple Delphix Engines. Organizations following agile development practices may want to provide virtual copies for each individual developer. In such cases, an active-active multicast deployment makes sense because it supports scaling out to a large number of application copies (beyond what a single Delphix Engine could support) while also providing distributed protection and availability for the Delphix deployment.
[Figure: virtual copies across the Test, Dev, QA, UAT, and Prod stages with V2P, and local or remote replication between Delphix Engines.]
Tier 3: Self-Service Management Layer
DxFS and DataVisor are wrapped by a self-service management layer, which enables enterprise-wide usage and adherence to each individual organization’s internal standards.
• Comprehensive policy framework: organizations can define users and groups and specify which copies they can access, whether they can refresh and provision those copies, and whether or not they can connect to new sources. Similarly, retention policies can be defined to ensure that fiscal period ending states of virtual copies tied to financial systems are retained for extended periods to address regulatory requirements.
• Automation of data management tasks: data refreshes can be automatically scheduled and executed so that business analysts automatically have access to the latest data at the beginning of each week, month, or other configurable period. In environments where data security is a concern, users can automatically mask sensitive data, such as Social Security Numbers, through integrated support for pre- and post-scripting, obfuscating private information before VDBs become accessible to users. Delphix also automates the complex, fault-prone re-parameterization that is required as part of provisioning a new copy. These settings can be stored as templates and re-applied when provisioning future VAIs, VFSs and VDBs.
• Interfaces to integrate Delphix into organizational processes: Delphix provides an intuitive web-based interface specifically designed to extend self-service directly into the hands of application or infrastructure teams. All the functionality of DxFS and DataVisor is also accessible through a CLI and web services API so that data management can be integrated into any organization’s interfaces and processes (service portals, authentication mechanisms, ticketing systems, etc.).
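Two of the capabilities in the list above, group-based permissions and masking of sensitive data in a post-provisioning script, can be sketched generically. Nothing here reflects the actual Delphix policy model or scripting hooks: the `POLICIES` table, group names, action names, and both functions are hypothetical, and the masking rule (keep the last four digits) is just one possible choice.

```python
import re

# Hypothetical policy table: group -> actions that group may perform.
POLICIES = {
    "developers": {"refresh"},
    "dbas": {"refresh", "provision", "add_source"},
}

def is_allowed(group: str, action: str) -> bool:
    """Check whether a group may perform an action; unknown groups get nothing."""
    return action in POLICIES.get(group, set())

# Matches Social Security Numbers written as 123-45-6789.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-(\d{4})\b")

def mask_ssn(text: str) -> str:
    """Mask the first five digits of each SSN, keeping the last four."""
    return SSN_PATTERN.sub(r"XXX-XX-\1", text)

print(is_allowed("developers", "provision"))
print(mask_ssn("SSN 123-45-6789 on file"))
```

A masking step like this would run before a virtual copy is handed to users, so the unmasked values never reach the non-production environment.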
FEATURE / BENEFIT
• POLICY ENGINE: Granular, role-based control over user, group rights management. Benefit: easily align with enterprise policies.
• AUTOMATION ENGINE: Example: init.ora file settings, scheduled refreshes. Benefit: time, labor savings and independent data access.
• SELF SERVICE INTERFACES: Web GUI, CLI, web services API. Benefit: enterprise integration (branded service portals, ticketing systems, etc.).
• AUDITABILITY, SECURITY: Comprehensive logging and reporting, preservation of source security.
Summary
Data virtualization is the next logical step in the evolution of the datacenter. In many environments, it may be the single largest opportunity to transform and improve data management and data protection SLAs. Today, many projects, such as an end-of-quarter change to a CRM application to drive sales efficiency, never become reality due to high capital costs and operational overhead. Software infrastructure such as server virtualization and Delphix, however, facilitates innovation, allowing businesses to capture potentially lost opportunity value. With Delphix, enterprises can spend less and move faster than the competition.
Delphix slashes complexity and the time to deliver data management tasks by over 95%, freeing IT personnel to focus on higher priorities and projects with higher returns. Superior data management SLAs and the elasticity of parallelized development without added cost have collectively multiplied application project output by as much as 5X in real customer deployments. Delphix also transforms data protection SLAs by providing superior RPO and RTO over existing data protection solutions. This can wipe out the cost of multiple ineffective point solutions for data protection. Finally, by combining consolidation and data reduction technologies, Delphix cuts application data infrastructure costs by over 90%.
The Delphix Engine can be deployed in under an hour on standard hardware or on cloud infrastructure, providing coverage across current and future architectures. With Delphix, IT teams can quickly and easily manage a private cloud for databases, enabling self-service provisioning and refresh for developers and analysts. As a result, Delphix helps organizations deliver critical application projects on time with improved quality at a lower cost.
Specifications subject to change without notice. All information in this data sheet is strictly confidential. Please do not copy or distribute to other parties.