ADVANCED FILE SHARING AND MANAGEMENT
Dell NX4
Dell Inc.
Visit dell.com/NX4 for more information and additional resources.
Copyright © 2008 Dell Inc. THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES ONLY, AND MAY CONTAIN TYPOGRAPHICAL ERRORS AND TECHNICAL INACCURACIES. THE CONTENT IS PROVIDED AS IS, WITHOUT EXPRESS OR IMPLIED WARRANTIES OF ANY KIND.
TABLE OF CONTENTS
Executive Summary
NX4 Product Overview
Key Advantages
Deployment and Usage
No Compromise Availability
Advanced Features
Point‐in‐time logical copies of production systems
Replication
Data Protection
Data Deduplication
File Level Retention
Anti‐virus
Tiered Storage
Conclusion
EXECUTIVE SUMMARY
The Dell NX4 Network Attached Storage device offers flexible, enterprise‐class file storage for Windows® and Linux/UNIX environments, with a wealth of features that can save administrative time and lower costs in mixed‐protocol environments. Because file data is as vital to a business as application data, Dell NX4 features hot‐pluggable components and high availability with transparent failover – bringing to NAS the no‐compromise availability long demanded for application data. Dell NX4 also offers the flexibility to support stranded application servers via iSCSI or Fibre Channel.
Dell NX4’s snapshot capability provides point‐in‐time logical images of production file systems, which can be used to save time and keep production data safe and available in a variety of scenarios, including testing and backup. Optionally, Dell NX4 also provides asynchronous replication to local or remote systems, providing additional data protection capability and enabling solutions such as multi‐site data protection, decision support, and content distribution. Dell NX4 replication runs over standard IP LAN and WAN infrastructures, simplifying configuration and management. Dell NX4 replication also integrates with VMware's powerful Site Recovery Manager software to automate, simplify, and dramatically speed up disaster recovery with automatic failback.
Dell NX4 offers a number of different backup/restore methods to meet the needs of common backup environments, supporting NDMP, network backups of NFS or CIFS file systems, LAN‐less backups, and LAN‐less/server‐less backups. Administrators have the flexibility to pick the right method for their file systems based on file sizes, file system characteristics, and backup characteristics such as the speed and capacity required.
New with Dell NX4 is a data deduplication feature that combines file‐level deduplication and compression to intelligently reduce the amount of storage space required while minimizing any impact to performance for the user or application.
Organizations that must comply with industry or government regulations for file data preservation and integrity can benefit from Dell NX4's File Level Retention (WORM) capability, which lets them archive file data to WORM storage on standard, rewriteable magnetic disks using NFS or CIFS operations. Dell NX4 is available with an antivirus agent option, providing on‐demand antivirus support through tight integration with industry‐leading antivirus vendors.
For more advanced users, Dell NX4 also provides an open API that works with third‐party migration and policy software, enabling organizations to automatically move older, less‐active files to lower‐cost, second‐tier platforms including ATA and purpose‐built archiving solutions. Dell NX4 remains the primary interface for clients and applications, retrieving files automatically and transparently from secondary storage on demand.
NX4 PRODUCT OVERVIEW
Dell NX4 integrated storage leverages the Data Access in Real Time (DART) operating system for multi‐protocol network file and block access. DART allows concurrent use of NFS and CIFS protocols, with sophisticated locking and access‐control mechanisms. This enables seamless sharing of the same files by UNIX and Windows clients without compromising data integrity. Because NFS and CIFS are implemented in Dell NX4 as peer protocols, no performance‐sapping emulation techniques are required.
Key advantages
IT departments and users have become accustomed to high availability for application storage, but file data is equally critical to business. Dell NX4 delivers no‐compromise availability to NAS through features such as dynamic failover.
Snapshot capability is provided by Celerra® SnapSure™ software, which lets you create logical point‐in‐time copies of file systems for online backups and fast recovery of deleted files. Optional replication capability is provided by Celerra Replicator™ software, which creates read‐only or read/write copies of production file systems and iSCSI LUNs on a local or remote system, affording multi‐site disaster‐recovery protection.
Dell NX4's single management interface helps administrators manage file shares, quotas, and users efficiently. Dell NX4 comes with Celerra Manager, feature‐rich software that provides Web‐based management, system status, and monitoring.
Advanced functionality from industry leader EMC® places Dell NX4 far beyond simple file servers and helps solve a multitude of common IT problems. For example, with Dell NX4, you can:
• Mix SAS and SATA drives in the same enclosure and store information on the right technology based on the business value of the data
• Designate WORM files and file systems for compliance controls
• Deal with the proliferation of unstructured data by deduplicating files while minimizing the performance impact for the end user
• Migrate less frequently used files to Tier 2 storage to speed up backups and improve TCO of storage
Designed to fit easily into existing environments, Dell NX4 supports NAS, iSCSI, and Fibre Channel (FC) connectivity. File and block storage can be consolidated into a single Dell NX4 system, simplifying administration. Organizations can take advantage of this feature to bring stranded application servers under control and simplify administration with just one system to manage.
Deployment and usage
Dell NX4 is appropriate for organizations needing file storage that goes beyond simple file servers, for instance, where:
• Users can't share files efficiently due to a mixed‐protocol (CIFS and NFS) environment, with two management consoles and two file structures
• File data has become critical and must remain available
• Traditional file servers have been plagued by hardware failures
• Backup/restore solutions for file data are insufficient
• Data growth has led to a proliferation of duplicate files, wasting valuable disk space
• Organizations must comply with industry/government regulations for data retention
• Remote offices have no administrators on site, leading to inconsistent backups and maintenance
• Multiple appliances and management tools for CIFS and NFS make it difficult to manage file shares, quotas, and file types
• Scaling capacity often requires adding more file servers
• Storage is utilized inefficiently and backups are long and tedious
No-compromise availability
"No compromise" means that an organization continues running at the same performance and service levels, even in the event of a failure. Dell NX4's Primary/Standby architecture features automatic failover and eliminates any single point of failure from the network to the disk drive. As a member of the EMC Celerra family, Dell NX4 inherits the industry‐leading fault tolerance of EMC's CLARiiON® storage systems and the unique fault detection and isolation capabilities of Celerra's DART and CLARiiON's FLARE® operating environments.
Dell NX4's X‐Blades are hot pluggable and can be configured with a standby blade for transparent, high availability. Failover is transparent, with no performance degradation. In the event of an X‐Blade failover, DART uses a metadata logging facility to recover within seconds or minutes. Dell NX4 is designed to allow for hot‐swappable component replacements and automatic notification if the system detects a problem.
ADVANCED FEATURES
Point-in-time logical images of production file systems
Celerra SnapSure software lets you create logical point‐in‐time read‐only or read/write copies of file systems and iSCSI LUNs. You can use these snapshots for online backups and quick recovery of deleted files. End users can quickly and easily restore previous versions of a file without system administrator involvement. SnapSure saves disk space and time by allowing multiple snapshot versions of a file system or iSCSI LUN.
SnapSure is architected for excellent read performance. For file systems, it operates on a "copy‐on‐first‐write" principle: When a block within a production file system (PFS) is modified, SnapSure saves a copy containing the block's original contents to a separate volume called the SavVol. Subsequent changes made to the same block in the PFS are not copied into the SavVol. SnapSure reads the original blocks in the SavVol and the unchanged blocks remaining in the PFS to provide a complete point‐in‐time image called a checkpoint. [1]
A checkpoint reflects the state of a PFS at the point in time the checkpoint was created. SnapSure can create both read‐only and read/write (writeable) checkpoints. Every writeable checkpoint is created from a corresponding read‐only checkpoint, which serves as its baseline.
[1] Checkpoints are not mirrors; they may be unreadable if their associated PFS is inaccessible. You can use them for disaster recovery only when you save a PFS and its checkpoints to tape or another alternate storage system.
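The copy‐on‐first‐write behavior described above can be illustrated with a small model. This is a minimal sketch, not SnapSure's implementation: the class and method names (ProductionFS, Checkpoint, savvol, and so on) are hypothetical, blocks are modeled as list items, and each checkpoint keeps its own saved blocks for simplicity.

```python
class Checkpoint:
    """Point-in-time view of a PFS (hypothetical model, not SnapSure code)."""

    def __init__(self, pfs):
        self.pfs = pfs
        self.savvol = {}  # block index -> original contents at checkpoint time

    def preserve(self, index, original):
        # Only the first change to a block is saved; later changes to the
        # same block leave the saved original untouched.
        self.savvol.setdefault(index, original)

    def read(self, index):
        # Changed blocks come from the SavVol, unchanged ones from the PFS.
        if index in self.savvol:
            return self.savvol[index]
        return self.pfs.blocks[index]

    def image(self):
        return [self.read(i) for i in range(len(self.pfs.blocks))]


class ProductionFS:
    """Toy block store standing in for a production file system (PFS)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.checkpoints = []

    def create_checkpoint(self):
        checkpoint = Checkpoint(self)
        self.checkpoints.append(checkpoint)
        return checkpoint

    def write(self, index, data):
        # Save the original block before it is overwritten.
        for checkpoint in self.checkpoints:
            checkpoint.preserve(index, self.blocks[index])
        self.blocks[index] = data


pfs = ProductionFS(["a", "b", "c"])
ckpt = pfs.create_checkpoint()
pfs.write(1, "B1")   # first write: original "b" is copied to the SavVol
pfs.write(1, "B2")   # second write: nothing more is copied
print(ckpt.image())  # ['a', 'b', 'c']
print(pfs.blocks)    # ['a', 'B2', 'c']
```

Because the checkpoint reads unchanged blocks from the PFS itself, the model also shows why a checkpoint is not a mirror: without the PFS, most of the image is unavailable.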
Checkpoints can serve as a direct data source for applications that require point‐in‐time data; for instance, simulation testing, data warehouse population, and automated backup engines performing backup to tape or disk. You can also use a checkpoint to restore a PFS or part of a file system, such as a file or directory, to the state in which it existed when the checkpoint was created. A production file system can be manipulated in the same way as any Dell NX4 file system while a checkpoint of it exists.
Writeable checkpoints can also serve as a source for applications that require incremental, temporary changes; for instance, a checkpoint of a database hosted on the Dell NX4 can be subjected to data processing and warehousing without committing the change to the production database. Another example: An administrator could first apply a patch on the database's writeable checkpoint and test it before applying the patch on the production database.
When you restore a PFS to the point in time at which a checkpoint was created, SnapSure first creates a new checkpoint of the PFS, enabling you to reverse the restore. Restoring a PFS is a simple command‐line procedure.
SnapSure supports online access to checkpoints, eliminating the need for administrator involvement when a client needs to list, view, or copy a point‐in‐time file or directory in a read‐only checkpoint or use it to restore information back to a point in time. SnapSure supports Microsoft® Windows® Shadow Copy for Shared Folders (SCSF), which provides Windows Server® 2003 and Windows XP clients direct access to point‐in‐time versions of their files in checkpoints created with SnapSure. SCSF provides a Previous Versions tab listing all folders available in the checkpoint shadow‐copy directory.
Replication
EMC Celerra Replicator™ is powerful, easy‐to‐use, asynchronous replication software, optional for Dell NX4. Celerra Replicator creates read‐only and read/write copies of production file systems or iSCSI LUNs on either a local or geographically remote Dell NX4 system. Celerra Replicator provides multi‐site protection and simplifies administration with easy‐to‐define business policies, such as recovery point objectives (RPOs), and uses standard, IP‐based networks for maintaining consistent replicas between sites. Celerra Replicator's adaptive scheduler determines the size and frequency of updates needed to meet a given RPO, taking into account available bandwidth, data load, and concurrency of data transfers.
Technology overview
Celerra Replicator provides efficient, snapshot‐based, asynchronous data replication over IP networks. With Celerra Replicator, you can create local or remote copies of file systems, virtual data movers, or iSCSI LUNs.
• File system replication creates an asynchronous copy of a source file system at a destination and periodically updates this copy, making it consistent with the source file system.
• Replicating a virtual data mover (VDM) enables you to produce a copy of a Windows CIFS production environment and ensures that the necessary context is replicated to the remote site along with the file systems. This includes CIFS server data, audit logs, and local groups.
• iSCSI LUN replication is used to produce an asynchronous, consistent copy of a source iSCSI LUN at a destination.
Celerra Replicator supports array‐based, crash‐consistent replication, as well as application‐consistent iSCSI replication via EMC Replication Manager.
The destination for replication can be the local Dell NX4 (same X‐blade or different X‐blade in the system) or a remote Dell NX4. Any session with a remote destination requires configuration of a trusted connection between the two systems, and any session, local or remote, requires a configured connection between the source and destination X‐blade (called an "interconnect"). When you create an interconnect, you list the IP addresses that are available to sessions by using the interconnect and, optionally, a bandwidth schedule that lets you control the bandwidth available to sessions during certain time periods.
Creating a replication session automatically creates two internal checkpoints on the source and destination. Using these checkpoints, the replication session copies the changes found in the source object to the destination object. Each replication session to a designated destination produces a point‐in‐time copy of the source object, and the transfer is performed asynchronously.
With Celerra Replicator, both the production data and the replica are accessible at all times. After initial synchronization, it uses differential snapshots to transfer only changes. For each replication session, you can specify an update policy for the maximum amount of time the source and destination can be out of synchronization before an update is automatically performed. You can also update the destination at will by issuing a refresh request for the session. To ensure required RPOs are met, you can set a maximum "out‐of‐sync" time for each replication, which will be driven by the Celerra adaptive RPO algorithms. As the storage environment scales, Celerra Replicator maintains RPOs according to policy without an administrator having to manually manage the effect of individual replication sessions upon each other. You can create one‐to‐many replication sessions that copy the same source object to multiple destinations, and you can set up a cascading replication configuration, in which a destination site can serve as a source for another replication session to yet another destination, up to two network hops.
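The adaptive RPO algorithms themselves are not described in this paper. As a rough illustration of how a maximum out‐of‐sync time, the amount of changed data, and the available bandwidth interact, the following minimal sketch computes the latest moment at which each session's transfer can start and still complete within policy. All names (ReplicationSession, latest_start, sessions_due) and the single shared bandwidth figure are assumptions made for the example, not Celerra interfaces.

```python
from dataclasses import dataclass


@dataclass
class ReplicationSession:
    """Hypothetical record of one replication session's policy and state."""
    name: str
    max_out_of_sync_s: float  # policy: longest allowed out-of-sync time
    last_sync_s: float        # time the destination last matched the source
    changed_bytes: int        # data changed on the source since last_sync_s


def latest_start(session, bandwidth_bytes_per_s):
    """Latest time a transfer can start and still finish within policy."""
    transfer_s = session.changed_bytes / bandwidth_bytes_per_s
    deadline_s = session.last_sync_s + session.max_out_of_sync_s
    return deadline_s - transfer_s


def sessions_due(sessions, now_s, bandwidth_bytes_per_s):
    """Sessions whose transfer must start by now, most urgent first."""
    due = [s for s in sessions
           if latest_start(s, bandwidth_bytes_per_s) <= now_s]
    return sorted(due, key=lambda s: latest_start(s, bandwidth_bytes_per_s))


sessions = [
    ReplicationSession("remote-site", 3600, 0, 500_000_000),
    ReplicationSession("nearby-site", 600, 0, 100_000_000),
]
bandwidth = 1_000_000  # bytes per second, assumed for the example

print([s.name for s in sessions_due(sessions, 500, bandwidth)])
# ['nearby-site']
print([s.name for s in sessions_due(sessions, 3200, bandwidth)])
# ['nearby-site', 'remote-site']
```

A real scheduler would also have to account for concurrent transfers sharing the link and for bandwidth schedules on the interconnect, which this sketch ignores.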
Benefits and usage models
Cascading replication is ideal for tiered replication, where the nearby disaster recovery site requires an RPO of minutes and the remote disaster recovery site requires an RPO of hours. This deployment maintains disaster recovery compliance if any site is lost and can recover from the loss of two sites.
Celerra Replicator integrates with Celerra SnapSure. Whenever there has been a change to the environment, it's simple to generate a writeable checkpoint at the remote location, bring up the new environment, and validate the disaster‐recovery process. Complete testing methodologies involving full failover and disaster‐recovery process testing can easily be accommodated by forcing a failover.
Celerra Replicator runs over standard IP LAN and WAN infrastructures. This simplifies configuration and management of replication and lets organizations deploy remote replication with only IP network skills, which can result in significant savings of staff time and budget.
Celerra Replicator is an appropriate solution for:
• Disaster recovery – It can maintain a duplicate copy of production data at a remote site, where it can be brought online with little downtime in the case of a disaster. Cascading replication allows for multi‐site protection.
• Content distribution – One‐to‐many replication can be used to push content to remote sites; for example, when new engineering or software builds need to be distributed to multiple locations.
• Backup – Performing backups with a copy of the production data eliminates the need to take the production applications offline. The backup can occur locally or at the remote location.
• Decision support – File systems and iSCSI LUNs can be replicated to make a copy of a database to be used for data mining and decision support without affecting production applications.
• Software testing – Before upgrading software, a duplicate copy of the data can be made and the upgrade tested before impacting production data. Writeable checkpoints allow the software to be tested with a modifiable copy of the production data.
• Data center migrations – Celerra Replicator can be used to relocate to a new data center by copying the data to the new system and forcing a failover. This allows the migration to take place without any data loss.
Integration with VMware® Site Recovery Manager
Celerra Replicator in an iSCSI environment integrates with VMware's Site Recovery Manager (SRM) software, a package designed to work with replication tools to automate the setting up and testing of disaster recovery plans and the execution of failovers. Where Celerra Replicator automates the meeting of RPOs, SRM provides the complementary function of minimizing recovery time, enabling you to set tighter recovery time objectives (RTOs).
Implementing SRM requires a primary site where the production environment being protected resides and a secondary recovery site for failover. Your Dell NX4 system must have DART 5.6.36 or later, Celerra Replicator V2, SnapSure software, the iSCSI protocol, and sufficient storage to host the replications and snapshots of the iSCSI LUNs hosting the VMFS data stores. You will also need an appropriate VMware solution, including a Celerra Storage Replication Adapter (SRA), which is available as a download from VMware's website.
Traditional recoveries in a virtualized infrastructure are lengthy, labor‐intensive manual processes requiring specialized training; they run the risk of being outdated and are susceptible to human error. Automating recovery with Celerra Replicator and SRM can accelerate recovery. Testing disaster recovery plans in advance – frequently and thoroughly – helps ensure reliable recoveries. Additionally, Celerra Replicator and SRM can be used to migrate data.
Data Protection
Dell NX4 offers a number of options for backup and restore:
Network backups
Network backups entail simply mounting the NFS or CIFS file systems across the network and backing up to the backup server. This method is supported by most backup software.
NDMP backups
Network Data Management Protocol (NDMP) backups use the LAN only for control information. In these "LAN‐less" backups, data is transferred to the local backup device. NDMP‐based backups are used for high‐capacity backups and in environments where multiprotocol support is required. Supported NDMP backup products include EMC Networker, Symantec™ Veritas™ NetBackup™, CommVault® Galaxy®, HP OpenView, Atempo Time Navigator™, and IBM® Tivoli® Storage Manager software.
SAN backups
SAN backups use the LAN for control information and do not involve the server (X‐blade) in the data path from the SAN‐storage backup server – they are "LAN‐less" and "server‐less." Backup applications with Celerra MPFS support this type of high‐speed backup. This method is supported by most backup software.
NDMP Volume Based Backup (NVB)
NDMP Volume Based Backup (NVB) is an NDMP block‐based backup option that offers significant performance advantages for file systems with large numbers of small (<16MB) files. Using NVB in these environments can shrink backup windows by up to 30%.
Data deduplication
Dell NX4 was designed to help organizations deal with the proliferation of unstructured data. It combines file‐level deduplication and compression to intelligently reduce storage space usage. A built‐in policy engine works in the background, transparently monitoring file activity and file attributes to intelligently identify candidate files. Files that meet certain criteria, such as low access frequency, are compressed and single‐instanced.
The system automatically filters out from processing those files for which deduplication would result in an undesired performance impact or minimal storage savings. The policy for what files to filter can also be defined manually. An administrator can define filters based on last access time, last modified time, or minimum/maximum file size. Administrators can also define filters based on file extensions.
Dell NX4 performs all deduplication processing as a background, asynchronous operation that acts on file data after it has been written into the file system. To avoid introducing latency into the client data path, it does not process data as it is written. Only one file system at a time is scanned, and if the CPU load exceeds a user‐defined threshold (75% by default), the process will throttle its activity to a minimal level until the CPU load has decreased to less than a low‐activity threshold (25% by default). This means that the deduplication process mostly consumes CPU cycles that would otherwise be idle. The system reports completion time of the last successful scan, number of files deduplicated, original data size, and space saved.
Dell NX4's unique combination of deduplication and compression is designed to provide the maximum storage savings with the lowest resource usage. By eliminating redundant data from file systems without affecting the end‐user experience, Dell NX4 data deduplication helps organizations reduce the amount of storage they need.
You can enable Dell NX4's deduplication process on a file‐system‐by‐file‐system basis with a single click. When deduplication is applied to typical file system data, you can expect storage efficiency savings in the range of 30 to 40 percent, but we have seen savings as high as 50 percent.
Dell NX4's deduplication is best for home directories, file shares, or for general-purpose file system archiving.
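The two‐threshold throttling and the attribute‐based filters described in this section can be sketched as follows. This is an illustrative model only: the 75% and 25% values are the defaults stated above, but the class, function, and parameter names, and every default value in the filter, are assumptions made for the example.

```python
from dataclasses import dataclass


class DedupThrottle:
    """Two-threshold (hysteresis) throttle for a background scan."""

    def __init__(self, high=75.0, low=25.0):
        self.high = high        # throttle when CPU load rises above this
        self.low = low          # resume when CPU load falls below this
        self.throttled = False

    def update(self, cpu_percent):
        if cpu_percent > self.high:
            self.throttled = True
        elif cpu_percent < self.low:
            self.throttled = False
        # Between the two thresholds the previous state is kept.
        return self.throttled


@dataclass
class FileInfo:
    name: str
    size_bytes: int
    days_since_access: int
    days_since_modify: int


def is_candidate(f, min_access_days=30, min_modify_days=30,
                 min_size=1024, max_size=10 * 1024**3,
                 excluded_extensions=(".zip", ".jpg")):
    """Apply hypothetical age, size, and extension filters to one file."""
    if f.days_since_access < min_access_days:
        return False
    if f.days_since_modify < min_modify_days:
        return False
    if not (min_size <= f.size_bytes <= max_size):
        return False
    return not f.name.lower().endswith(tuple(excluded_extensions))


throttle = DedupThrottle()
print([throttle.update(load) for load in (10, 80, 50, 30, 20, 50)])
# [False, True, True, True, False, False]
print(is_candidate(FileInfo("report.doc", 50_000, 90, 120)))  # True
print(is_candidate(FileInfo("photo.JPG", 50_000, 90, 120)))   # False
```

The gap between the two thresholds keeps the scan from rapidly switching on and off when the CPU load hovers near a single limit.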
File-Level Retention (FLR)
Celerra File‐Level Retention (FLR) software allows you to protect files from modification or deletion until a specific retention date. This enables administrators, using NFS or CIFS operations, to meet Write Once Read Many (WORM) requirements by creating a permanent, unalterable set of files and directories. Dell NX4 offers this feature as an SEC Rule 17a‐4(f)‐compliant option and as a non‐compliant option.
Technology overview
You can enable FLR WORM technology on a specified file system only at creation time. When a new file system is created as FLR WORM, it is persistently marked as an FLR WORM file system. After a file system is created, an administrator can apply WORM protection on a per‐file basis. The administrator manages files in the WORM state by setting retention periods that, until expiration, prevent the files from being deleted or modified. WORM files can be grouped by directory or batch process. This lets the administrator manage the file archives on a file‐system basis or run a script to locate and delete EXPIRED files.
A file in an FLR‐enabled file system is in one of four possible states: NOT LOCKED, LOCKED (WORM), APPEND, or EXPIRED. The transition between these states is based on the file's last access time and read‐only permission. When a file is created, it is in the NOT LOCKED state. A NOT LOCKED file is treated in the same way as a file in a non‐WORM file system: It can be renamed, modified, or deleted. When you change the permissions on a NOT LOCKED file from read/write to read‐only, the file transitions from NOT LOCKED to LOCKED (WORM) state and cannot be modified or deleted by NAS clients or users.
LOCKED (WORM) files can be deleted only after their retention period expires. You specify the retention period before you commit a file to LOCKED (WORM) and, with the compliant option, if you do not set a retention period, the system defaults to an infinite retention period. You set the file's retention period by modifying the file's last access time – via NFS or CIFS – to an expiration date and time that occurs in the future.
Files in an APPEND state cannot be deleted, modified, or renamed, but new data can be added. Once an APPEND is complete, files can be returned to the LOCKED (WORM) state. When a file transitions to EXPIRED state, it can be deleted by its owner or by the administrator. FLR does not automatically delete EXPIRED files.
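The state transitions described above can be modeled in a few lines. This is a simplified sketch under stated assumptions, not the FLR implementation: the FlrFile class and its methods are hypothetical, times are plain numbers (for example, days), the retain_until field stands in for the file's last access time, the infinite default reflects the compliant option, and the APPEND state is omitted because its entry conditions are not described here.

```python
import math
from enum import Enum


class State(Enum):
    NOT_LOCKED = "NOT LOCKED"
    LOCKED = "LOCKED (WORM)"
    EXPIRED = "EXPIRED"


class FlrFile:
    """Hypothetical model of one file in an FLR-enabled file system."""

    def __init__(self):
        self.read_only = False
        self.retain_until = None  # stands in for the last access time

    def set_retention(self, until):
        # Once a file is committed, its retention can only be extended.
        if self.read_only and until < self.retain_until:
            raise PermissionError("retention period cannot be shortened")
        self.retain_until = until

    def commit(self):
        # Changing read/write to read-only commits the file to WORM.
        if self.retain_until is None:
            self.retain_until = math.inf  # compliant-option default
        self.read_only = True

    def state(self, now):
        if not self.read_only:
            return State.NOT_LOCKED
        return State.LOCKED if now < self.retain_until else State.EXPIRED

    def can_delete(self, now):
        return self.state(now) is not State.LOCKED

    def can_modify(self, now):
        return self.state(now) is State.NOT_LOCKED


record = FlrFile()
record.set_retention(730)       # retain for two years, counted in days
record.commit()
print(record.state(100))        # State.LOCKED
print(record.can_delete(100))   # False
print(record.state(800))        # State.EXPIRED
record.set_retention(1000)      # extending moves EXPIRED back to LOCKED
print(record.state(800))        # State.LOCKED
```

Deletion stays an explicit action in the model: an EXPIRED file remains in place until its owner or an administrator removes it.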
You cannot shorten a retention period after it has been set, but you can revert a file from EXPIRED back to LOCKED (WORM) state by extending its retention period to a date beyond the original retention date.
Benefits and usage models
Organizations that must comply with industry or government regulations for data retention and protection can benefit from Celerra FLR. In particular, the Food and Drug Administration, in Title 21 Part 820.180, requires medical device manufacturers to store records for possible inspection by FDA personnel. These records are required to be retained for a period of not less than two years from the date of release for commercial distribution by the manufacturer. Celerra FLR allows the manufacturer to "lock" the files in a WORM file system that prevents modification or deletion of the records for a period of two years or greater. In addition, Celerra FLR provides an audit trail that logs user activity on files in that file system.
Antivirus protection
Dell NX4 also provides as an option Celerra Event Enabler (CEE), a framework that contains EMC Celerra AntiVirus Agent (CAVA) and Celerra Event Publishing Agent (CEPA). CAVA provides an antivirus solution to clients using a Celerra Network Server. It uses the industry‐standard CIFS protocol in a Microsoft® Windows Server® 2003 or Windows® 2000 domain. CAVA uses third‐party antivirus software such as Symantec, McAfee®, Computer Associates, Trend Micro™, and Sophos® to identify and eliminate known viruses before they infect files on the storage system.
When a file is written and saved (scan on update) or first read (scan on read), Dell NX4 places a block on that file until virus checking has been performed. It immediately issues a remote procedure call (RPC) to a virus‐checking engine. On receipt of the request, an access is initiated from a filter driver, and the virus‐checking server performs a standard check on the file. If a virus is detected, the user and the administrator are sent a customizable pop‐up message.
The scan‐on‐read functionality is triggered when a file that was last scanned before a set access time is opened for read. This access time is typically set when a new virus‐definition file is loaded to rescan old files (once) that may contain undetected viruses. You may also wish, under certain circumstances, to run antivirus in scan‐on‐read mode – for instance, after a restore of data that may be infected with a latent virus, or following migration from a general‐purpose NT server onto a Dell NX4 system.
Standard virus checkers request only a small amount of data (signatures of a KB each) to establish the presence of a virus, so the overhead is relatively small, allowing the implementation to be deployed via the normal user network. (The exception is with compressed files, which must be shipped across the network in their entirety for virus checking.) CAVA offloads from Dell NX4 the need to consume CPU cycles for scanning files.
Centralized scanning of all files stored on Dell NX4 is essential to avoid storing a virus centrally if a client computer is lacking the latest virus definitions or if the environment cannot prevent computers lacking adequate protection from connecting to a NAS share. Customers subscribe to the standard supported antivirus vendors for updates to their virus definition files and CAVA manages which files need to be rescanned based on the update to these virus definition files. You can scale the solution by adding virus‐checking servers as required. Your server vendors should be able to provide you with an understanding of how many dedicated servers you would need. You can also use different server types (e.g., McAfee, Symantec, Trend Micro) concurrently, as per their original antivirus implementation.
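The scan‐on‐read rule described above, where a file is rescanned once if it was last scanned before a cutoff time that is typically set when new virus definitions are loaded, can be sketched as follows. This is a minimal illustration rather than CAVA's implementation: the ScanRecord class, the function names, and the scan callback standing in for the virus‐checking server are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ScanRecord:
    """Hypothetical per-file record of the last completed virus scan."""
    last_scanned: Optional[float] = None  # None if the file was never scanned


def needs_scan_on_read(record, rescan_cutoff):
    """True if the file must be scanned before a read is allowed."""
    return record.last_scanned is None or record.last_scanned < rescan_cutoff


def open_for_read(record, rescan_cutoff, scan: Callable[[], bool], now):
    """Hold the read until any required scan has completed.

    `scan` stands in for the request to a virus-checking server and
    returns True when the file is clean.
    """
    if needs_scan_on_read(record, rescan_cutoff):
        if not scan():
            raise PermissionError("virus detected; access denied")
        record.last_scanned = now
    return True


record = ScanRecord(last_scanned=100.0)
cutoff = 200.0  # e.g., the time the latest virus definitions were loaded
calls = []


def fake_scan():
    calls.append("scan")
    return True


open_for_read(record, cutoff, fake_scan, now=250.0)  # triggers a rescan
open_for_read(record, cutoff, fake_scan, now=260.0)  # no rescan needed
print(len(calls))  # 1
```

Recording the scan time is what makes the rescan happen only once per definition update.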
Tiered Storage Access
Dell NX4 features an open API for tiered storage access, the Celerra FileMover API. Information stored on the Dell NX4 can be migrated to secondary storage as well as to purpose‐built archiving solutions with complete transparency to users and third‐party applications. Celerra FileMover API requires a third‐party policy engine to specify migration policies.
After the migration, a file's metadata remains on the Dell NX4, stored in an offline inode (stub file). With retention coordination, the policy engine will be able to migrate a file to Dell NX4 with a retention period and replace the file with a stub having the same retention period. The stub will have all the characteristics of file‐level retention, even on a regular non‐file‐level retention file system.
To a client or application, the migrated files appear as though they haven't moved. Your Dell NX4 remains the primary interface to clients and applications. When the client or application goes to access the files, Dell NX4 will automatically retrieve the file from secondary storage. This action can also migrate the file back if it is in accordance with the set policy. Retrieval operations are completely transparent to the end user – no need to retrain users or modify applications.
Celerra FileMover enables a choice of secondary storage based on business, application, and information requirements. You can move older, less frequently used files automatically to second‐tier platforms, including ATA, purpose‐built archiving solutions such as EMC Centera®, tape, and optical. Storing your business‐critical data on your highly available Dell NX4 and infrequently used data on secondary storage platforms simplifies data management. Operational tasks like backup are shorter because inactive files no longer have to be backed up in every cycle. By tiering storage and placing less‐active files on lower‐cost secondary storage, you keep cost of storage in check.
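The stub‐and‐retrieve behavior described above can be illustrated with a toy model. This sketch does not use the Celerra FileMover API; the SecondaryStore and PrimaryFileSystem classes, their methods, and the migrate_back_on_read flag (standing in for the policy that decides whether a retrieved file moves back) are assumptions made for the example.

```python
class SecondaryStore:
    """Hypothetical second-tier store (e.g., ATA or an archive platform)."""

    def __init__(self):
        self.objects = {}

    def put(self, key, data):
        self.objects[key] = data

    def get(self, key):
        return self.objects[key]


class PrimaryFileSystem:
    """Toy primary file system that keeps a stub for each migrated file."""

    def __init__(self, secondary, migrate_back_on_read=False):
        self.secondary = secondary
        self.migrate_back_on_read = migrate_back_on_read
        self.online = {}  # path -> file data held on primary storage
        self.stubs = {}   # path -> key of the data on secondary storage

    def write(self, path, data):
        self.online[path] = data
        self.stubs.pop(path, None)

    def migrate(self, path):
        # Move the data to secondary storage and leave only a stub behind.
        data = self.online.pop(path)
        self.secondary.put(path, data)
        self.stubs[path] = path

    def is_offline(self, path):
        return path in self.stubs

    def read(self, path):
        # Clients use the same path whether or not the file was migrated.
        if path in self.online:
            return self.online[path]
        data = self.secondary.get(self.stubs[path])
        if self.migrate_back_on_read:
            self.write(path, data)
        return data


fs = PrimaryFileSystem(SecondaryStore())
fs.write("/projects/old-report.doc", b"quarterly figures")
fs.migrate("/projects/old-report.doc")
print(fs.is_offline("/projects/old-report.doc"))  # True
print(fs.read("/projects/old-report.doc"))        # b'quarterly figures'
```

In the model, as in the description above, the reader never needs to know where the data lives; only the primary system consults the stub.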
Celerra FileMover API enables automated file system archiving using Rainfinity® and other third‐party applications, including Arkivio®, Symantec, StoredIQ, and Enigma software. It works in a heterogeneous environment with a choice of secondary storage platforms as the destination, including low‐cost tape libraries, optical, high‐performance ATA drives, and purpose‐built compliant archive solutions.
Celerra FileMover delivers benefits in three main areas:
• Reduced costs in backup hardware, software, tapes, and time
• Increased operational efficiencies with faster backups, consistent protection, and improved storage utilization
• Better service levels through improved performance and availability and quicker restores