White Paper
EMC Solutions Group
Abstract
This white paper presents a solution that explores the scalability and performance for shared application workloads using EMC® Fully Automated Storage Tiering for Virtual Pools (FAST VP) and highlights the ease of management with Microsoft System Center Virtual Machine Manager (SCVMM 2008 R2).
March 2012
EMC AUTOMATED PERFORMANCE
OPTIMIZATION for MICROSOFT APPLICATIONS
EMC VMAX, FAST VP, and Microsoft Hyper-V
• Automated performance optimization
• Cloud-ready infrastructure
Copyright © 2012 EMC Corporation. All Rights Reserved.
EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.
The information in this publication is provided “as is.” EMC Corporation makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose.
Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.
For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com.
All trademarks used herein are the property of their respective owners. Part Number H8763.1
Table of contents
Executive summary
Business case
Solution overview
Introduction
Purpose
Scope
Audience
Terminology
Technology overview
EMC Symmetrix VMAX
EMC Symmetrix Management Console
Microsoft Hyper-V
Microsoft SCVMM
Benefits
EMC Replication Manager
EMC FAST VP
Consolidated applications storage design on EMC VMAX
Overview
Application environment highlights
General design guidelines
FAST VP
FAST VP theory
FAST VP storage design for applications
FAST VP storage sizing for Microsoft applications
FAST VP building block for Exchange 2010
Phase 1 – Collect user requirements
Phase 2 – Design the storage architecture based on user requirements
IOPS calculation example
Disk space calculations
Capacity calculation example
Phase 3 – Validate design
Microsoft SQL Server
Phase 1 – Collect user requirements
Phase 2 – Design the storage architecture based on user requirements
IOPS calculation
Microsoft SharePoint
Phase 1 – Collect user requirements
Phase 2 – Design the storage architecture based on user requirements
IOPS calculation
Disk space calculation
Final best configuration
FAST VP management tools
FAST VP policy
Applications building block design on Symmetrix VMAX
Microsoft Exchange
Exchange Server 2010 database and DAG design
Hyper-V and virtual machine design
Microsoft SQL Server
SQL Server configuration overview
SQL Server test application
SQL Server database design overview
Hyper-V and virtual machine design
Microsoft SharePoint
SharePoint 2010 farm design considerations
SharePoint farm design overview
Server role overview
SharePoint WFE server
Database server
SharePoint crawl server
SharePoint query server
SharePoint database design
Hyper-V and virtual machine design
Microsoft SQL Server and SharePoint Hyper-V design
Virtual Provisioning and FAST VP design on the Symmetrix VMAX
Space reclaim utilities
Microsoft SCVMM
Backup and restore of Microsoft applications using Replication Manager
Overview
Replication Manager design
Replication Manager theory and best practices
Exchange
SQL Server
SharePoint
Replication Manager design layout
Microsoft SQL Server and SharePoint
Rapid restore using Replication Manager
Microsoft SQL Server
Rapid restore using Replication Manager
Microsoft SharePoint
Rapid Restore using Replication Manager
Performance testing and validation results
Notes
Methodology and tools
LoadGen
TPC-E
VSTS
Data collection points
Microsoft Exchange
Microsoft SQL Server
Microsoft SharePoint
Microsoft Exchange test results with FAST VP
Validation results with Jetstress
Environment validation with LoadGen
Test 1: End-to-end validation with FAST VP for Exchange under normal conditions
Objectives
Configuration
Performance results and analysis
Test 2: End-to-end validation with FAST VP for Exchange in a failover condition
Objectives
Configuration
Performance results and analysis
Exchange client results
Consolidated applications test results with FAST VP
Environment validation for all Microsoft applications
Objectives
Configuration
Microsoft Exchange
Hyper-V and virtual machines performance results
Exchange client results
Microsoft SQL Server
Performance results and analysis
SharePoint
Test methodology design and data preparation
SharePoint test results
Metalogix StoragePoint test summary
Metalogix StoragePoint BLOB externalization overview
BLOB externalization results
RBS externalization best practices
Impact of TimeFinder/snapshot
Storage and performance
Snapshot sizing
Hyper-V HA migration
Conclusion
Summary
Findings
Appendix
SQL Server restore using Replication Manager
SharePoint Restore using Replication Manager
References
Executive summary
This white paper outlines the design guidance and architecture of a solution for a mixed Microsoft application workload on EMC® Symmetrix VMAX™ utilizing Microsoft Hyper-V as the hypervisor. The demonstrated applications are Microsoft Exchange Server 2010, SharePoint 2010, SQL Server 2008 R2, and System Center Virtual Machine Manager (SCVMM 2008 R2), to provide for management of the Hyper-V virtual machines.
Business case
When deploying multiple applications on a storage array, today’s customers need to take performance, total cost of ownership, and ease of management into consideration, and architect a solution that best meets all their needs. To achieve this, many customers have deployed completely separate physical infrastructures. This can lead to greater complexity and higher operational, management, and capital costs. For today’s customers, it is a constant, critical business challenge to maintain or improve the performance of a company's mission-critical applications while reducing costs and simplifying the administration of the environment. One way to reduce costs and eliminate bottlenecks is through manual application layouts on static storage tiers and Virtual Provisioning, but the complexity of managing multiple applications no longer justifies that approach.
This solution provides a simplified architecture to host the different business applications of a company, ensuring that each business line’s information is separated from that of the others. It greatly simplifies the overall environment and reduces operational and management costs. By leveraging EMC software such as EMC Symmetrix Fully Automated Storage Tiering for Virtual Pools (FAST VP) and Virtual Provisioning™, together with Microsoft SCVMM, the solution provides automated storage tiering and enables quick provisioning of virtual machines and rapid scaling of additional workloads.
This solution simulates a scenario demonstrating three distinct applications, each with separate requirements for user profile and size, where the physical infrastructure is shared. For example, this architecture could be used for a large enterprise customer whose IT department is looking to provide ease of management, growth, and automated performance tuning, using storage tiering for the different business units within the company.
There is a growing need among customers to be able to run multiple workloads and applications on shared infrastructure and meet expected performance levels at lower costs, as dictated by business service level agreements (SLAs). This solution showcases a comprehensive design methodology to run a consolidated workload across the Symmetrix VMAX platform, and demonstrates a private cloud solution for customers who are looking for enterprise consolidation with VMAX. The environment includes Exchange Server 2010, SharePoint 2010, and SQL Server 2008 R2. The architecture uses SCVMM to manage the Hyper-V environment.
In an enterprise environment, a customer must allow for requirements where multiple applications are hosted on the same hardware. This solution, on the EMC VMAX array, is ideal because it enables virtual machines to be automatically allocated in a balanced way, using Hyper-V's performance and resource optimization (PRO) load-balancing feature, while continuing to meet or exceed the performance levels set by the business SLA by leveraging EMC’s FAST VP automated tiering technology. Microsoft SCVMM allows each business unit to provision virtual machines rapidly on demand to scale additional workloads.
Introduction
Purpose
This white paper presents a solution that explores the scalability and performance of shared application workloads using FAST VP and highlights the ease of management with Microsoft SCVMM. It utilizes the VMAX platform to host and protect the consolidated Microsoft workload, which appeals to the enterprise customer market.
Scope
The scope of this paper is to document:
• The deployment of consolidated Microsoft applications on the VMAX array
• Performance and test results: automated storage performance using FAST VP storage tiering technology
• Protection of the consolidated environment using EMC Replication Manager
• Application protection using database availability groups (DAGs) for Microsoft Exchange
• The impact of hardware failure on applications
Audience
The intended audience for the white paper is:
• Customers
• EMC partners
• Internal EMC personnel
Terminology
This paper includes the following terminology:
Table 1. Terminology
Term | Definition
EMC VMAX | The array offering by EMC to provide high-end storage for the virtual data center. Its innovative EMC Symmetrix Virtual Matrix Architecture™ seamlessly scales performance, capacity, and connectivity on demand to meet all application requirements.
EMC Replication Manager | The EMC Replication Manager product provides simplified management of storage replication and integration with critical business applications to take disk-based copies that serve as a foundation for recovery operations.
DAG | A database availability group (DAG) is the base component of the high availability (HA) and site resilience framework built into Microsoft Exchange Server 2010. A DAG is a group of up to 16 mailbox servers that hosts a set of databases and provides automatic database-level recovery from failures that affect individual servers or databases.
SCVMM | Microsoft System Center Virtual Machine Manager 2008 R2 (SCVMM) with Service Pack 1 (SP1) helps enable the centralized management of the physical and virtual IT infrastructure, increased server utilization, and dynamic resource optimization across multiple virtualization platforms. It includes end-to-end capabilities such as planning, deploying, managing, and optimizing the virtual infrastructure.
FAST VP | Fully Automated Storage Tiering for Virtual Pools supports automated storage tiering at the sub-LUN level. VP refers to virtual pools, which are Virtual Provisioning thin pools.
VLUN | The Virtual LUN technology enables migration of data between storage tiers within the same array, without any production disruption.
Technology overview
This solution provides a comprehensive design methodology aimed at helping customers to create a scalable building block for their business units, including Microsoft Exchange Server 2010, SharePoint Server 2010, and SQL Server 2008 R2. The testing simulates a consolidated multi-application scenario by demonstrating three separate environments—Messaging (Exchange), Online Transaction Processing (OLTP) Database (SQL), and enterprise-class collaboration (SharePoint)—each with separate requirements for user profile and size, where the physical infrastructure is shared across the VMAX platform and Hyper-V environment. The solution also includes local HA and backup and restore for all three applications.
The following components are used in this solution:
• EMC VMAX
• EMC Symmetrix Management Console
• Microsoft Hyper-V virtualization technology
• Microsoft SCVMM
• EMC Replication Manager
• EMC FAST VP
EMC Symmetrix VMAX
EMC Symmetrix VMAX with Enginuity version 5875, built on a strategy of simple, intelligent, modular storage, incorporates a highly scalable Virtual Matrix Architecture™ that enables VMAX arrays to grow seamlessly and cost-effectively from an entry-level configuration into the world’s largest storage system. It offers:
• Unmatched application availability: For the highest level of information protection, the VMAX system is the only platform in the industry that can deliver comprehensive solutions for local, remote, and multi-site business continuity.
• Redundancy: The VMAX platform provides redundancy through multiple engines, with two integrated, highly available directors per engine, and a cache mirrored across all the engines so that the cache is not a single point of failure.
• More scalability: Up to twice the performance, with the ability to manage up to 10 times more capacity per storage administrator.
• More security: Built-in encryption, RSA-integrated key management, increased value for virtual server and mainframe environments, replication enhancements, and a new eLicensing model.
The tiered storage configuration used in the test environment is based on the following VMAX features:
• Sub-LUN FAST VP
• Virtual LUN VP mobility
EMC Symmetrix Management Console
The Symmetrix VMAX storage system provides a built-in web browser interface, Symmetrix Management Console (SMC). SMC provides centralized management of the entire VMAX storage infrastructure. In the context of FAST, SMC integrates easy-to-use wizards to:
• Create thin pools
• Create storage tiers
• Define and associate storage groups
• Set up FAST policies
Microsoft Hyper-V
Microsoft Windows Server 2008 R2, with Hyper-V, builds on the architecture and functions of Windows Server 2008 Hyper-V by adding multiple new features that enhance product flexibility. Hyper-V provides server virtualization and increases flexibility in the deployment and life cycle management of applications. Hyper-V virtualization helps to consolidate workloads and reduce server sprawl. Additionally, it allows virtualization to be deployed with clustering technologies to provide a robust IT infrastructure with high availability and quick disaster recovery.
Microsoft SCVMM
Microsoft System Center Virtual Machine Manager 2008 R2 with Service Pack 1 (SP1) provides centralized management of the physical and virtual IT infrastructure, increased server utilization, and dynamic resource optimization across multiple virtualization platforms. It includes end-to-end capabilities such as planning, deploying, managing, and optimizing the virtual infrastructure.
Benefits
• Centrally creates and manages virtual machines across the entire datacenter
• Easily consolidates multiple physical servers onto virtual hosts
• Rapidly provisions and optimizes new and existing virtual machines
• Enables the dynamic management of virtual resources through PRO-enabled management packs. As an open and extensible platform, PRO encourages partners to design custom management packs that promote the compatibility of their products and solutions with PRO's management capabilities.
EMC Replication Manager
EMC Replication Manager manages EMC’s point-in-time replication technologies through a centralized management console. Replication Manager coordinates the entire data replication process, from discovery and configuration to the management of multiple application-consistent, disk-based replicas. It auto-discovers the replication environment and streamlines management by scheduling, recording, and cataloging replica information, including auto-expiration. With Replication Manager, you can put the right data in the right place at the right time, on demand or based on schedules and policies that you define. This application-centric product simplifies replication management while maintaining application consistency.
EMC FAST VP
FAST VP is EMC’s automated storage tiering technology, introduced in Symmetrix Enginuity microcode version 5875. FAST VP combines EMC’s sub-volume auto-tiering technology with Virtual Provisioning. It enables storage administrators to implement automated, policy-driven plans that perform dynamic, nondisruptive changes to the storage layouts of different applications. It does this by ensuring that the hot spots of a volume or LUN are served by high-performance drives and the inactive data is handled by cost-effective drives. With FAST VP, customers can achieve:
• Most efficient utilization of Flash drives for high-performance workloads
• Lower cost of storage, with the majority of the less-accessed data on Serial Advanced Technology Attachment (SATA) drives
• Better performance at a lower cost, requiring fewer drives, less power and cooling, and a smaller footprint
• Radically simplified automated management in a tiered environment
FAST VP can move data among up to three tiers (Flash, Fibre Channel (FC), and SATA drives) to meet the performance and capacity demands of a broad range of applications. Frequently accessed data is moved to, or kept on, the appropriate storage tier, based on the access patterns of sub-volumes and the defined FAST policy. As the performance requirements of applications change, FAST VP promotes only the active hot spots of a volume or LUN to high-performance drives such as Flash drives, not the entire volume or LUN. At the same time, FAST VP moves less-accessed portions of a volume or LUN to low-cost drives such as SATA drives. Customers thus get the best of both worlds: high performance and low cost.
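The sub-LUN idea described above can be illustrated with a toy placement routine. This is only a sketch under stated assumptions, not EMC's algorithm: the `Extent` class, the `place_extents` function, the slot counts, and the I/O counters are all hypothetical, and the real product also weighs policy limits and performance thresholds that are ignored here.

```python
from dataclasses import dataclass


@dataclass
class Extent:
    """One sub-LUN chunk of a thin device (hypothetical model)."""
    lun: str
    index: int
    io_count: int  # I/Os observed during an analysis window


def place_extents(extents, tier_slots):
    """Assign the busiest extents to the fastest tier that has room.

    tier_slots is an ordered list of (tier_name, slot_count) pairs,
    fastest tier first. The last tier absorbs whatever is left over.
    """
    ranked = sorted(extents, key=lambda e: e.io_count, reverse=True)
    placement = {}
    position = 0
    for tier_index, (tier, slots) in enumerate(tier_slots):
        is_last = tier_index == len(tier_slots) - 1
        chunk = ranked[position:] if is_last else ranked[position:position + slots]
        for extent in chunk:
            placement[(extent.lun, extent.index)] = tier
        position += len(chunk)
    return placement


# Made-up activity counters for two LUNs.
extents = [
    Extent("lun0", 0, 900),
    Extent("lun0", 1, 15),
    Extent("lun0", 2, 300),
    Extent("lun1", 0, 2),
    Extent("lun1", 1, 750),
]
tiers = [("Flash", 1), ("FC", 2), ("SATA", 2)]
placement = place_extents(extents, tiers)
for key in sorted(placement):
    print(key, placement[key])
```

In this example the three extents of `lun0` end up on three different tiers, which is the point of sub-LUN tiering: only the hot portion of a LUN consumes Flash capacity.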
Consolidated applications storage design on EMC VMAX
Overview
Storage design is a critical element in successfully deploying a mixed Microsoft application workload on the VMAX, hosted on the Hyper-V virtualization server platform. From a disk perspective, the design process for a virtual building block is essentially the same as for a physical environment, except that in the virtual environment the virtual machine’s OS volume also needs to be accounted for.
The virtualized Microsoft consolidated application storage design comprises a number of different pieces, including:
• Storage design requirements: disk requirements and layout
• Virtual Provisioning: pool design
• FAST VP automated storage tiering design
• Hyper-V virtual machine building block: Hyper-V root and virtual machine design and resource requirements
• Backup and restore design and capabilities using EMC TimeFinder® snapshots and Replication Manager
Each of these is discussed in detail in this section.
The high-level objectives of this design are to:
• Validate and show how FAST VP in a virtualized, consolidated Microsoft environment can support multiple Exchange, SQL Server, and SharePoint workloads across the same disks
• Showcase how easy it is to give each application the performance level and flexibility it needs by using the right mix of drive types
• Provide sizing recommendations for all of the applications in a consolidated FAST VP environment
• Show how the VMAX and Replication Manager can be used to centrally manage and protect multiple applications
• Validate Microsoft SCVMM:
  - Monitor the health and performance of the environment
Application environment highlights
The solution was designed to showcase a mixed Microsoft application workload on the VMAX, hosted on the Hyper-V virtualization server platform. VMAX FAST VP automated storage tiering technology was leveraged to detect the I/O patterns of the different applications, handle unanticipated spikes, and provide optimal performance for multiple applications. Microsoft SCVMM was implemented to manage and monitor the virtualized environment.
This solution simulates a multi-tenant scenario by demonstrating three distinct business units, each with unique requirements for user profile and size, where the physical infrastructure is shared. This architecture could be leveraged by companies whose IT organizations service different departments.
This solution architecture helps provide direction on how to run consolidated applications of different departments, simplify management of these workloads, and serve as guidance towards a private cloud deployment. The following applications are part of the solution:
• Microsoft Exchange 2010
  - Total of 42 databases, with 500 users per database, to house 21,000 users
  - 14 mailbox virtual machines and 14 HUB/CAS servers in a two-copy DAG configuration
  - Four Hyper-V servers hosting 28 virtual machines, to support the 21,000-user Exchange 2010 DAG environment
  - Replication Manager application sets and jobs are based on groups of three databases, taking snapshots off the passive copy
• Microsoft SQL Server 2008 R2
  - Three SQL servers running hot, warm, and cold TPC-E-like OLTP workloads
  - Total environment size is 1.2 TB
  - A shared Hyper-V cluster with four nodes to support the SQL and SharePoint environment
  - Replication Manager application sets and jobs per server, taking pointer-based snapshots for each database
• Microsoft SharePoint 2010
  - 3 TB SharePoint farm
  - A shared Hyper-V cluster with four nodes to support the SQL and SharePoint environment
  - Replication Manager application sets and jobs to take pointer-based snapshots of the SharePoint farm
General design guidelines
The following list provides general design guidance for running a mixed Microsoft application workload on a VMAX:
• I/O requirements include user I/O (send/receive) and any other overhead, plus an additional 20 percent. (For Exchange, the 20 percent accounts for some overhead as well as log and background database maintenance (BDM) I/O.)
• Database and log I/O should be evenly distributed across the SAN and the storage back end
• Use larger hyper volumes when creating LUNs to achieve better performance
• A minimum of two HBAs is required per server to provide redundancy
• When using a hypervisor to virtualize the Microsoft Exchange servers, a minimum of three IP connections is recommended for each Hyper-V server
• Always calculate I/O spindle requirements first, then capacity requirements
• Use the following read/write ratios for the different Microsoft applications when unsure:
  - Microsoft Exchange: 3:2 in a mailbox resiliency configuration and 1:1 in a standalone configuration
  - Microsoft SQL Server: 85:15
  - Microsoft SharePoint Server: content databases 90:10; tempdb and search 50:50
• Install EMC PowerPath® for optimal path management and maximum I/O performance. For more information on installing and configuring the PowerPath application, visit: http://www.emc.com/products/detail/software/powerpath-multipathing.htm
• Follow these recommendations to ensure the best possible server performance:
  - Format new NTFS volumes on Windows Server 2008 R2 with the allocation unit size (ALU) set to 64 KB. This can be done from the drop-down list in Disk Manager or through the command prompt using diskpart.
  Note Partition alignment is no longer required when running Microsoft Windows Server 2008 as partitions are automatically aligned to a 1 MB offset.
• Visit the following Microsoft links for guidance on determining server memory and CPU requirements for the Microsoft Exchange 2010 Mailbox Server role:
  - http://technet.microsoft.com/en-us/library/ee832793.aspx (Memory)
  - http://technet.microsoft.com/en-us/library/ee712771.aspx (CPU)
• Visit the following Microsoft link for guidance on determining server memory and CPU requirements for Microsoft SQL Server:
  - http://msdn.microsoft.com/en-us/library/ms143506.aspx
• Visit the following Microsoft link for guidance on determining server memory and CPU requirements for Microsoft SharePoint:
  - http://technet.microsoft.com/en-us/library/cc261795(office.12).aspx
FAST VP
FAST VP is a policy-based system that automatically moves sub-LUN data across storage tiers to achieve performance service levels and cost targets. It enables the storage administrator to set high-performance policies that utilize more Flash drive capacity for critical applications, and cost-optimized policies that utilize more SATA drive capacity for less-critical applications. This section introduces FAST VP functionality and describes how the technology was leveraged in this solution to provide optimal performance for all applications.
FAST VP automates storage tiering for virtually provisioned (VP) devices. Sub-LUN data movement for VP devices dramatically improves capacity utilization and reduces the time and complexity of managing storage. This enables FAST VP to:
• Be more responsive to changes in the production workload activity
• Improve performance
• Utilize capacity more efficiently by:
  - Requiring fewer Flash drives in the system
  - Placing more data on FC and SATA drives
FAST VP theory
FAST VP has three main components:
• Storage tiers: A storage tier is a combination of drive technology and RAID protection in the VMAX array. The storage tiers can be either virtual pools or disk groups.
• Storage group: Storage groups are collections of LUNs that are associated with a FAST VP policy. In this solution there are three storage groups (one per application) associated with FAST VP policies.
• FAST policy: A FAST policy is a setting that initiates data movement between storage tiers based on compliance and performance policy thresholds. A FAST VP policy combines storage groups with storage tiers, and defines the configured capacities, as a percentage, that a storage group is allowed to consume on each tier.
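The relationship among the three components can be sketched as a small data model. All class names, field names, and percentage values below are hypothetical and chosen for illustration; they are not Symmetrix Management Console objects. The sketch also assumes that the per-tier limits should add up to at least 100 percent so that all of a group's data can be placed.

```python
from dataclasses import dataclass


@dataclass
class StorageTier:
    """A drive technology combined with a RAID protection type."""
    name: str
    drive_type: str
    raid: str


@dataclass
class FastPolicy:
    """Maps each tier name to the maximum percentage of a storage
    group's configured capacity that may reside on that tier."""
    name: str
    tier_limits_pct: dict

    def is_placeable(self):
        # Assumption of this sketch: the limits should total at least
        # 100 percent so that all of the group's data fits somewhere.
        return sum(self.tier_limits_pct.values()) >= 100


@dataclass
class StorageGroup:
    """A collection of LUNs associated with one FAST VP policy."""
    name: str
    luns: list
    configured_gb: float
    policy: FastPolicy

    def tier_caps_gb(self):
        """Capacity the group may consume on each tier, in GB."""
        return {tier: self.configured_gb * pct / 100.0
                for tier, pct in self.policy.tier_limits_pct.items()}

    def tiers_over_limit(self, used_gb):
        """Tiers where current usage exceeds the policy limit."""
        caps = self.tier_caps_gb()
        return [tier for tier, used in used_gb.items()
                if used > caps.get(tier, 0.0)]


# Illustrative tier definitions (drive and RAID choices are examples only).
tiers = [
    StorageTier("Flash", "EFD", "RAID 5"),
    StorageTier("FC", "10k FC", "RAID 5"),
    StorageTier("SATA", "7.2k SATA", "RAID 1"),
]
# Hypothetical limits; not the values used in the tested solution.
policy = FastPolicy("exchange_policy", {"Flash": 5, "FC": 50, "SATA": 100})
group = StorageGroup("exchange_sg", ["lun0", "lun1"], 1000.0, policy)

print(policy.is_placeable())
print(group.tier_caps_gb())
print(group.tiers_over_limit({"Flash": 60.0, "FC": 400.0, "SATA": 540.0}))
```

With these made-up numbers the group may hold at most 50 GB on Flash, so 60 GB of Flash usage is reported as out of compliance, which is the kind of condition that would trigger data movement.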
FAST VP storage design for applications
Disk sizing must be performed for each application individually before designing the best configuration for FAST VP. Sizing depends on multiple factors, such as disk type, protection type, and cache. The following is a general approach for sizing a consolidated workload with FAST VP:
• Perform sizing according to FAST VP policy requirements. A recommended starting point is an 80/20 skew assumption.
Note Skew can vary and will depend on the actual application profile.
• Choose RAID 5 or RAID 6 protection for the faster tiers (FC and Enterprise Flash Drive (EFD)) to yield the best total cost of ownership (TCO)
• Mirrored protection for SATA normally yields the best performance results
• For Exchange only:
  - Separate DAG copies on different spindles
  - Size the tiers to account for a full failover scenario where all the database copies are running on a single server
  - Perform the sizing exercise for only the FC and SATA tiers and allow a small amount of EFD to handle unanticipated spikes
  - Adopt a building block approach: size for one server and scale
FAST VP storage sizing for Microsoft applications
This section addresses the sizing for each of the Microsoft applications that are part of this solution.
Sizing and configuring storage for use with Microsoft applications can be a complicated process, driven by many variables and factors that vary from organization to organization. Properly configured storage, combined with properly sized server and network infrastructure, helps ensure smooth operation and the best user experience. The sizing for this solution was performed for consolidated workloads of multiple Microsoft applications. The following sizing exercise provides general guidance on FAST VP sizing for consolidated workloads. The sizing was performed to best leverage a three-tier FAST VP solution supporting a mixed Microsoft application workload.
FAST VP building block for Exchange 2010
The storage requirements for Microsoft Exchange, often a capacity-driven application, can in most cases be satisfied by SATA drives. This solution, nevertheless, showcases how Exchange works with FAST VP, particularly in a consolidated application environment leveraging multiple storage tiers.
One of the methods that can be used to simplify the sizing and configuration of large Microsoft Exchange environments is to define a unit of measure: a building block. A building block can be defined as the amount of disk and server resources required to support a specific number of Microsoft Exchange Server 2010 users.
The amount of required resources is based on:
• Specific user profile type
• Mailbox size
• Disk requirements
The building block approach simplifies the implementation of Microsoft Exchange Server 2010 Mailbox servers. Once the initial building block is designed, it can be easily reproduced to support the required number of total users in an organization. This approach serves as a baseline for Microsoft Exchange administrators to create their own building blocks, based on their company’s specific Microsoft Exchange environment requirements. It is very helpful when future growth is expected, as it makes expanding the Microsoft Exchange environment much easier and more straightforward. EMC’s best practices involving the building block approach for Microsoft Exchange Server design have proved to be very successful throughout many customer implementations.
Phase 1 – Collect user requirements
The user requirements used to validate both the building block storage design methodology and VMAX performance are detailed in Table 2.
Table 2. User requirements
Item | User requirement
Total number of users | 21,000
User I/O profile | 150 messages sent/received per day = 0.15 IOPS per user in a DAG setup
User mailbox size | Start with 500 MB, grow to 1 GB
Deleted item retention | 14 days
Concurrency | 100 percent
Recovery point objective (RPO) | Remote < 5 minutes, local = 6 hours
Recovery time objective (RTO) | 60 minutes
Mailbox resiliency solution (DAG) | Yes
Backup/restore required | Yes (hardware Volume Shadow Copy Service (VSS))
The requirements include starting with a user mailbox size of 500 MB with the ability to seamlessly grow to 1 GB. This document shows how this can be easily accomplished using the VMAX Virtual Provisioning feature (see Virtual Provisioning and FAST VP design on the Symmetrix VMAX).
Based on the user requirements, a virtual Microsoft Exchange building block of 3,000 users per server was created. The decision for the number of users per server was based on a number of factors, including:
• Total number of users: use this number to find a per-server figure that divides it evenly.
• User profile: a larger user I/O profile or a large mailbox usually dictates fewer users per building block.
• Recovery time objective: with a smaller RTO, and depending on the backup and restore technology used, fewer users may be supported within a single building block.
• Array features: the ability of an intelligent array, such as the VMAX, to back up and restore large amounts of Microsoft Exchange data in seconds makes it much easier to achieve good consolidation.
• Simplicity and ease of design: fewer, larger databases simplify the design, and a SAN with intelligent storage makes this easier to achieve.
• Hyper-V server configuration, memory, and number of virtual CPUs available: the building block must fit the hardware resources, so the Hyper-V resources are factored in (see Table 3).
Table 3. Required information for building block design
Building block characteristic | Value
Maximum number of Microsoft Exchange Server 2010 users per server | 3,000
Number of databases per server | 6
Number of users per database | 500
Disk type | Mixed disk type supported by FAST VP storage tiering technology
Number of Hyper-V servers for Microsoft Exchange Mailbox virtual machines | 4
Total number of Microsoft Exchange Mailbox virtual machines | 14
Database read/write ratio | 3:2
Phase 2 - Design the storage architecture based on user requirements
It is recommended to first calculate the spindle requirements per building block from an I/O perspective, and then the space requirements. When performing spindle calculations to satisfy I/O requirements, follow the guidelines outlined below:
1. Calculate total Exchange I/O using the following formula: Total I/O = (No. of users * I/O profile) + 20 percent
2. Apply a FAST policy split to size the tiers. It is recommended to start with an 80/20 skew: 80 percent of I/Os to be serviced by the faster tier, 20 percent by the slower tier
3. Total I/O for faster tier = (0.8 or the larger percentage) * (Total Exchange I/O)
4. Total I/O for slower tier = (0.2 or the smaller percentage) * (Total Exchange I/O)
5. Array IOPS = (Total IOPS * 0.60) + RAID penalty * (Total IOPS * 0.40), where 0.60 and 0.40 are the read and write fractions of the 3:2 read/write ratio
6. Disks required = Array IOPS / IOPS per spindle
• Steps 5 and 6 need to be performed for each tier involved.
• When using thick or thin devices with Microsoft Exchange 2010, the initial spindle requirements for the pools must meet the I/O requirements. Once that has been accomplished, the capacity sizing calculations should follow.
IOPS calculation example
• Total I/O for 3,000 users = 3,000 * 0.15 = 450 + 20 percent = 540 IOPS
• Total I/O for FC = (80 percent of 540) = 432
• Total I/O for SATA = (20 percent of 540) = 108
• SATA disks required to service 108 I/Os in a RAID 1 configuration: (108*.60) + 2(108*.40) = 152 / 55 ~ 4
• FC disks required to service 432 I/Os in a RAID 5 configuration: (432*.60) + 4(432*.40) = 951 / 130 ~ 8
• From an I/O sizing perspective, using 80 percent FC, 20 percent SATA split, the following disks are required per server:
4 7.2k 2 TB SATA
8 10k 600 GB FC
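The six sizing steps and the worked example can be sketched in Python as follows. This is a minimal illustration, not an EMC sizing tool: the function names are hypothetical, the RAID write penalties (2 for RAID 1, 4 for RAID 5) and per-disk IOPS (55 for SATA, 130 for FC) are taken from the example, and rounding up to whole RAID groups (pairs for RAID 1, groups of four for an assumed RAID 5 3+1 layout) is an assumption that happens to reproduce the disk counts shown. Intermediate back-end values may differ by a few IOPS from the rounded figures in the text.

```python
import math

def backend_iops(front_end_iops, read_fraction, write_penalty):
    """Array (back-end) IOPS: reads plus RAID-penalized writes."""
    reads = front_end_iops * read_fraction
    writes = front_end_iops * (1.0 - read_fraction)
    return reads + write_penalty * writes

def disks_required(array_iops, iops_per_disk, raid_group_size):
    """Round up to whole disks, then to whole RAID groups (an assumption)."""
    disks = math.ceil(array_iops / iops_per_disk)
    return math.ceil(disks / raid_group_size) * raid_group_size

# Step 1: total Exchange I/O with 20 percent headroom.
total_iops = 3000 * 0.15 * 1.20            # about 540 IOPS

# Steps 2-4: 80/20 skew between the faster (FC) and slower (SATA) tiers.
fc_iops = 0.80 * total_iops                # about 432
sata_iops = 0.20 * total_iops              # about 108

# Steps 5-6, per tier, using the 3:2 read/write ratio (60 percent reads).
fc_backend = backend_iops(fc_iops, 0.60, write_penalty=4)      # RAID 5
sata_backend = backend_iops(sata_iops, 0.60, write_penalty=2)  # RAID 1

fc_disks = disks_required(fc_backend, 130, raid_group_size=4)
sata_disks = disks_required(sata_backend, 55, raid_group_size=2)

print(f"FC:   {fc_backend:.1f} back-end IOPS -> {fc_disks} disks")
print(f"SATA: {sata_backend:.1f} back-end IOPS -> {sata_disks} disks")
```

Changing the skew, read fraction, or per-disk IOPS in one place makes it easy to see how sensitive the spindle count is to each input.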
Disk space calculations
EMC recommends that the Microsoft calculator be used when performing disk space calculations. Current Microsoft calculations require more than 60 percent capacity over the user mailbox target size. EMC’s Virtual Provisioning is a key to reducing up-front storage purchase and addressing any unforeseen growth. The LUN calculations were performed to account for the final mailbox size (1 GB) and the spindle
calculations for the initial size (0.5 GB).
When performing capacity calculations, follow the steps outlined below:
• Calculate total capacity based on mailbox requirements
• Apply a FAST policy split to size the tiers. EMC recommends starting with an 80/20 skew, with 80 percent of capacity residing on the slower tier
• Total capacity for slower tier = (0.8 or the larger percentage) * (Total Exchange Capacity)
• Total capacity for faster tier = (0.2 or the smaller percentage) * (Total Exchange Capacity)
• Spindle requirement per server = <Total Capacity> / <Usable Capacity>
Note The final step (spindle requirement) needs to be performed for each tier involved.
Capacity calculation example
• Capacity requirements = Total database capacity per server + Total log capacity per server
• Database size = <Number of mailboxes> x <Mailbox size on disk> x <Database overhead growth factor> = 500 users x 1.23 GB + 20 percent = 738 GB
Based on this, the database LUN capacity can be calculated as follows:
• Database LUN size = (<Database size> + <Content Index Catalog>) / (1 - Free space percentage requirement) = (738 GB + 73.8 GB)/0.8 = 1,014.75 GB, rounded up to 1024 GB
• Log LUN size = Logs per day per user x number of users x truncation failure tolerance x 20 percent = 30 logs @ 1 MB x 500 users x 6 days x .20 = 115 GB
• Total database capacity per server (for initial 0.5 GB mailbox size) = (<Database LUN size> * <No. of databases>) / 2 = (1024*6)/2 = 3072 GB, half of the 6144 GB provisioned for the final 1 GB mailbox size
• Total log capacity per server (for initial 0.5 GB mailbox size) = (<Log LUN size> * <No. of databases>) / 2 = (115*6)/2 = 345 GB
• Total capacity = < Total database capacity per server> + < Total log capacity per server> = 3072 + 345= 3417 GB
• Usable capacity available per 2 TB SATA drive = 1754 GB
• Usable capacity available per 600 GB 10K FC drive = 536 GB
• Spindle requirement per server = <Total capacity> / <Usable capacity>
• Disk capacity skew recommendations are 80 percent SATA, 20 percent FC
• Capacity on SATA is 3417*0.8 = 2733 GB
• Capacity on FC is 3417*0.2 = 684 GB
• Required spindles (SATA mirrored) – 2733 / 1754 ~ 4 (mirrored disks)
• Required spindles (FC RAID 5) – 684 / 536 ~ 4
• From a capacity sizing perspective, using an 80 percent SATA and 20 percent FC split, the following disks would be required per server:
4 7.2k 2 TB SATA
4 10k 600 GB FC
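The capacity example can be checked with a short Python sketch. The inputs come from the example; the 10 percent content index overhead is inferred from the 73.8 GB figure, and the way protection overhead turns data disks into spindles (a mirror per data disk, or whole RAID 5 3+1 groups) is an assumption that reproduces the counts shown rather than a documented EMC rule. The helper name is hypothetical.

```python
import math

# Database and LUN sizing, per database (figures from the example).
db_size_gb = 500 * 1.23 * 1.20                  # mailboxes x size on disk + 20 percent
content_index_gb = 0.10 * db_size_gb            # assumption: 73.8 GB is 10 percent of 738 GB
db_lun_gb = (db_size_gb + content_index_gb) / (1 - 0.20)  # keep 20 percent free space

# Per-server totals and drive capacities as stated in the example.
total_capacity_gb = 3417
usable_gb = {"SATA": 1754, "FC": 536}
capacity_share = {"SATA": 0.80, "FC": 0.20}

def spindles(tier_capacity_gb, usable_per_disk_gb, protection):
    """Disks needed for capacity, including protection overhead (assumed layouts)."""
    data_disks = math.ceil(tier_capacity_gb / usable_per_disk_gb)
    if protection == "mirrored":
        return 2 * data_disks                   # every data disk has a mirror
    if protection == "raid5_3+1":
        return math.ceil(data_disks / 3) * 4    # three data disks plus one parity per group
    raise ValueError(f"unknown protection: {protection}")

sata_disks = spindles(total_capacity_gb * capacity_share["SATA"], usable_gb["SATA"], "mirrored")
fc_disks = spindles(total_capacity_gb * capacity_share["FC"], usable_gb["FC"], "raid5_3+1")

print(f"Database LUN: {db_lun_gb:.2f} GB before rounding")
print(f"Capacity spindles: {sata_disks} SATA, {fc_disks} FC")
```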
The best configuration based on both I/O and capacity requirements is as follows:
1 GB mailbox state
No. of spindles required to satisfy both I/O and capacity
4 7.2K 2 TB SATA drives
8 10K 600 GB FC drives
Thin LUN size (Database) 1024 GB
Thin LUN size (Log) 115 GB
For Exchange, the sizing does not account for Flash drives, which are used in small quantities to handle any unanticipated spikes. In many cases, as Exchange 2010 is capacity bound, requirements can be satisfied by SATA drives as opposed to a tiered solution.
Phase 3 – Validate design
Microsoft Exchange Jetstress and Microsoft Exchange LoadGen tools were used to validate the storage design. For a complete summary of Jetstress and LoadGen findings, see the Performance testing and validation results section of this white paper.
Microsoft SQL Server
To maintain flexibility, performance, and granularity of recovery, ensure that the storage sizing and back-end configuration for SQL Server is optimal. Outlined in this section is the approach adopted when sizing SQL Server in a FAST VP configuration.
Phase 1 – Collect user requirements
The SQL configuration for this environment is outlined in Table 4.
Table 4. SQL configuration user requirements
Item User requirement
Total number of users 100,000
Database users per server 20,000, 30,000, 50,000 respectively
Total IOPS 6,000
Number of databases 3
Database profile Hot / Warm / Cold
RPO Remote < 5 minutes, local = 6 hours
RTO 60 minutes
Read/write ratio 85:15
Backup/restore required Yes (hardware VSS)
Phase 2 – Design the storage architecture based on user requirements
It is recommended to first calculate the spindles for the SQL Server to satisfy I/O requirements, and then the space requirements. Shown below is the sizing calculation for this solution.
IOPS calculation
• Total I/O for 100,000 users = 6,000 + 20 percent = 6,000 + 1,200 = 7,200 IOPS
• Calculate the back-end I/O as per FAST VP policy requirements. In this solution, the FAST VP sizing was calculated based on the following skew for I/O: 10 percent SATA, 15 percent FC, and 75 percent Flash.
• Calculate the back-end I/O for each tier:
Total back-end I/O for RAID 1/0 SATA = (10 percent of 7,200) = (720*0.85) + 2 (720*0.15) = 828
Total back-end I/O for RAID 5 FC = (15 percent of 7,200) = (1,080*0.85) + 4 (1,080*0.15) = 1,566
Total back-end I/O for RAID 5 Flash = (75 percent of 7,200) = (5,400*0.85) + 4 (5,400*0.15) = 7,830
Total back-end I/O = 10,224
• SATA disks required to service 828 I/Os in a RAID 1/0 configuration: 828/50 = ~17, rounded up to 18 for RAID 1/0
• FC disks required to service 1,566 I/Os in a RAID 5 configuration: 1,566/130 = ~12
• Flash disks required to service 7,830 I/Os in a RAID 5 configuration: 7,830/1,800 = ~4
Note When calculating for performance, the fastest tier needs to service the maximum number of I/Os.
• From an I/O sizing perspective, using the above policy settings, the following disks are required for the environment:
18 7.2k 2 TB SATA drives
12 10k 600 GB FC drives
4 200 GB Flash drives
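As a cross-check of the three-tier calculation, the following Python sketch recomputes the back-end I/O per tier and confirms that the tiers sum to the stated total of 10,224. It uses the per-tier percentages from the calculation lines (10 percent SATA, 15 percent FC, 75 percent Flash), the 85:15 read/write ratio from Table 4, and write penalties of 2 for RAID 1/0 and 4 for RAID 5. The dictionary layout and function name are illustrative only.

```python
TOTAL_IOPS = 7200   # 6,000 IOPS plus 20 percent
READ_PCT = 85       # 85:15 read/write ratio

# tier name: (share of front-end I/O in percent, RAID write penalty)
TIERS = {
    "SATA (RAID 1/0)": (10, 2),
    "FC (RAID 5)": (15, 4),
    "Flash (RAID 5)": (75, 4),
}

def backend_iops(front_end_iops, read_pct, write_penalty):
    """Reads pass through once; each write costs write_penalty back-end I/Os."""
    reads = front_end_iops * read_pct / 100
    writes = front_end_iops * (100 - read_pct) / 100
    return reads + write_penalty * writes

backend = {
    name: backend_iops(TOTAL_IOPS * share / 100, READ_PCT, penalty)
    for name, (share, penalty) in TIERS.items()
}
total_backend = sum(backend.values())

for name, value in backend.items():
    print(f"{name}: {value:.0f}")
print(f"Total back-end I/O: {total_backend:.0f}")
```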
Capacity calculation
• User database size: Hot – 200 GB, Warm – 300 GB, Cold – 600 GB
• Calculate the database LUN size based on the user database sizes:
Database LUN size = <Database size> + Free space percentage requirement (20 percent)
Hot = 300 + 20 percent = 360 GB
Warm = 400 + 20 percent = 480 GB
Cold = 700 + 20 percent = 840 GB
• Calculate the tempdb and log LUN sizes for each of the databases. The log and tempdb sizes are calculated as 20 percent of the size of the database:
Log and tempdb size
Hot database – 20 percent of 300 = 60 GB
Warm database – 20 percent of 400 = 80 GB
Cold database – 20 percent of 700 = 140 GB
The user database log and the tempdb are laid out on a separate LUN for each database. Based on this, the log LUNs were sized at 120 GB for the hot and warm databases and 140 GB for the cold database.
• Total database size = Sum of the sizes of all the databases = 2,448 GB
• Usable capacity available per 2 TB SATA drive = 1,754 GB
• Usable capacity available per 600 GB 10K FC drive = 536 GB
• FAST policy skew used was 75 percent SATA, 15 percent FC and 10 percent Flash
• Capacity on each tier is:
SATA = 2,448*0.75 = 1,836 GB
FC = 2,448*0.15 = 368 GB
Flash = 2,448*0.1 = 245 GB
• Spindle requirement = <Total capacity> / <Usable capacity>
• Spindles required for each tier are:
SATA (mirrored) = 4
FC (RAID 5 3+1) = 4
Flash (RAID 5 3+1) = 4
Note When calculating for capacity, the slowest tier needs to host the majority of the data.
• From a capacity sizing perspective, using these policy settings, the following disks are required for the environment:
4 7.2k 2 TB SATA drives
4 10k 600 GB FC drives
4 200 GB Flash drives
The best configuration based on both I/O and capacity requirements is as follows:
1 TB (200 GB, 300 GB, 500 GB) SQL Server database
No. of spindles required to satisfy both I/O and capacity
18 7.2K 2 TB SATA drives
12 10K 600 GB FC drives
4 200 GB Flash drives
Thin LUN sizes (Database)
Hot: 360 GB
Warm: 480 GB
Cold: 840 GB
Thin LUN size (Log)
Hot: 120 GB
Warm: 120 GB
Cold: 140 GB
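The final spindle count per tier can be read as the larger of the I/O-driven and capacity-driven counts. The sketch below applies that rule to the SQL Server figures. Taking the per-tier maximum is an assumption about how the two results are combined, though it is consistent with the numbers in this section; the function name is hypothetical.

```python
def combine_sizing(io_disks, capacity_disks):
    """Per tier, keep whichever requirement needs more spindles."""
    tiers = set(io_disks) | set(capacity_disks)
    return {t: max(io_disks.get(t, 0), capacity_disks.get(t, 0)) for t in tiers}

sql_io = {"SATA": 18, "FC": 12, "Flash": 4}
sql_capacity = {"SATA": 4, "FC": 4, "Flash": 4}

sql_final = combine_sizing(sql_io, sql_capacity)
print(sorted(sql_final.items()))
```

For this workload every tier is I/O bound, so the I/O-driven counts carry through unchanged.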
Microsoft SharePoint
The SharePoint Server farm typically has mid-to-low IOPS storage requirements for its content database and search components. With FAST VP on VMAX, the SharePoint farm storage configuration can be simplified, and the disk I/O demands of the farm can be satisfied dynamically by the auto-tiering technology. This section outlines the approach adopted when sizing SharePoint Servers in a FAST VP configuration.
Phase 1 – Collect user requirements
The user requirements used to validate both the building block storage design methodology and VMAX performance are detailed in Table 5.
Table 5. Building block storage design methodology and VMAX performance user requirements
SharePoint farm profile Quantity/size/type
Total data 3 TB
Document size range 200 KB-5 MB
Total site count 40,000
Size per site 20 GB
Total site collections count 16
Total IOPS 2,000
Total site collection 1
Total user count 30,000-40,000 heavy users (Microsoft defined)
Usage profile(s) (% browse / % search / % modify) 80% / 10% / 10%
User concurrency 10%
Phase 2 – Design the storage architecture based on user requirements
It is recommended to first calculate the spindles for SQL Server and other SharePoint Servers that require storage to satisfy I/O requirements, and then for space
requirements. In a SharePoint farm, the requirement for IOPS is much less than applications such as SQL Server. A two-tier (SATA and FC) configuration can be used to satisfy its disk needs. Shown below is the sizing calculation for this solution.
IOPS calculation
• Total I/O for the farm = 2,000*(1 + 20 percent)= 2,400 IOPS
• Apply a FAST policy split to size the tiers. EMC recommends to start with a 90/10 skew – 90 percent of I/Os to be serviced by faster tier, 10 percent to be serviced by slower tiers
• Calculate back-end I/O for each tier:
Total back-end I/O for RAID 1/0 SATA (10 percent of 2,400) = (240*0.85) + 2 (240*0.15) = 276
Total I/O for RAID 5 FC (90 percent of 2,400) = (2,160*0.85) + 4 (2,160*0.15) = 3,132
Total back-end I/O = 3,408
Note When calculating for performance, the fastest tier needs to service the maximum number of I/Os.
• From an I/O sizing perspective, using the above policy settings, the following disks are required for the environment:
6 7.2k 2 TB SATA drives (276/50 ~ 6)
24 10k 600 GB FC drives (3132/130 ~ 24)
Disk space calculation
EMC’s Virtual Provisioning is key to reducing up-front storage purchase and handling any unforeseen growth. When performing capacity calculations, follow the steps below:
• Calculate the total capacity based on SharePoint farm requirements
• Apply a FAST policy split to size the tiers. In this solution, a 90/10 skew was used for the SharePoint farm, where 90 percent of the capacity resided on the slower tier.
• Total capacity for slower tier = (0.9 or the larger percentage) * 3 TB = 2.7 TB
• Total capacity for faster tier = (0.1 or the smaller percentage) * 3 TB = 0.3 TB
• Usable capacity available per 2 TB SATA drive = 1,754 GB
• Usable capacity available per 600 GB 10K FC drive = 536 GB
• Spindle requirement for SATA drives = 2.7 TB / 1,754 GB ~ 4
• Spindle requirement for FC drives = 0.3 TB / 536 GB ~ 4
• From a capacity sizing perspective, using a 90 percent SATA, 10 percent FC split, the following disks would be required for the farm:
4 7.2k 2 TB SATA
4 10k 600 GB FC
The ideal configuration based on both I/O and capacity requirements is as follows:
SharePoint farm disk requirement
Number of spindles required to satisfy both I/O and capacity
6 7.2K 2 TB SATA drives
24 10K 600 GB FC drives
Final best configuration
The final disk configuration for this environment needs to satisfy both I/O and capacity for all of the applications. The application that benefited the most from FAST VP in this environment was SQL Server, since the TPC-E-like load was very heavy when compared to the other two applications. The final best disk configuration that satisfies both the I/O and capacity requirements in this mixed workload environment is as follows:
Mixed Microsoft workload disk requirement
Total number of spindles required for the mixed Microsoft application workload
42 SATA
90 FC
8 Flash (4 introduced to handle unanticipated Exchange spikes)
FAST VP management tools
The following management options are available to configure FAST VP:
• Solutions Enabler Command Line Interface (SYMCLI)
• Symmetrix Management Console (SMC)
FAST VP in a VMAX environment provides an easy way to employ the storage service specializations of an array configuration with a mixture of drive types. FAST VP offers a simple and cost-effective way to provide optimal performance of a given mixed configuration, by automatically tiering storage to the changing application needs. The FAST VP tiers for this solution were chosen per the application requirements and FAST VP policies were set to allow data movement between the tiers to optimize performance. The following FAST VP tiers were used for this solution:
Table 6. FAST VP tiers
Tier name Drive technology
MS_Flash 200 GB 15k Flash drives
MS_FibreChannel 600 GB 10K FC drives
MS_SATA 2 TB 7.2K SATA drives
The FAST VP policy settings varied for each application since the workload pattern of each application differed and their needs were different.
• For Exchange, FAST VP sizing was performed to leverage only FC and SATA tiers, and a small number of Flash drives were introduced to handle any
unanticipated spikes and any hot-spots in the workload. This policy helped to automate performance and, at the same time, keep the cost in check.
• For SQL, the testing tool emulated an OLTP TPC-E-type load, which performs wide-stripe reads, hence the FAST VP policy was designed to include more Flash to manage the high performance requirements. This provided the best blend of cost and performance.
• SharePoint, being a fairly low I/O application, did not require the Flash tier, hence the policy was implemented to include only the FC and SATA tiers.
Nevertheless, by sharing the same tiers, and allowing FAST VP to move data per the policy settings, optimal performance was achieved for each application. The skew for each application was obtained by using a modeling tool called Tier Advisor.
Performance data is uploaded to the tool, which then models an optimal storage array configuration by enabling interactive experimentation with different storage tiers and storage policies until the desired cost and performance preferences are achieved. Tier Advisor helps define the number of disk drives to use for each disk drive technology when configuring a tiered storage solution.
Applications building block design on Symmetrix VMAX
Once the disk calculations are computed for each application, the virtual machine and Hyper-V requirements can be calculated. The memory and CPU requirements are based on Microsoft best practices. This section details this for each application, separately.
Microsoft Exchange
Exchange Server 2010 database and DAG design
A DAG is a group of up to 16 Mailbox servers that host a set of databases, and provide automatic Exchange database-level recovery from failures that affect individual servers or databases.
A DAG is a boundary for mailbox database replication, database and server switchovers and failovers, and for an internal component called Active Manager. Active Manager is an Exchange Server 2010 component that manages switchovers and failovers, and runs on every server in a DAG.
The Exchange design for this solution incorporates an active/passive DAG design and adheres to the previously described guidelines, deploying three active and three passive databases for each mailbox virtual machine. Each virtual machine is capable of handling all six databases or 3,000 users in a switchover/failover condition, but under normal conditions only 1,500 users would be active on each mailbox virtual machine.
This grouping of three databases and 1,500 users was used and carried over to the backup configuration with Replication Manager, in an effort to make the management as easy as possible. The design also accounted for the reboot or loss of a Hyper-V root server. The active database on MBX1 residing on Hyper-V root server 1 has its passive copies on Hyper-V root server 2. If either Hyper-V root server needs to be rebooted, due to patching or maintenance, there is no loss of service to the user mailboxes. The design incorporated the following:
• A total of 42 active and 42 passive databases, with 500 users per database to house the 21,000 users
• Mailbox databases are grouped in collections of three (1,500 users)
• Each mailbox virtual machine has three active and three passive mailbox databases
• The active and passive database copies do not reside on the same virtual machine or on the same Hyper-V root server
• Replication Manager application sets and jobs are designed around groupings of three databases per Replication Manager backup job
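The placement rules can be illustrated with a small Python sketch. It assumes the 14 mailbox virtual machines and 4 Hyper-V servers from Table 3, three active databases of 500 users per virtual machine, and a simple pairing scheme (MBX1 with MBX2, MBX3 with MBX4, and so on) in which each virtual machine holds the passive copies of its partner. The pairing scheme, the round-robin host assignment, and the database names are hypothetical; the actual layout is the one shown in Figure 2.

```python
VM_COUNT = 14        # mailbox virtual machines (Table 3)
HOST_COUNT = 4       # Hyper-V servers for mailbox virtual machines (Table 3)
ACTIVE_PER_VM = 3    # active databases per mailbox virtual machine
USERS_PER_DB = 500

def build_layout():
    """Return one record per database with its active and passive placement."""
    layout = []
    for vm in range(VM_COUNT):
        # Pair neighbouring VMs; round-robin hosts keep partners on different hosts.
        partner = vm + 1 if vm % 2 == 0 else vm - 1
        for slot in range(ACTIVE_PER_VM):
            db_number = vm * ACTIVE_PER_VM + slot + 1
            layout.append({
                "db": f"DB{db_number:02d}",
                "active_vm": f"MBX{vm + 1}",
                "passive_vm": f"MBX{partner + 1}",
                "active_host": vm % HOST_COUNT + 1,
                "passive_host": partner % HOST_COUNT + 1,
            })
    return layout

layout = build_layout()
total_users = len(layout) * USERS_PER_DB

# Verify the rule that copies never share a virtual machine or a Hyper-V host.
rules_hold = all(
    r["active_vm"] != r["passive_vm"] and r["active_host"] != r["passive_host"]
    for r in layout
)
print(len(layout), "active databases,", total_users, "users, rules hold:", rules_hold)
```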
The DAG configuration on the virtual machines and Hyper-V servers is described in Figure 2.
Figure 2. DAG configuration on the virtual machines and Hyper-V servers
Hyper-V and virtual machine design
Once the user per building block and disk calculations are complete, the virtual machine and Hyper-V requirements can be calculated. Guidelines for memory configurations may be found at:
http://technet.microsoft.com/en-us/library/dd346700.aspx
CPU and memory requirement calculations start with the mailbox server role. Based on the requirements, the building block must be able to support 3,000 users per server. Provisioning sufficient megacycles so that mailbox server CPU utilization does not exceed 80 percent is also required. Table 7 lists the mailbox server megacycle and memory requirement calculations.
Table 7. Mailbox CPU requirements
Parameter Value
Active megacycles 3,000 mailboxes x 3 megacycles per mailbox = 9,000
Passive megacycles 0 (sizing for all active)
Replication megacycles 0 (sizing for all active)
Maximum mailbox access concurrency 100%
Total required megacycles during mailbox server failure 9,000
Using megacycle capacity to determine the number of mailbox users that a Microsoft Exchange Mailbox Server can support is not an exact science. A number of factors can lead to unexpected megacycle results in test and production environments.
Therefore, megacycles should only be used to approximate the number of mailbox users that a Microsoft Exchange Mailbox Server can support. Also, per Microsoft recommendations, a 10 percent overhead needs to be factored in for hypervisor overhead.
Remember that it is always better to be a bit conservative rather than overly aggressive during the capacity planning portion of the design process.
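To make the megacycle arithmetic concrete, the following Python sketch derives how many megacycles would need to be available to a mailbox virtual machine so that the 9,000 required megacycles stay within the 80 percent utilization ceiling after a 10 percent hypervisor allowance. How the two factors are combined here is an assumption made for illustration, and the sketch does not model how megacycles are normalized for a specific processor. The function name is hypothetical.

```python
MAILBOXES = 3000
MEGACYCLES_PER_MAILBOX = 3
MAX_CPU_UTILIZATION_PCT = 80     # mailbox server CPU should not exceed this
HYPERVISOR_OVERHEAD_PCT = 10     # allowance for hypervisor overhead

required = MAILBOXES * MEGACYCLES_PER_MAILBOX   # 9,000 (Table 7)

def megacycles_to_provision(required_megacycles, max_util_pct, overhead_pct):
    """Capacity needed so the required load stays under the utilization ceiling
    once the hypervisor share has been set aside (assumed combination)."""
    usable_pct = 100 - overhead_pct
    return required_megacycles * 100 / max_util_pct * 100 / usable_pct

provision = megacycles_to_provision(
    required, MAX_CPU_UTILIZATION_PCT, HYPERVISOR_OVERHEAD_PCT
)
print(f"Required: {required}, to provision: {provision:.0f}")
```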
Based on Microsoft guidelines and server vendor specifications, the CPU and memory requirements were determined for each virtual machine role. Table 8 provides the summary of the virtual machine CPU and memory configurations, which has taken into consideration a Hyper-V root server failure.
Table 8. Virtual machine CPU and memory configurations summary
Virtual machine role vCPUs per virtual machine Memory per virtual machine
Mailbox (to support 3,000 users during failover) 4 20 GB
HUB/CAS 4 20 GB
A 120 GB VHD volume for the mailbox server OS was provisioned. Pass-through disks were used for the database and log volumes, primarily to accommodate the
requirement for a hardware Volume Shadow Service (VSS) snapshot. Table 9 describes those values.
Table 9. Mailbox virtual machine resources
Item Description
Number of users supported 3,000
User profile supported 0.15 (150 messages / user / day)
Database LUN 6 1 TB thin LUNs (pass-through)
Log LUN 6 115 GB thin LUNs (pass-through)
OS LUN 120 GB (VHD)
Virtual machine CPU 4 vCPU Xeon X7560 2.27 GHz
Virtual machine memory 20 GB
Microsoft SQL Server
SQL Server configuration overview
The Windows and SQL Server 2008 R2 configuration of each virtual machine is as follows:
• Grant "Lock pages in memory" to the SQL startup account
• Use a 64 KB NTFS allocation unit size for the user data device
• Four tempdb data files with an equal initialization size for every SQL instance
• Grant the SQL Server Admin user the "Perform volume maintenance tasks" right, which enables fast file initialization and supports thin provisioning
SQL Server test application
The test environment for SQL Server is based on a TPC-E-like workload. It is composed of a set of transactional operations simulating the activity of a brokerage firm, such as managing customer accounts, executing customer trade orders, and other interactions with financial markets.
SQL Server database design overview
The SQL Server test configuration is based on the following profile:
• Number of SQL users supported: 100,000
• Simulated user workload with one percent concurrency rate and zero think time, consistent with Microsoft testing methodologies
• User data: 1.2 TB
Table 10. SQL Server profile
Profile Value
Total SQL database capacity 1.2 TB
Number of SQL instances 3 (1 per virtual machine)
Number of user databases per instance 1
Number of virtual machines 3
Type of data store Pass-through
SQL virtual machine configuration 4 virtual processors (vCPUs) with 16 GB memory (no over-commitment)
Concurrent users Mixed workloads to simulate hot, warm, and cold databases
Hyper-V and virtual machine design
Once the SQL Server LUN design and disk calculations are complete, the virtual machine and Hyper-V requirements can be calculated. Guidelines for memory configurations may be found at:
http://msdn.microsoft.com/en-us/library/ms143506.aspx
Based on the requirements, the SQL Server needs to support a CPU utilization of less than 80 percent. Based on Microsoft guidelines and server vendor specifications, the CPU and memory requirements were determined for each virtual machine.
Table 11 provides the summary of the virtual machine CPU and memory, and storage
configurations, which has taken into consideration a Hyper-V root server failure. A 120 GB VHD volume for the SQL server OS was provisioned. Pass-through disks were used for the database and log volumes, primarily to accommodate the requirement for a hardware VSS snapshot. Table 11 and Table 12 describe those values.
Table 11. SQL Server 2008 R2 virtual machine resources
Item SQL Server 1 SQL Server 2 SQL Server 3
Number of users supported 20,000 30,000 50,000
Database (pass-through thin LUN) 360 GB 480 GB 840 GB
Log LUN (pass-through thin LUN) 60 GB 80 GB 120 GB
Tempdb/log LUN (pass-through thin LUN) 60 GB 80 GB 120 GB
OS LUN (VHD) 120 GB 120 GB 120 GB
Virtual machine CPU (vCPU Xeon X7560 2.27 GHz) 4 4 4
Virtual machine memory 16 GB 16 GB 16 GB
Table 12. Hyper-V server configuration (shared with SharePoint farm virtual machines)
Item Description
Number of Hyper-V Servers 3
Server type Xeon x7460, 4 sockets, 6 cores
Total memory 192 GB
Total vCPUs 32
Number of HBAs 2 dual-port 4 Gb QLogic 2462
Microsoft SharePoint
SharePoint 2010 farm design considerations
The SharePoint farm was designed for optimized performance, reduced bottlenecks, and ease of manageability.
The SharePoint 2010 farm consists of:
• A web application that was created by using the enterprise portal collaboration template
• Sixteen enterprise document center site collections, created on 16 content databases on two SQL Server hosts
• Two SharePoint crawl servers
• Four SharePoint query servers
• Four SharePoint web front-end servers
• Metalogix StoragePoint installed on one of the web front-end (WFE) servers
Table 13 describes the SharePoint environment profile.
Table 13. Environment profile
SharePoint farm profile Quantity/size/type
SharePoint – Total Data 3 TB
SharePoint – Document Size Range 200 KB-5 MB
SharePoint – Total Site Count 40,000
SharePoint – Size Per Site 20 GB
SharePoint – Total Site Collections Count 16
SharePoint – Content Database Size 200 GB
SharePoint – Total User Count 30-40k
SharePoint – Usage Profile(s) (% browse / % search / % modify) 80% / 10% / 10%
SharePoint – User Concurrency 10%
SharePoint farm design overview
In SharePoint 2010, the crawl servers index the contents of the SharePoint farm, then populate the crawl and property stores on the SQL database server, and add content index (CI) files on the query server.
In this test environment, 16 content databases on two SQL Server instances were populated with over two million documents, which added up to 3 TB of data.
For each crawl server in the SharePoint farm, one 10 GB LUN was created to store the crawl index files.
For each query server in the SharePoint farm, two LUNs of 240 GB each were created to store the CI files (the query index partition and its backup mirror) resulting from the crawl operations.
The search server in SharePoint Server 2010 is architected to provide greater redundancy in a single farm, and to enable scalability in multiple directions. Each component that makes up the query architecture (the query servers, index partitions, and property database) and the crawling architecture (crawl servers, crawl database, and crawl property database) can be scaled-out separately, based on the needs and growth of an organization.
Server role overview
The SharePoint 2010 farm configuration contains the following server roles:
• SharePoint WFE server
• SQL Server 2008 R2 database server
• SharePoint crawl server
• SharePoint query server
SharePoint WFE server
• The WFE servers are deployed on Windows Server 2008 R2 virtual machines
• Metalogix StoragePoint 3.0 is installed on one of the WFE servers
• Internet Information Service (IIS) is configured to provide web content to SharePoint clients
• IIS web garden threads are set at the default value of one for optimal performance and ease of management
• IIS logging is disabled to limit the unnecessary growth of log files and to optimize performance
Database server
• The resource-intensive database servers are deployed on Windows Server 2008 R2 virtual machines
• SQL Server 2008 R2 Enterprise application servers are installed
• Dedicated SQL Server I/O channels to the back-end disks are used
• The database and log LUNs are attached as pass-through LUNs
SharePoint crawl server
• The crawl servers are installed on Windows Server 2008 R2 virtual machines
• SharePoint index crawling roles are deployed to this host
• Dedicated crawl index LUNs are attached as pass-through LUNs to store the CI files from the crawling operation
SharePoint query server
• The query servers are installed on Windows Server 2008 R2 virtual machines
• SharePoint query roles are deployed to this host
• Dedicated query index LUNs are attached as pass-through LUNs to each of the query servers for its query index partition and mirror
• One of the query servers also serves as the Search Admin, and the query index LUN also hosts the Osearch Administration component
SharePoint database design
The sequence of building a SQL Server environment for SharePoint 2010 requires three different subsets of databases in the following order:
• SQL system databases created during installation (master, model, msdb, and tempdb).
• SharePoint databases created by the system from the deployment of SharePoint farms (SharePoint admin, configdb, crawl store, Windows SharePoint Services (WSS), and content database).
• SharePoint content databases, populated with user content documents in the SharePoint farm.
• Content database design:
SharePoint 2010 content databases are recommended to be 200 GB or less for general usage scenarios, and a maximum of 2,000 items per list. Larger database sizes are supported in certain situations.
With over one million documents and the requirement of 3 TB of data, the SharePoint farm is designed to have 16 content databases for 16 sites, and 160 sub-sites.
One 250 GB pass-through LUN is configured for each content database.
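The content database count follows from the 200 GB guideline and the 3 TB data requirement, as the short Python sketch below shows. Treating 1 TB as 1,024 GB is an assumption; with that convention the result matches the 16 content databases used in this design, and the 250 GB LUN leaves headroom above the average database size.

```python
import math

TOTAL_DATA_GB = 3 * 1024      # 3 TB of content, assuming 1 TB = 1,024 GB
MAX_CONTENT_DB_GB = 200       # recommended content database size limit
LUN_SIZE_GB = 250             # one pass-through LUN per content database

content_db_count = math.ceil(TOTAL_DATA_GB / MAX_CONTENT_DB_GB)
average_db_gb = TOTAL_DATA_GB / content_db_count
lun_headroom_gb = LUN_SIZE_GB - average_db_gb

print(f"{content_db_count} content databases, about {average_db_gb:.0f} GB each, "
      f"{lun_headroom_gb:.0f} GB headroom per LUN")
```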
Hyper-V and virtual machine design
Once the SharePoint farm Server LUN design and disk calculations are complete, the virtual machine and Hyper-V requirements can be calculated. Guidelines for memory configurations for SharePoint may be found at:
http://technet.microsoft.com/en-us/library/cc261795(office.12).aspx
Based on the requirements, the SharePoint Server virtual machines need to support a CPU utilization of less than 80 percent. Based on Microsoft guidelines and server vendor specifications, the CPU and memory requirements were determined for each virtual machine role. Table 14 provides the summary of the virtual machine CPU and memory configurations, which has taken into consideration a Hyper-V root server failure.
A 120 GB VHD volume for the SharePoint server OS was provisioned. Pass-through disks were used for the database and log volumes, primarily to accommodate the requirement for a hardware VSS snapshot.
Table 14. Summary of the virtual machine CPU and memory configurations
Virtual machine name Number of servers vCPU Memory (GB)
WFE Server 4 4 4
Query Server 4 4 8
Crawl Server 2 4 8
Admin Server 1 2 4
SQL Server 2 4 16
Total 50 100
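The totals in Table 14 can be reproduced by multiplying the per-virtual-machine values by the number of servers in each role. The sketch below assumes the vCPU and memory columns are per virtual machine, an interpretation that the stated totals of 50 vCPUs and 100 GB support.

```python
# (role, number of servers, vCPUs per VM, memory in GB per VM) from Table 14
ROLES = [
    ("WFE Server", 4, 4, 4),
    ("Query Server", 4, 4, 8),
    ("Crawl Server", 2, 4, 8),
    ("Admin Server", 1, 2, 4),
    ("SQL Server", 2, 4, 16),
]

total_vcpus = sum(count * vcpus for _, count, vcpus, _ in ROLES)
total_memory_gb = sum(count * memory for _, count, _, memory in ROLES)

print(f"Total vCPUs: {total_vcpus}, total memory: {total_memory_gb} GB")
```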
Microsoft SQL Server and SharePoint Hyper-V design
Figure 3 shows the initial location of the virtual machines on the Hyper-V hosts for SQL Server and the SharePoint farm servers:
Figure 3. Hyper-V virtual machine placement
The goal is to ensure that the load is evenly distributed across all of the root servers to prevent any performance bottlenecks. The SQL servers and SharePoint machines were distributed evenly across the four Hyper-V root servers, as shown in Figure 3. Depending on the load, the less active database servers could be hosted by the same Hyper-V host. The design in Figure 3 is an example of where the least-active database server (SQL03) was placed with another, less active database server (SP SQL01) on the same Hyper-V machine.
For the SharePoint farm, the key is to place servers with the same role on different Hyper-V hosts, in order to balance the server resource usage. Also, if there is an outage of one Hyper-V server, the requests will be serviced by other virtual machines in the farm, while the affected virtual machine fails over to another Hyper-V host and resumes its role.
Virtual Provisioning and FAST VP design on the Symmetrix VMAX
This section details how VMAX Virtual Provisioning and FAST VP storage tiering technology were used to provide a well-performing, easy-to-use, and economical design for all of the applications. One of the main advantages of using VMAX Virtual Provisioning is the ability to easily increase the database volumes' storage as the users' mailboxes grow, without any server interruptions. This, combined with FAST VP, which allows for sub-LUN level movement between storage tiers, can deliver excellent performance, tremendous savings in disk cost, power, cooling, and a reduced footprint. It also provides outstanding storage flexibility for a mixed workload Microsoft environment. VMAX Virtual Provisioning and FAST VP work well with Hyper-V and storage technologies such as snapshots, clones, and Symmetrix Remote Data Facility (SRDF). For more information on VMAX Virtual Provisioning, refer to the Symmetrix Virtual Provisioning Feature Specification paper.
When leveraging FAST VP for Microsoft Exchange on a VMAX, the DAG copies were laid out on separate spindles per Microsoft's best practice recommendations. Hence, two sets of Symmetrix VP tiers were designed to host each DAG copy. One set was scaled dynamically to include disks for SQL and SharePoint to support the consolidated workload. The Exchange FAST VP policy setting was identical for both of the DAG copies.