“70-473 Design and Implement Cloud Data Platform Solutions” (Beta Exam)
Exam Preparation document. Throughout the document the URL’s to the original sources can be found. To be shure to have the latest information, read the online document. To make studying easer I tried to pick the information that seems to be the most important or relevant for the exam.
Last update: December 29, 2015 15:26 CET. Used Sources:
Microsoft Virtual Academy
o https://mva.microsoft.com/search/SearchResults.aspx?q=sql%202014 Aure SQL Database Documentation
o https://azure.microsoft.com/en-us/documentation/services/sql-database Technet Library SQL Server
o https://technet.microsoft.com/en-us/library/mt590198(v=sql.1).aspx Channel 9, Azure SQL Database
o https://channel9.msdn.com/Search?term=azure%20sql%20database#ch9Search&lan g-en=en&lang-nl=nl&pubDate=year
Channel 9, SQL Server 2014
o https://channel9.msdn.com/Search?term=sql%20server%202014#ch9Search&lang-nl=nl&lang-en=en
70-473 Design and Implement Cloud Data Platform Solutions
1 Design and implement database solutions for Microsoft SQL Server and SQL Database (20–25%) ... 4
1.1 Design a hybrid SQL Server solution ... 4
1.1.1 Design Geo/DR topology ... 6
1.1.2 Design a data storage architecture ... 10
1.1.3 Design a security architecture ... 12
1.1.4 Design a data load strategy ... 17
1.2 Implement SQL Server on Azure Virtual Machines (VMs) ... 17
1.2.1 Provision SQL Server in an Azure VM ... 17
1.2.2 Configure firewall rules ... 17
1.2.3 Configure and optimize storage ... 17
1.2.4 Migrate an on-premises database to Microsoft Azure ... 28
1.2.5 Configure and optimize VM sizes by workload ... 28
1.3 Design a SQL Database solution ... 28
1.3.1 Design a solution architecture ... 29
1.3.2 Design Geo/DR topology ... 29
1.3.3 Design a security architecture ... 29
1.3.4 Design a data load strategy ... 30
1.3.5 Determine the appropriate service tier ... 31
1.4 Implement SQL Database ... 39
1.4.1 Provision SQL Database ... 39
1.4.2 Configure firewall rules ... 39
1.4.3 Configure active geo-replication ... 39
1.4.4 Migrate an on-premises database to SQL Database ... 39
1.4.5 Configure for scale and performance ... 39
1.5 Design and implement data warehousing on Azure ... 39
1.5.1 Design a data warehousing solution on Azure ... 39
1.5.2 Design a data load strategy and topology ... 39
1.5.3 Configure SQL Data Warehouse ... 40
1.5.4 Migrate an on-premises database to SQL Data Warehouse ... 40
2 Manage database management systems (DBMS) security (25–30%) ... 40
2.1 Design and implement SQL Server Database security ... 40
2.1.1 Configure firewalls ... 40
2.1.2 Manage logins, users, and roles ... 40
2.1.3 Assign permissions ... 40
2.1.4 Configure auditing ... 40
2.1.5 Configure transparent database encryption ... 51
2.2.1 Configure firewalls ... 51
2.2.2 Manage logins, users, and roles ... 52
2.2.3 Assign permissions ... 52
2.2.4 Configure auditing ... 52
2.2.5 Configure row-level security ... 53
2.2.6 Configure data encryption ... 53
2.2.7 Configure data masking ... 54
2.2.8 Configure Always Encrypted ... 55
3 Design for high availability, disaster recovery, and scalability (25–30%) ... 56
3.1 Design and implement high availability solutions ... 56
3.1.1 Design a high availability solution topology ... 56
3.1.2 Implement high availability solutions between on-premises and Azure ... 57
3.1.3 Design cloud-based backup solutions ... 59
3.1.4 Implement backup and recovery strategies ... 61
3.2 Design and implement scalable solutions ... 64
3.2.1 Design a scale-out solution ... 64
3.2.2 Implement multi-master scenarios with database replication ... 64
3.2.3 Implement elastic scale for SQL Database ... 64
3.3 Design and implement SQL Database data recovery... 65
3.3.1 Design a backup solution for SQL Database ... 65
3.3.2 Implement self-service restore ... 68
3.3.3 Copy and export databases ... 68
4 Monitor and manage database implementations on Azure (25–30%) ... 68
4.1 Monitor and troubleshoot SQL Server VMs on Azure ... 68
4.1.1 Monitor database and instance activity ... 68
4.1.2 Monitor using dynamic management views (DMVs) and dynamic management functions (DMFs) 70 4.1.3 Monitor performance and scalability ... 70
4.2 Monitor and troubleshoot SQL Database ... 70
4.2.1 Monitor and troubleshoot SQL Database ... 70
4.2.2 Monitor database activity ... 71
4.2.3 Monitor using DMVs and DMFs ... 72
4.2.4 Monitor performance and scalability. ... 72
4.3 Automate and manage database implementations on Azure ... 72
4.3.1 Manage SQL Server in Azure VMs with PowerShell ... 72
4.3.2 Manage Azure SQL Database with PowerShell ... 73
1 Design and implement database solutions for Microsoft SQL
Server and SQL Database (20–25%)
1.1 Design a hybrid SQL Server solution
MVA course: Platform for Hybrid Cloud with SQL Server 2014 Jump Start Extend on-premises AlwaysOn Availability Groups to Azure:
https://azure.microsoft.com/en-us/documentation/articles/virtual-machines-sql-server-extend-on-premises-alwayson-availability-groups
Selecting a SQL Server option in Azure: Azure SQL Database (PaaS) or SQL Server on Azure VMs (IaaS) https://azure.microsoft.com/en-us/documentation/articles/data-management-azure-sql-database-and-sql-server-iaas
Learn how each option fits into the Microsoft data platform and get help matching the right option to your business requirements. Whether you prioritize cost savings or minimal administration ahead of everything else, this article can help you decide which approach delivers against the business requirements you care about most.
When designing an application, four basic options are available for hosting the SQL Server part of the application:
SQL Server on non-virtualized physical machines
SQL Server in on-premises virtualized machines (private cloud) SQL Server in Azure Virtual Machine (public cloud)
Azure SQL Database (public cloud)
The following table summarizes the main characteristics of SQL Database and SQL Server on Azure VMs:
SQL DATABASE SQL SERVER IN AZURE VM
Best for
New cloud-designed applications that have time
Existing applications that require fast migration to the cloud with minimal changes.
constraints in development and marketing.
Applications that need built-in high availability, disaster recovery, and upgrade mechanisms. Teams that do not want to
manage the underlying operating system and configuration settings. Applications using scale-out
patterns.
Databases of up to 1 TB in size.
Building Software-as-a-Service (SaaS) applications.
SQL Server applications that require access to on-premises resources (such as Active Directory) from Azure via a secure tunnel.
If you need a customized IT environment with full administrative rights.
Rapid development and test scenarios when you do not want to buy on-premises non-production SQL Server hardware.
Disaster recovery for on-premises SQL Server applications using [backup to Azure
Storage](http://msdn.microsoft.com/library/jj919 148.aspx) or [AlwaysOn replicas with Azure VMs](../virtual-machines/virtual-machines-sql- server-high-availability-and-disaster-recovery-solutions.md).
Large databases that are bigger than 1 TB in size.
Resources
You do not want to employ IT resources for support and maintenance of the underlying infrastructure. You want to focus on the
application layer.
You have IT resources for support and maintenance.
Total cost of
ownership
Eliminates hardware costs. Reduces administrative costs.
Eliminates hardware costs.
Business continuity
In addition to built-in fault tolerance infrastructure capabilities, Azure SQL Database provides features, such as Point in Time Restore, Geo-Restore, and Geo-Replication to increase business continuity. For more information, see [SQL Database business
continuity overview](sql- database-business-continuity.md).
SQL Server on Azure VMs lets you to set up a high availability and disaster recovery solution for your database’s specific needs. Therefore, you can have a system that is highly optimized for your application. You can test and run failovers by yourself when needed. For more information, see [High Availability and Disaster Recovery for SQL Server on Azure Virtual Machines]((../virtual- machines/virtual-machines-sql-server-high-availability-and-disaster-recovery-solutions.md).
Hybrid cloud
Your on-premises
application can access data in Azure SQL Database.
With SQL Server on Azure VMs, you can have applications that run partly in the cloud and partly on-premises. For example, you can extend your on-premises network and Active Directory Domain to the cloud via [Azure Virtual
Network](../virtual-network/virtual-networks-overview.md). In addition, you can store on-premises data files in Azure Storage using [SQL Server Data Files in Azure]
(http://msdn.microsoft.com/library/dn385720.as px). For more information, see [Introduction to SQL Server 2014 Hybrid
Cloud](http://msdn.microsoft.com/library/dn606 154.aspx).
Supports disaster recovery for on-premises SQL Server applications using [SQL Server Backup and Restore with Azure Blob Storage]
(http://msdn.microsoft.com/library/jj919148.asp x) or [AlwaysOn replicas in Azure VMs](../virtual- machines/virtual-machines-sql-server-high-availability-and-disaster-recovery-solutions.md).
Getting Started with Azure SQL Data Sync:
https://azure.microsoft.com/en-us/documentation/articles/sql-database-get-started-sql-data-sync In this tutorial, you create a hybrid (SQL Server and SQL Database instances) sync group fully configured and synchronizing on the schedule you set.
1.1.1 Design Geo/DR topology
Designing cloud applications for business continuity using Geo-Replication
https://azure.microsoft.com/en-us/documentation/articles/sql-database-designing-cloud-solutions-for-disaster-recovery
Overview
The Active Geo-Replication feature implements a mechanism to provide database redundancy within the same Microsoft Azure region or in different regions (geo-redundancy). Active Geo-Replication asynchronously replicates committed transactions from a database to up to four copies of the database on different servers. The original database becomes the primary database of the
continuous copy. Each continuous copy is referred to as an online secondary database. The primary database asynchronously replicates committed transactions to each of the online secondary databases. While at any given point, the online secondary data might be slightly behind the primary database, the online secondary data is guaranteed to always be transactionally consistent with changes committed to the primary database. Active Geo-Replication supports up to four online secondaries, or up to three online secondaries and one offline secondary.
One of the primary benefits of Active Geo-Replication is that it provides a database-level disaster recovery solution. Using Active Geo-Replication, you can configure a user database in the Premium service tier to replicate transactions to databases on different Microsoft Azure SQL Database servers within the same or different regions. Cross-region redundancy enables applications to recover from a
permanent loss of a datacenter caused by natural disasters, catastrophic human errors, or malicious acts.
Another key benefit is that the online secondary databases are readable. Therefore, an online secondary can act as a load balancer for read workloads such as reporting. While you can create an online secondary in a different region for disaster recovery, you could also have an online secondary in the same region on a different server. Both online secondary databases can be used to balance read only workloads serving clients distributed across several regions.
Other scenarios where Active Geo-Replication can be used include:
Database migration: You can use Active Geo-Replication to migrate a database from one server to another online with minimum downtime.
Application upgrades: You can use the online secondary as a fail back option.
To achieve real business continuity, adding redundancy between datacenters to relational storage is only part of the solution. Recovering an application (service) end-to-end after a disastrous failure requires recovery of all components that constitute the service and any dependent services. Examples of these components include the client software (for example, a browser with a custom JavaScript), web front ends, storage, and DNS. It is critical that all components are resilient to the same failures and become available within the recovery time objective (RTO) of your application. Therefore, you need to identify all dependent services and understand the guarantees and
capabilities they provide. Then, you must take adequate steps to ensure that your service functions during the failover of the services on which it depends. For more information about designing solutions for disaster recovery, see Designing Cloud Solutions for Disaster Recovery Using Active Geo-Replication.
Active Geo-Replication Capabilities
The Active Geo-Replication feature provides the following essential capabilities:
Automatic Asynchronous Replication: After an online secondary database has been seeded, updates to the primary database are asynchronously copied to the online secondary database automatically. This means that transactions are committed on the primary database before they are copied to the online secondary database. However, after seeding, the online secondary database is transactionally consistent at any given point in time.
NOTE:
Asynchronous replication accommodates the latency that typifies wide-area networks by which remote datacenters are connected.
Multiple online secondary databases: Two or more online secondary databases increase redundancy and protection for the primary database and application. If multiple online secondary databases exist, the application will remain protected even if one of the online secondary databases fails. If there is only one online secondary database, and it fails, the application is exposed to higher risk until a new online secondary database is created.
Readable online secondary databases: An application can access an online secondary database for read-only operations using the same security principals used for accessing the primary database. Continuous copy operations on the online secondary database take precedence over application access. Also, if the queries on the online secondary database cause prolonged table locking, transactions could eventually fail on the primary database.
User-controlled termination for failover: Before you can failover an application to an online secondary database, the continuous copy relationship with the primary database must be terminated. Termination of the continuous copy relationship requires an explicit action by the application or an administrative script or manually via the portal. After termination, the online secondary database becomes a stand-alone database. It becomes a read-write database unless the primary database was a read-only database. Two forms of Termination of a Continuous Copy Relationship are described later in this topic.
NOTE:
Active Geo-Replication is only supported for databases in the Premium service tier. This applies for both the primary and the online secondary databases. The online secondary must be configured to have the same or larger performance level as the primary. Changes to performance levels to the primary database are not automatically replicated to the secondaries. Any upgrades should be done on the secondary databases first and finally on the primary. For more information about changing performance levels, see Changing Performance Levels. There are two main reasons the online secondary should be at least the same size as the primary. The secondary must have enough capacity to process the replicated transactions at the same speed as the primary. If the secondary does not have, at minimum, the same capacity to process the incoming transactions, it could lag behind and eventually impact the availability of the primary. If the secondary does not have the same capacity as the primary, the failover may degrade the application’s performance and availability.
Continuous Copy Relationship Concepts
Local data redundancy and operational recovery are standard features for Azure SQL Database. Each database possesses one primary and two local replica databases that reside in the same datacenter, providing high availability within that datacenter. This means that the Active Geo-Replication databases also have redundant replicas. Both the primary and online secondary databases have two secondary replicas. However, the primary replica for the secondary database is directly updated by the continuous copy mechanism and cannot accept any application-initiated updates. The following figure illustrates how Active Geo-Replication extends database redundancy across two Azure regions. The region that hosts the primary database is known as the primary region. The region that hosts the online secondary database is known as the secondary region. In this figure, North Europe is the primary region. West Europe is the secondary region.
If the primary database becomes unavailable, terminating the continuous copy relationship for a given online secondary database makes the online secondary database a stand-alone database. The online secondary database inherits the read-only/read-write mode of the primary database which is unchanged by the termination. For example, if the primary database is a read-only database, after termination, the online secondary database becomes a read-only database. At this point, the application can fail over and continue using the online secondary database. To provide resiliency in the event of a catastrophic failure of the datacenter or a prolonged outage in the primary region, at least one online secondary database needs to reside in a different region.
Creating a Continuous Copy
You can only create a continuous copy of an existing database. Creating a continuous copy of an existing database is useful for adding geo-redundancy. A continuous copy can also be created to copy an existing database to a different Azure SQL Database server. Once created the secondary database is populated with the data copied from the primary database. This process is known as seeding. After seeding is complete each new transaction is replicated after it commits on the primary.
For information about how to create a continuous copy of an existing database, see How to enable Geo-Replication.
Preventing the Loss of Critical Data
Due to the high latency of wide area networks, continuous copy uses an asynchronous replication mechanism. This makes some data loss unavoidable if a failure occurs. However, some applications may require no data loss. To protect these critical updates, an application developer can call the sp_wait_for_database_copy_sync system procedure immediately after committing the transaction. Calling sp_wait_for_database_copy_sync blocks the calling thread until the last committed
transaction has been replicated to the online secondary database. The procedure will wait until all queued transactions have been acknowledged by the online secondary database.
sp_wait_for_database_copy_sync is scoped to a specific continuous copy link. Any user with the connection rights to the primary database can call this procedure.
NOTE:
The delay caused by a sp_wait_for_database_copy_sync procedure call might be significant. The delay depends on the length of the queue and on the available bandwidth. Avoid calling this procedure unless absolutely necessary.
Termination of a Continuous Copy Relationship
The continuous copy relationship can be terminated at any time. Terminating a continuous copy relationship does not remove the secondary database. There are two methods of terminating a continuous copy relationship:
Planned Termination is useful for planned operations where data loss is unacceptable. A planned termination can only be performed on the primary database, after the online secondary database has been seeded. In a planned termination, all transactions committed on the primary database are replicated to the online secondary database first, and then the continuous copy relationship is terminated. This prevents loss of data on the secondary database.
Unplanned (Forced) Termination is intended for responding to the loss of either the primary
the primary database or the secondary database. Every forced termination results in the irreversible loss of the replication relationship between the primary database and the associated online
secondary database. Additionally, forced termination causes the loss of any transactions that have not been replicated from the primary database. A forced termination terminates the continuous copy relationship immediately. In-flight transactions are not replicated to the online secondary database. Therefore, a forced termination can result in an irreversible loss of any transactions that have not been replicated from the primary database.
NOTE:
If the primary database has only one continuous copy relationship, after termination, updates to the primary database will no longer be protected.
For more information about how to terminate a continuous copy relationship, see Recover an Azure SQL Database from an outage.
1.1.2 Design a data storage architecture
SQL Server Database Files and Filegroups:
https://msdn.microsoft.com/en-us/library/ms189563.aspx Database Files:
File Description
Primary The primary data file contains the startup information for the database and points to the other files in the database. User data and objects can be stored in this file or in secondary data files. Every database has one primary data file. The recommended file name extension for primary data files is .mdf.
Secondary Secondary data files are optional, are user-defined, and store user data. Secondary files can be used to spread data across multiple disks by putting each file on a different disk drive. Additionally, if a database exceeds the maximum size for a single Windows file, you can use secondary data files so the database can continue to grow.
The recommended file name extension for secondary data files is .ndf. Transaction
Log
The transaction log files hold the log information that is used to recover the database. There must be at least one log file for each database. The recommended file name extension for transaction logs is .ldf.
File Groups:
Filegroup Description
Primary The filegroup that contains the primary file. All system tables are allocated to the primary filegroup.
User-defined
Any filegroup that is specifically created by the user when the user first creates or later modifies the database.
SQL Server Data Files in Microsoft Azure
https://msdn.microsoft.com/en-US/library/dn385720.aspx
SQL Server Data Files in Microsoft Azure enables native support for SQL Server database files stored as Microsoft Azure Blobs. It allows you to create a database in SQL Server running in on-premises or in a virtual machine in Microsoft Azure with a dedicated storage location for your data in Microsoft
Azure Blob Storage. This enhancement especially simplifies to move databases between machines by using detach and attach operations. In addition, it provides an alternative storage location for your database backup files by allowing you to restore from or to Microsoft Azure Storage. Therefore, it enables several hybrid solutions by providing several benefits for data virtualization, data movement, security and availability, and any easy low costs and maintenance for high-availability and elastic scaling.
This topic introduces concepts and considerations that are central to storing SQL Server data files in Microsoft Azure Storage Service.
For a practical hands-on experience on how to use this new feature, see Tutorial: Using the Microsoft Azure Blob storage service with SQL Server 2016 databases .
The following diagram demonstrates that this enhancement enables you to store SQL Server database files as Microsoft Azure blobs in Microsoft Azure Storage regardless of where your server resides.
Benefits of using SQL Server Data Files in Microsoft Azure
Easy and fast migration benefits: This feature simplifies the migration process by moving one database at a time between machines in on-premises as well as between on-premises and cloud environments without any application changes. Therefore, it supports an
incremental migration while maintaining your existing on-premises infrastructure in place. In addition, having access to a centralized data storage simplifies the application logic when an application needs to run in multiple locations in an on-premises environment. In some cases, you may need to rapidly setup computer centers in geographically dispersed locations, which gather data from many different sources. By using this new enhancement, instead of moving data from one location to another, you can store many databases as Microsoft Azure blobs, and then run Transact-SQL scripts to create databases on the local machines or virtual machines.
Cost and limitless storage benefits: This feature enables you to have limitless off-site storage in Microsoft Azure while leveraging on-premises compute resources. When you use
the overhead of hardware management. If you lose a computation node on-premises, you can set up a new one without any data movement.
High availability and disaster recovery benefits: Using SQL Server Data Files in Microsoft Azure feature might simplify the high availability and disaster recovery solutions. For example, if a virtual machine in Microsoft Azure or an instance of SQL Server crashes, you can re-create your databases in a new machine by just re-establishing links to Microsoft Azure Blobs.
Security benefits: This new enhancement allows you to separate a compute instance from a storage instance. You can have a fully encrypted database with decryption only occurring on compute instance but not in a storage instance. In other words, using this new enhancement, you can encrypt all data in public cloud using Transparent Data Encryption (TDE) certificates, which are physically separated from the data. The TDE keys can be stored in the master database, which is stored locally in your physically secure on-premises machine and backed up locally. You can use these local keys to encrypt the data, which resides in Microsoft Azure Storage. If your cloud storage account credentials are stolen, your data still stays secure as the TDE certificates always reside in on-premises.
Snapshot backup: This feature enables you to use Azure snapshots to provide nearly instantaneous backups and quicker restores for database files stored using the Azure Blob storage service. This capability enables you to simplify your backup and restore policies. For more information, see File-Snapshot Backups for Database Files in Azure.
More info in the article SQL Server Data Files in Microsoft Azure ……. https://msdn.microsoft.com/en-US/library/dn385720.aspx
1.1.3 Design a security architecture
Channel 9, Ignite 2015 video: Overview and Roadmap for Microsoft SQL Server Security Encryption
o Always Encrypted
o TDE for SQL DB, TDE Perf (Intel NIS HW acceleration) o Enhancements to Crypto
o CLE for SQL DB (Cell Level Encryption) Auditing
o Enhancements to SQL Audit
Reporting and Analysis (also with power BI) Audit outcome of transactions
Secure App Development o Role level security o Dynamic Data Masking Always encrypted
https://msdn.microsoft.com/en-us/library/mt163865.aspx
Allows customers to securely store sensitive data outside of their trust boundary. Data remains protected from high-privileged, unauthorized users.
Client driven: Client side encryption and decryption.
Always Encrypted Typical Scenarios:
Client and Data On-Premises
A customer has a client application and SQL Server both running on-premises, at their business location. The customer wants to hire an external vendor to administer SQL Server. In order to protect sensitive data stored in SQL Server, the customer uses Always Encrypted to ensure the separation of duties between database administrators and application administrators. The customer stores plaintext values of Always Encrypted keys in a trusted key store which the client application can access. SQL Server administrators have no access to the keys and, therefore, are unable to decrypt sensitive data stored in SQL Server.
Client On-Premises with Data in Azure
A customer has an on-premises client application at their business location. The application operates on sensitive data stored in a database hosted in Azure (SQL Database or SQL Server running in a virtual machine on Microsoft Azure). The customer uses Always Encrypted and stores Always
Encrypted keys in a trusted key store hosted on-premises, to ensure Microsoft cloud administrators have no access to sensitive data.
Client and Data in Azure
A customer has a client application, hosted in Microsoft Azure (e.g. in a worker role or a web role), which operates on sensitive data stored also stored in Microsoft Azure. The customer uses Always Encrypted to reduce security attack surface area (the data is always encrypted in the database and on the machine hosting the database).
Always Encrypted supports two types of encryption: randomized encryption and deterministic encryption.
Deterministic encryption uses a method which always generates the same encrypted value for any given plain text value. Using deterministic encryption allows grouping, filtering by equality, and joining tables based on encrypted values, but can also allow unauthorized users to guess information about encrypted values by examining patterns in the encrypted column. This weakness is increased when there is a small set of possible encrypted values, such as True/False, or North/South/East/West region. Deterministic encryption must use a column collation with a binary2 sort order for character columns.
Randomized encryption uses a method that encrypts data in a less predictable manner. Randomized encryption is more secure, but prevents equality searches, grouping, indexing, and joining on encrypted columns.
Use deterministic encryption for columns that will be used as search or grouping parameters, for example a government ID number. Use randomized encryption, for data such as confidential investigation comments, which are not grouped with other records and are not used to join tables. Row-Level Security (RLS)
https://msdn.microsoft.com/en-us/library/dn765131.aspx
Store data intented for many customers in a single database/table while at the same time restricting row-level read & write access based on users’ execution context.
RLS supports two types of security predicates.
Filter predicates silently filter the rows available to read operations (SELECT, UPDATE, and DELETE).
Block predicates explicitly block write operations (AFTER INSERT, AFTER UPDATE, BEFORE UPDATE, BEFORE DELETE) that violate the predicate.
Example use cases:
A hospital can create a security policy that allows nurses to view data rows for their own patients only.
A bank can create a policy to restrict access to rows of financial data based on the employee's business division, or based on the employee's role within the company.
A multi-tenant application can create a policy to enforce a logical separation of each tenant's data rows from every other tenant's rows. Efficiencies are achieved by the storage of data for many tenants in a single table. Of course, each tenant can see only its data rows.
2 – 40% perf overhead depending of load type (simple OLTP or complex heavy duty queries/analysis), most of time lower side of overhead.
Dynamic Data Masking
1.1.4 Design a data load strategy
SQL Server Customer Advisory Team: Loading data to SQL Azure the fast way
1.2 Implement SQL Server on Azure Virtual Machines (VMs)
Different type of VM’s to translate on prem server specs to Azure VM specs. VM’s with premium storage
1.2.1 Provision SQL Server in an Azure VM
https://azure.microsoft.com/en-us/documentation/articles/virtual-machines-sql-server-infrastructure-services
1.2.2 Configure firewall rules
For SQL Server TCP Port 1433 must be opened in the Windows Firewall.
1.2.3 Configure and optimize storage
White Paper: Performance Guidance for SQL Server in Windows Azure Virtual Machines https://azure.microsoft.com/en-us/documentation/articles/virtual-machines-sql-server-performance-best-practices
Azure virtual machine disks and cache settings
Azure Virtual Machines provide three types of disks: operating system (OS) disk, temporary disk, and data disks. For a description of each disk type, see section Azure Infrastructure services fundamentals in this article.
Operating system disk vs. data disk
When placing your data and log files you should consider disk cache settings in addition to size limits. For a description of cache settings, see section Azure Infrastructure services fundamentals in this article.
While “Read Write” cache (default setting) for the operating system disk helps improve the overall operating system performance, boot times and reducing the read latency for the IO patterns that OS usually generates, we recommend that you do not use OS disk for hosting system and user database files. Instead, we recommend that you use data disks. When the workload demands a high rate of random I/Os (such as a SQL Server OLTP workload) and throughput is important to you, the general guideline is to keep the cache set to the default value of “None” (disabled). Because Azure storage is capable of more IOPS than a direct attached storage disk, this setting causes the physical host local disks to be bypassed, therefore providing the highest I/O rate.
Temporary disk
Unlike Azure disks (operating system and data disks) which are essentially VHDs stored as page blobs in Azure Storage, the temporary disk (labeled as D:) is not persistent and is not implemented using Azure Storage. It is reserved by the operating system for the page file and its performance is not guaranteed to be predictable. Any data stored on it may be lost after your virtual machine is restarted or resized. Hence, we do not recommend the D: drive for storing any user or system database files, including tempdb.
Data disks performance options and considerations
This section discusses the best practices and recommendations on data disk performance options based on testing done by Microsoft. You should be familiar with how SQL Server I/O operations work in order to interpret the test results reported in this section. For more information, see Pages and Extents Architecture.
It is important to note that the results we provide in this section were achieved without SQL Server High Availability and Disaster Recovery Solutions enabled (such as, AlwaysOn Availability Groups, database mirroring or log shipping). We recommend that you deploy one of these features to maintain multiple redundant copies of your databases across at least two virtual machines in an availability set in order to be covered by the Azure Cloud Services, Virtual Machines, and Virtual Network Service Level Agreement. Enabling any of these features affects performance, so you should consider incorporating one of them in your own performance testing to get more accurate results.
As a general rule, we recommend that you attach maximum number of disks allowed by the VM size (such as, 16 data disks for an A7 VM) for throughput sensitive applications. While latency may not necessarily improve by adding more data disks when your workload is within the maximum IOPS limit, the additional IOPS and bandwidth that you get from the attached additional disks can help to avoid reaching the single disk 500 IOPS limit. Note that this might trigger throttling events that might increase disk response times and disk latency.
Single data disk configuration
In our performance tests, we’ve executed several SQL Server I/O measurements to understand data disk response characteristics with respect to the typical I/O patterns generated by SQL Server based on different kind of workloads. The results for a single disk configuration on an A7 VM instance are summarized here:
Random I/O (8 KB Pages)
Sequential I/O (64 KB Extents)
IOPS 500 500 500 300
Bandwidth 4 MB/s 4 MB/s 30 MB/s 20 MB/s
Note: Because Azure Infrastructure Services is a multi-tenant environment, performance results may vary. You should consider these results as an indication of what you can achieve, but not a guarantee. We suggest you repeat these tests and measurements based on your specific workload.
Multiple data disk configuration
If your workload exceeds or is close to the I/O performance numbers mentioned in the previous section, we recommend that you add multiple disks (depending on your virtual machine size) and stripe multiple disks in volumes. This configuration gives you the ability to create volumes with specific throughput and bandwidth, based on your data and log performance needs by combining multiple data disks together.
Adding multiple data disks to Azure virtual machine
After you create a virtual machine in Azure, you can attach a data disk to it using either the Azure Management Portal or the Add-AzureDataDisk Azure PowerShell cmdlet. Both techniques allow you to select an existing data disk from a storage account, or create a new blank data disk.
If you choose to create a new blank data disk at the Management Portal, you can choose the storage account that your virtual machine was created in but not a different storage account.
To place your existing data disk (.vhd file) into a specific storage account, you need to use the Azure PowerShell cmdlets. The following example demonstrates how to update a virtual machine using the Get-AzureVM and the Add-AzureDataDisk cmdlets. The Get-AzureVM cmdlet retrieves information on a specific virtual machine. The Add-AzureDataDisk cmdlet creates a new data disk with specified size and label in a previously created Storage Account.
Get-AzureVM "CloudServiceName" -Name "VMNAme" | Add-AzureDataDisk -CreateNew -DiskSizeInGB 100
-MediaLocation ` "https://<storageaccount>.blob.core.windows.net/vmdisk/Disk1.vhd" -DiskLabel "disk1" -LUN 1 | Update-AzureVM
To create a new storage account, use the New-AzureStorageAccount cmdlet as follows:
New-AzureStorageAccount -StorageAccountName "StorageAccountX" -Label "StorageAccountX" -Location "North Central US"
For more information about Azure PowerShell cmdlets, see Azure PowerShell on MSDN and Azure command line tools.
Disk striping options for Azure Virtual Machines
For Azure VMs running on Windows Server 2008 R2 and previous releases, the only striping technology available is striped volumes for dynamic disks. You can use this option to stripe multiple data disks into volumes that provide more throughput and bandwidth than what a single disk can provide.
Starting with Windows Server 2012, Storage Pools are introduced and operating system software RAID capabilities are deprecated. Storage Pools enable you to virtualize storage by grouping industry-standard disks into “pools”, and then create virtual disks called Storage Spaces from the available capacity in the storage pools. You can then configure these virtual disks to provide striping capabilities across all disks in the pool, combining good performance characteristics. In addition, it enables you to add and remove disk space based on your needs.
During our tests, after adding a number of data disks (4, 8 and 16) as shown in the previous section, we created a new storage pool by using the following Windows PowerShell command:
New-StoragePool –FriendlyName StoragePool1 –StorageSubsystemFriendlyName "Storage Spaces*" – PhysicalDisks (Get-PhysicalDisk –CanPool $True)
Next, we created a virtual disk on top of the new storage pool and specified resiliency setting and virtual disk size.
$disks = Get-StoragePool –FriendlyName StoragePool1 -IsPrimordial $false | Get-PhysicalDisk New-VirtualDisk –FriendlyName VirtualDisk1 -ResiliencySettingName Simple –NumberOfColumns $disks.Count –UseMaximumSize –Interleave 256KB
Important Note: For performance, it is very important that the –NumberOfColumns parameter is set to the number of disks utilized to create the underlying Storage Pool. Otherwise, IO requests cannot be evenly distributed across all data disks in the pool and you will get suboptimal performance. The –Interleave parameter enables you to specify the number of bytes written in each underlying data disk in a virtual disk. We recommend that you use 256 KB for all workloads.
Lastly, we created and formatted the volume to make it usable to the operating system and applications by using the following Windows PowerShell commands:
Get-VirtualDisk –FriendlyName VirtualDisk1 | Get-Disk | Initialize-Disk –Passthru | New-Partition –AssignDriveLetter –UseMaximumSize | Format-Volume –AllocationUnitSize 64K
Once the volume created, it is possible to dynamically increase the disk capacity by attaching new data disks. To achieve optimal capacity utilization, consider the number of columns your storage spaces have and add disks in multiples of that number. See Windows Server Storage spaces Frequently Asked Questions for more information.
Using Storage Pools instead of traditional Windows operating system striping in dynamic disks brings several advantages in terms of performance and manageability. We recommend that you use Storage Pools for disk striping in Azure Virtual Machines.
During our internal testing, we have implemented the following scenarios with different number of disks as well as disk volume configurations. We tested the following scenarios with configurations of 4, 8 and 16 data disks respectively, and we observed increased IOPS for each data disk added as expected:
We arranged multiple data disks as simple volumes and leveraged the Database Files and Filegroups feature of SQL Server to stripe database files across multiple volumes.
We used Windows Server Storage Pools to create larger volumes, which contains multiple data disks, and we placed database and log files inside these volumes.
It’s important to notice that using multiple data disks provides performance benefits but it creates more management overhead. In addition, partial unavailability of one of the striped disks can result in unavailability of a database. Therefore, for such configurations, we recommend that you consider enhancing the availability of your databases using high availability and disaster recovery capabilities of SQL Server as described in High Availability and Disaster Recovery for SQL Server in Azure Virtual Machines.
The following tables summarize the results of tests that we performed using multiple data disks configurations at Microsoft.
Aggregated throughput and bandwidth across 4 data disks Random I/O
(8 KB Pages)
Sequential I/O (64 KB Extents)
Reads Writes Reads Writes
IOPS 2000 2000 1600 1200
Bandwidth 16 MB/s 16 MB/s 100 MB/s 75 MB/s
Aggregated throughput and bandwidth across 8 data disks Random I/O
(8 KB Pages)
Sequential I/O (64 KB Extents)
Reads Writes Reads Writes
IOPS 4000 4000 2400 2400
Bandwidth 30 MB/s 30 MB/s 150 MB/s 150 MB/s
Aggregated throughput and bandwidth across 16 data disks Random I/O
(8 KB Pages)
Sequential I/O (64 KB Extents)
Reads Writes Reads Writes
IOPS 8000 8000 2400 4000
Bandwidth 60 MB/s 60 MB/s 150 MB/s 250 MB/s
Note: Because Azure Infrastructure Services is a shared, multi-tenant environment, performance results may vary. You should consider these results as an indication of what you can achieve, but not a guarantee. We recommend that you repeat these tests and measurements based on your specific workload.
By using the newly introduced Intel-based A8 and A9 VM sizes, we repeated our IO performance tests and noticed a significant increase in bandwidth and throughput for larger sequential IO requests. If you use Intel-based A8 and A9 VM sizes, you can get a performance increase for 64 KB (and bigger) read and write operations. If your workload is IO intensive, these new VM sizes (A8 and A9) can help in achieving more linear scalability compare to smaller VM sizes, but always within the 500 IOPs per disk boundaries. For more information, see About the A8 and A9 Compute Intensive Instances. Based on our tests, we have made the following observations about the Azure Virtual Machine environment:
Spreading your I/O workload across a number of data disks benefits smaller random operations (more common in OLTP scenarios) where IOPS and bandwidth scale in a nearly linear fashion.
As the I/O block size increases, for read operations adding more data disks does not result in higher IOPS or bandwidth. This means that if your workload is read intensive with more analytical queries, adding more disks will not necessarily help.
For write intensive workload, adding more data disks can increase performance in a nearly linear fashion. This means that you can benefit from placing each transaction log for multiple databases on a separate data disk.
For large sequential I/O block sizes (such as, 64 KB or greater), writes generally perform better than reads.
A8 and A9 VM sizes provide increased throughput for IO sensitive workloads.
For SQL Server Load D and DS VM’s can also be very interesting. Especially DS series where you have Premium Storage (SSD) available for the data disks.
Placement of database files
Depending on how you configure your storage, you should place and the data and log files for user and system databases accordingly to achieve your performance goals. This section provides guidance on how you should place database files when using SQL Server in Azure Virtual Machines:
Option 1: You can create a single striped volume using Windows Server Storage Spaces leveraging multiple data disks, and place all database and log files in this volume. In this scenario, all your database workload shares aggregated I/O throughput and bandwidth provided by these multiple disks, and you simplify the placement of database files. Individual database workloads are load balanced across all available disks, and you do not need to worry about single database spikes or workload distribution. You can find the graphical representation of this configuration below:
Option 2: You can create multiple striped volumes, each composed by the number of data disks required to achieve specific I/O performance, and do a careful placement of user and system database files on these volumes accordingly. You may have one important production database with a write-intensive workload that has high priority, and you may want to maximize the database and log file throughput by segregating them on two separate 4 disk volumes (each volume providing around 2000 IOPs and 100 MB/sec). For example, use:
4-disks volume for hosting TempDB data and log files. 4-disks volume for hosting other minor databases.
This option can give you precise file placement by optimizing available IO performance. You can find the graphical representation of this configuration below:
You can still create single disk volumes and leverage SQL Server files and filegroups placement for your databases. While this can still offer some benefits in terms of flexible storage layout organization, it introduces additional complexity and also limits single file (data or log) IO performance to a value that a single Azure data disk can provide such as 500 IOPs and 60 MB/sec.
Although Azure data disks have different behaviors than traditional rotating spindles (,in which competing random and sequential operations on the same disks can impact performance), we still recommend that you keep data and log files in different paths to achieve dedicated IOPs and bandwidth for them.
To help understand your IO requirements and performance while running your SQL Server workloads on Azure Virtual Machines, you need to analyze the following three tools and combine the results carefully:
- SQL Server IO statistics: They reflect the database management system view of the IO subsystem.
- Windows Server Logical Disk Performance Counters: They show how the operating system performs on IOs.
- Azure Storage Analytics: Azure hosts data disks’ VHD files in Azure Storage. You can turn on logging and metrics for the storage account that hosts your data disks, and get useful information such as the number of successful and failed requests, timeout, throttling, network, authorization, and other errors. You can configure and get data from these metrics on the Azure Portal, or via PowerShell, REST APIs, and .NET Storage Client library.
By leveraging all these information, you can understand:
If your IO related stalls or wait types in SQL Server (manifesting as increased disk response times in OS Perf Counters) are related to throttling events happening in Azure Storage. And, If rebalancing your data and log files across different volumes (and underlying disks) can help
maintaining throughput and bandwidth between storage performance limits. TempDB
As mentioned in section Azure virtual machine disks and cache settings, we recommend that you place tempDB on data disks instead of the temporary disk (D:). Following are the three primary reasons for this recommendation based on our internal testing with SQL Server test workloads.
Performance variance: In our testing, we noticed that you can get the same level of performance you get on D:, if not more IOPS, from a single data disk. However, the performance of D: drive is not guaranteed to be as predictable as the operating system or data disk. This is because the size of the D: drive and the performance you get from it depends on the size of the virtual machine you use, and the underlying physical disks shared between all VMs hosted by the same server.
Configuration upon VM downtime situation: If the virtual machine gets shutdown down (due to planned or unplanned reasons), in order for SQL Server to recreate the tempDB under the D: drive, the service account under which SQL Server service is started needs to have local administrator privileges. In addition, the common practice with on-premises SQL deployments is to keep database and log files (including tempDB) in a separate folder, in which case the folder needs to be created before SQL Server starts. For most customers, this extra re-configuration overhead is not worth the return.
Performance bottleneck: If you place tempdb on D: drive and your application workloads use tempDB heavily, this can cause performance bottleneck because the D: drive can introduce constraints in terms of IOPS throughput. Instead, place tempDB on data disks to gain more flexibility. For more information on configuration best practices for optimizing tempdb, see Compilation of SQL Server TempDB IO Best Practices.
We strongly recommend that you perform your own workload testing before implementing a desired SQL Server file layout strategy.
Effects of warm-up on data disks
With Azure disks, we have observed a “warm-up effect” that can result in a reduced rate of throughput and bandwidth for a short period of time. In situations where a data disk is not accessed for a period of time (approximately 20 minutes), adaptive partitioning and load balancing mechanisms kick in. If the disk is accessed while these algorithms are active, you may notice some degradation in throughput
and bandwidth for a short period of time (approximately 10 minutes), after which they return to their normal levels. This warm-up effect happens because of the adaptive partitioning and load balancing mechanism of Azure, which dynamically adjusts to workload changes in a multi-tenant storage environment. You may observe similar effects in other widely known cloud storage systems as well. For more information, see Azure Storage: A Highly Available Cloud Storage Service with Strong Consistency.
This warm-up effect is unlikely to be noticed for systems that are in continuous use. But we recommend you consider it during performance testing or when accessing systems that have been inactive for a while.
Single vs. multiple storage accounts for data disks attached to a single VM
To simplify management and reduce potential risks of consistency in case of failures, we recommend that you leave all the data disks attached to a single virtual machine in the same storage account. Storage accounts are implemented as a recovery unit in case of failures. So, keeping all the disks in the same account makes the recovery operations simple. There is no performance improvement if you store data disks attached to a single VM in multiple storage accounts. If you have multiple VMs, we recommend that you consider the storage account limits for throughput and bandwidth during capacity planning. In addition, distribute VMs and their data disks to multiple storage accounts if the aggregated throughput or bandwidth is higher than what a single storage account can provide. For information on storage account limits, see Azure Storage Scalability and Performance Targets. For information on max IOPS per disk, see Virtual Machine and Cloud Service Sizes for Azure.
NTFS allocation unit size
NTFS volumes use a default cluster size of 4 KB. Based on our performance tests, we recommend changing the default cluster size to 64 KB during volume creation for both single disk and multiple disks (storage spaces) volumes.
Data compression for I/O bound workloads
Some I/O intensive workloads can gain performance benefits through data compression. Compressed tables and indexes means more data stored in fewer pages, and hence require reading fewer pages from disk, which in turn can improve the performance of workloads that are I/O intensive.
For a data warehouse workload running on SQL Server in Azure VM, we found significant improvement in query performance by using page compression on tables and indexes, as shown in Figure 1.
Figure 1: Query Performance with Data Compression
Figure 1 compares performance of one query with no compression (NONE) and page compression (PAGE). As illustrated, the logical and physical reads are significantly reduced with page compression, and so is the elapsed time. As expected, CPU time of the query does go up with page compression, because SQL Server needs to decompress the data while returning results to the query. Your results will vary, depending upon your workload.
For an OLTP workload, we observed significant improvements in throughput (as measured by business transactions per second) by using page compression on selected tables and indexes that were involved in the I/O intensive workload. Figure 2 compares the throughput and CPU usage for the OLTP workload with and without page compression.
400000 500000 600000 700000 800000 900000 1000000 100000 150000 200000 250000 NONE PAGE Rea d s Ti me (ms )
Query Performance with Data Compression
CPU Time Elapsed Time Logical Reads Physical Reads (+RA)
0 10 20 30 40 50 60 70 NONE PAGE CPU Ti m e ( % ), Th ro u gh p u t (B u si n e ss Tr an sact io n s/se c)
OLTP Throughput and CPU Usage with Data
Compression
Figure 2: OLTP Throughput and CPU Usage with Data Compression
Note that you may see different results when you test your workloads in Azure Virtual Machine environment. But we recommend that you test data compression techniques for I/O intensive workloads and then decide which tables and indexes to compress. For more information, see Data Compression: Strategy, Capacity Planning and Best Practices.
Restore performance – instant file initialization
For databases of any significant size, enabling instant file initialization can improve the performance of some operations involving database files, such as creating a database or restoring a database, adding files to a database or extending the size of an existing file, autogrow, and so on. For information, see How and Why to Enable Instant File Initialization.
To take advantage of instant file initialization, you grant the SQL Server (MSSQLSERVER) service account with SE_MANAGE_VOLUME_NAME and add it to the Perform Volume Maintenance Tasks security policy. If you are using a SQL Server platform image for Azure, the default service account (NT Service\MSSQLSERVER) isn’t added to the Perform Volume Maintenance Tasks security policy. In other words, instant file initialization is not enabled in a SQL Server Azure platform image.
After adding the SQL Server service account to the Perform Volume Maintenance Tasks security policy, restart the SQL Server service.
The following figure illustrates observed test results for creating and restoring a 100 GB database with and without instant file initialization.
Figure 3: Performance Impact of Instant File Initialization For more information, see Database File Initialization.
Other existing best practices
Many of the best practices when running SQL Server on premises are still relevant in Azure Virtual Machines, including:
Limit or disable autogrow on the database: Autogrow is considered to be merely a contingency for unexpected growth. Do not manage your data and log growth on a day-to-day basis with autogrow. If autogrow is used, pre-grow the file using the Size switch.
0 10 20 30 40 50 60
Create 100 GB database Restore 100 GB database
Ti me (mi n u tes)
Impact of Instant File Initialization
Disable autoshrink on the database: Make sure autoshrink is disabled to avoid unnecessary overhead that can negatively affect performance. For more information about autogrow and autoshrink, see Considerations for the "autogrow" and "autoshrink" settings in SQL Server. Establish locked pages to reduce IO and any paging activities: Lock pages in memory is a
Windows policy that determines, which account can use a process to keep memory allocations pinned in physical memory. It prevents the system from paging the data to virtual memory on disk. When the SQL Server service account is granted this user right, buffer pool memory cannot be paged out by Windows. For more information about enabling the Lock pages in memory user right, see How to: Enable the Lock Pages in Memory Option (Windows).
1.2.4 Migrate an on-premises database to Microsoft Azure
1.2.4.1 Migrating a SQL Server database to Azure SQL Database
https://azure.microsoft.com/en-us/documentation/articles/sql-database-cloud-migrate Moving your on-premises database to Azure SQL Database varies in complexity based on your database and application design, and your tolerance for downtime. For compatible databases, migration to Azure SQL Database is a straightforward schema and data movement operation
requiring few, if any, changes to the schema and little or no re-engineering of applications. Azure SQL Database V12 brings near-complete engine compatibility with SQL Server 2014 and SQL Server 2016. Most SQL Server 2016 Transact-SQL statements are fully supported in Microsoft Azure SQL Database. This includes the SQL Server data types, operators, and the string, arithmetic, logical, cursor
functions, and the other Transact-SQL elements that most applications depend upon. Partially or unsupported functions are usually related to differences in how SQL Database manages the database (such as file, high availability, and security features) or for special purpose features such as service broker. Because SQL Database isolates many features from dependency on the master database, many server-level activities are inappropriate and unsupported. Features deprecated in SQL Server are generally not supported in SQL Database. Databases and applications that rely on partially or unsupported functions will need some re-engineering before they can be migrated.
The workflow for migrating a SQL Server database to Azure SQL Database are: 1. Determine if your database is compatible
2. If not compatible, fix database compatibility issues 3. Migrate a compatible database
1.2.5 Configure and optimize VM sizes by workload
White Paper: Performance Guidance for SQL Server in Windows Azure Virtual Machines https://azure.microsoft.com/en-us/documentation/articles/virtual-machines-sql-server-performance-best-practices
1.3 Design a SQL Database solution
https://azure.microsoft.com/en-us/services/sql-database
1.3.1 Design a solution architecture
1.3.2 Design Geo/DR topology
https://azure.microsoft.com/en-us/updates/general-availability-azure-sql-database-geo-replication-enhancements
Azure SQL Database geo-replication enhancements (General availability Nov 10, 2015)
Azure SQL Database geo-replication includes a set of new features that improve programming and management capabilities for business continuity and disaster recovery scenarios. These
enhancements are available for V12 databases, and they include: T-SQL syntax for geo-replication
Failover and failback
Ability to synchronize security credentials and firewall rules Full support of geo-replication for databases in elastic pools Configurable performance levels of the secondary database Azure Resource Manager API and support of role-based security Synchronous PowerShell cmdlets
For more details, please refer to Spotlight on new capabilities of SQL Database geo-replication.
1.3.3 Design a security architecture
Webinar December, 30 2015 09:30:00 GMT (UTC):
https://azure.microsoft.com/en-us/community/events/azure-sql-db-security Azure SQL Database security guidelines and limitations
1.3.4 Design a data load strategy
Migrating a SQL Server database to Azure SQL Database
To test for SQL Database compatibility issues before you start the migration process, use one of the following methods:
Use SqlPackage: SqlPackage is a command-prompt utility will test for and, if found, generate a report containing detected compatibility issues.
Use SQL Server Management Studio: The Export Data Tier application wizard in SQL Server management studio will display detected errors to the screen.
If compatibility issues are detected, you must fix these compatibility issues before proceeding with the migration.
Use SQL Azure Migration Wizard
Use SQL Server Data Tools for Visual Studio Use SQL Server Management Studio
To migrate a compatible SQL Server database, Microsoft provides several migration methods for various scenarios. The method you choose depends upon your tolerance for downtime, the size and complexity of your SQL Server database, and your connectivity to the Microsoft Azure cloud.
SSMS Migration Wizard Export to BACPAC File Import from BACPAC File Transactional Replication
To choose your migration method, the first question to ask is can you afford to take the database out of production during the migration. Migrating a database while active transactions are occurring can result in database inconsistencies and possible database corruption. There are many methods to quiesce a database, from disabling client connectivity to creating a database snapshot.
To migrate with minimal downtime, use SQL Server transaction replication if your database meets the requirements for transactional replication. If you can afford some downtime or you are performing a test migration of a production database for later migration, consider one of the following three methods:
SSMS Migration Wizard: For small to medium databases, migrating a compatible SQL Server 2005 or later database is as simple as running the Deploy Database to Microsoft Azure Database Wizard in SQL Server Management Studio.
Export to BACPAC File and then Import from BACPAC File: If you have connectivity challenges (no connectivity, low bandwidth, or timeout issues) and for medium to large databases, use a BACPAC file. With this method, you export the SQL Server schema and data to a BACPAC file and then import the BACPAC file into SQL Database using the Export Data Tier Application Wizard in SQL Server Management Studio or the SqlPackage command-prompt utility.
Use BACPAC and BCP together: Use a BACPAC file and BCP for much large databases to achieve greater parallelization for increases performance, albeit with greater complexity. With this method, migrate the schema and the data separately.
Export the schema only to a BACPAC file.
Import the schema only from the BACPAC File into SQL Database.
Use BCP to extract the data into flat files and then parallel load these files into Azure SQL Database.
1.3.5 Determine the appropriate service tier
https://azure.microsoft.com/en-us/documentation/articles/sql-database-service-tiers
Azure SQL Database provides multiple service tiers to handle different types of workloads. You can create a single database with defined characteristics and pricing. Or you can manage multiple databases by creating an elastic database pool. In both cases, the tiers include Basic, Standard, and Premium. But the database options in these tiers vary based on whether you are creating an individual database or a database within an elastic database pool. This article provides an overview of service tiers in both contexts.
Service tiers and database options
Basic, Standard, and Premium service tiers all have an uptime SLA of 99.99% and offer predictable performance, flexible business continuity options, security features, and hourly billing. The following table provides examples of the tiers best suited for different application workloads.
Service tier
Target workloads
Basic Best suited for a small size database, supporting typically one single active operation at a given time. Examples include databases used for development or testing, or small scale infrequently used applications.
Standard The go-to option for most cloud applications, supporting multiple concurrent queries. Examples include workgroup or web applications.
Premium Designed for high transactional volume, supporting a large number of concurrent users and requiring the highest level of business continuity capabilities. Examples are databases supporting mission critical applications.
NOTE:
Web and Business editions are being retired. Find out how to upgrade Web and Business editions. Please read the Sunset FAQ if you plan to continue using Web and Business Editions.
For single databases there are multiple performance levels within each service tier, you have the flexibility to choose the level that best meets your workload’s demands. If you need to scale up or down, you can easily change the tiers of your database in the Azure Classic Portal, with zero-downtime for your application. See Changing Database Service Tiers and Performance Levels for details.
Performance characteristics listed here apply to databases created using SQL Database V12. In situations where the underlying hardware in Azure hosts multiple SQL databases, your database will still get a guaranteed set of resources, and the expected performance characteristics of your
individual database is not affected.
For a better understanding of DTUs, see the DTU section in this topic.
NOTE:
For a detailed explanation of all other rows in this service tiers table, see Service tier capabilities and limits.
Elastic database pool service tiers and performance in eDTUs
In addition to creating and scaling a single database, you also have the option of managing multiple databases within an elastic database pool. All of the databases in an elastic database pool share a common set of resources. The performance characteristics are measured by elastic Database Transaction Units (eDTUs). As with single databases, elastic database pools come in three service tiers: Basic, Standard, and Premium. For elastic databases these three service tiers still define the overall performance limits and several features.
Elastic database pools allow these databases to share and consume DTU resources without needing to assign a specific performance level to the databases in the pool. For example, a single database in a Standard pool can go from using 0 eDTUs to the maximum database eDTU (either 100 eDTUs defined by the service tier or a custom number that you configure). This allows multiple databases with varying workloads to efficiently use eDTU resources available to the entire pool.
Each database within a pool also adheres to the single-database characteristics for that tier. For example, the Basic pool has a limit for max sessions per pool of 2400 - 28800, but an individual database within that pool has a database limit of 300 sessions (the limit for a single Basic database as specified in the previous section).
Understanding DTUs
The Database Transaction Unit (DTU) is the unit of measure in SQL Database that represents the relative power of databases based on a real-world measure: the database transaction. We took a set of operations that are typical for an online transaction processing (OLTP) request, and then
measured how many transactions could be completed per second under fully loaded conditions (that’s the short version, you can read the gory details in the Benchmark overview).
A Basic database has 5 DTUs, which means it can complete 5 transactions per second, while a Premium P11 database has 1750 DTUs.
DTU vs. eDTU
The DTU for single databases translates directly to the eDTU for elastic databases. For example, a database in a Basic elastic database pool offers up to 5 eDTUs. That’s the same performance as a