Backup and restore of Oracle databases: introducing a disk layer

by Ruben Gaspar, IT-DB-DBB

CERN IT Department, CH-1211 Geneva 23, Switzerland
www.cern.ch/it

Agenda

• CERN Oracle databases & Oracle backup basics

• Backup to disk implementation details

• Recovery platform

• Some bits of backup to disk backend

• Summary


Target Oracle databases for backup to disk

• ~70 Oracle databases, most of them running Oracle Clusterware (RAC)
– 49 are backed up to disk and then to tape
– 21 are backed up with snapshots only: test and development instances
• 15 Data Guard RAC clusters in production
– Active Data Guard since the upgrade to 11g
– They are backed up to tape only
• 10 Oracle single-instance databases in DBaaS, also backed up using snapshots

Oracle backup basics

• The Oracle clock: System Change Number (SCN)
– It would take 544 years to run out of SCNs at 16K/s
– smon_scn_time tracks time versus SCN
• Types of backups
– Consistent: taken after the database has been cleanly shut down. All redo is applied to the data files. Archive logs are not produced.
– Inconsistent: taken while the database is running. The database must be in archivelog mode, which means archive logs are produced. Point-in-Time Recoveries (PITR) are possible. Drawback: clean-up of archive logs is critical to keep the database from blocking → TSM was playing a critical role here
• Backup sets: Oracle proprietary format for backups. Binary files.
– Backup sets are containers for one or several backup pieces
– Backup pieces contain blocks of one or several data files (multiplexing)
• RMAN channels: disk, tape or proxy. They read data files and write to the backup media. We use SBT (System Backup to Tape API) with IBM Tivoli Data Protection 6.3 (provided by TSM support)
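The 544-year figure for SCN exhaustion can be checked with a few lines of arithmetic. This is only a sketch: it assumes a 48-bit SCN space and reads "16K/s" as 16 * 1024 SCNs per second, neither of which is stated on the slide.

```python
# Rough check of the SCN exhaustion estimate.
# Assumptions (not stated in the slides): the SCN is a 48-bit counter,
# and "16K/s" means 16 * 1024 SCNs consumed per second.
SCN_BITS = 48
SCN_RATE_PER_SECOND = 16 * 1024
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_until_scn_exhaustion(bits: int = SCN_BITS,
                               rate: int = SCN_RATE_PER_SECOND) -> float:
    """Return how many years it takes to consume 2**bits SCNs at `rate` per second."""
    return (2 ** bits) / rate / SECONDS_PER_YEAR


if __name__ == "__main__":
    print(f"{years_until_scn_exhaustion():.1f} years")
```

Under these assumptions the result lands at roughly 544 years, which matches the figure quoted above.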

Oracle backup basics (II)

• Backup jobs are based on templates that use the Recovery Manager (RMAN) API

--Full
backup incremental level 0 database;
--Cumulative
backup incremental level 2 cumulative database;
--Incremental
backup incremental level 1 database;
--Archivelogs
backup tag 'BR_TAG' archivelog all delete all input;

• Retention policy from 60 to 90 days, depending on the database

CONFIGURE RETENTION POLICY TO RECOVERY WINDOW OF 90 DAYS;

e.g. LEMONRAC → [1xfull + 6xdifferential + archivelogs] * 13 weeks

• Controlfile backup, automatically taken by each backup

CONFIGURE CONTROLFILE AUTOBACKUP ON;

e.g. LHCBSTG → [2xfull + 5xdifferential + 24x4 archivelogs] * 13 weeks = 934GB

DB        Fulls (GB)  Inc (GB)  Archived logs  Total
LEMONRAC  87902.42    857.52    13319.39       102079.32

[Figure labels: PITR, Full, Cum., Inc]
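The "* 13 weeks" factor in the examples follows from the 90-day recovery window. A minimal sketch of that arithmetic, assuming one backup cycle per week and a window rounded up to whole weeks (the function name and structure are illustrative, not part of the CERN framework):

```python
import math


def backups_in_window(window_days: int, fulls_per_week: int,
                      differentials_per_week: int) -> dict:
    """Count the weekly cycles and backups that a recovery window has to keep."""
    # Assumption: partial weeks are rounded up, so 90 days -> 13 weekly cycles.
    weeks = math.ceil(window_days / 7)
    return {
        "weeks": weeks,
        "fulls": weeks * fulls_per_week,
        "differentials": weeks * differentials_per_week,
    }


if __name__ == "__main__":
    # Weekly patterns taken from the LEMONRAC and LHCBSTG examples.
    print("LEMONRAC", backups_in_window(90, fulls_per_week=1, differentials_per_week=6))
    print("LHCBSTG", backups_in_window(90, fulls_per_week=2, differentials_per_week=5))
```

Archived log backups are left out of the count because their frequency depends on the database.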


What is there to be backed up?

Backup jobs using the RMAN API take care of:
• Database files: user and system files
• Control files: contain the structure and status of the data files. They also hold the complete backup history
• Archived logs: backups of redo logs. Needed for inconsistent backup strategies. They must be backed up and removed from the active file system; otherwise, if it runs out of space, the database freezes or stops.

5.1TB of redo logs produced per day


Backup architecture

• Custom solution: about 15k lines of code, Perl + Bash

• Flexible: easy to adapt to new Oracle releases and backup media
• Based on Oracle Recovery Manager (RMAN) templates

• Central logging


• Easy to extend via Perl plug-ins: snapshot, exports, RO tablespaces,…

We send compressed:
• 1 out of 4 full backups
• All archivelogs

Impact on TSM

• Savings depend on database workload, e.g. backup sets on disk for three databases (source: TSM support):

DB        Full (GB)  Inc (GB)  Archived logs (GB)  Savings
EDHP      29197.76   1216.697  2169.766            70%
CASTORNS  4944.839   213.256   336.2889            71%
ATLASSTG  1484.146   724.9567  3063.658            45%

Sent to tape: Full x 1/4 + Archived logs

• In addition, backup sets are compressed (see later)
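The savings column can be approximated from the three volume columns. The sketch below assumes a simple model in which one full backup out of four plus all archived logs go to tape, while incrementals stay on disk; the function name and the model itself are illustrative and are not the method used by TSM support.

```python
def tape_savings(full_gb: float, inc_gb: float, arch_gb: float,
                 full_fraction: float = 0.25) -> float:
    """Percentage of the on-disk backup volume that is never sent to tape.

    Assumed model: `full_fraction` of the full backups and all archived
    logs are sent to tape; incrementals are kept on disk only.
    """
    on_disk = full_gb + inc_gb + arch_gb
    sent_to_tape = full_gb * full_fraction + arch_gb
    return 100.0 * (1.0 - sent_to_tape / on_disk)


# Volumes in GB (full, incremental, archived logs) from the table.
DATABASES = {
    "EDHP": (29197.76, 1216.697, 2169.766),
    "CASTORNS": (4944.839, 213.256, 336.2889),
    "ATLASSTG": (1484.146, 724.9567, 3063.658),
}

if __name__ == "__main__":
    for name, volumes in DATABASES.items():
        print(f"{name}: {tape_savings(*volumes):.1f}% saved")
```

Under this model the EDHP and CASTORNS rows come out close to the reported 70% and 71%. ATLASSTG comes out lower than the reported 45%, so the reported figures presumably include factors that this simple model ignores.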

Impact on TSM (II)

[Charts (source: TSM support): ~47% savings and ~70% savings; 15 accounts: alicestg, atlasstg, cmsstg, castorns, ..]

Workflow for disk/tape backups

• Same workflow as for tape backups → eases maintenance
• Disk and tape templates are almost identical; only the channel allocation differs
• Disk channel allocation is calculated on the fly, considering the available space in the aggregate and the file system, using the Netapp management API called ZAPI
• About 75 templates to cover all types of backup strategies
• Tape and disk backup strategies co-exist
• Reversible: changing from one to the other is a matter of changing templates
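The idea that disk and tape templates differ only in channel allocation can be sketched as follows. The template text, function names and paths are hypothetical (the actual CERN templates are Perl + Bash driven and are not shown in the slides); only the RMAN statements themselves are standard syntax.

```python
from string import Template

# Hypothetical template: the real templates are not shown in the slides.
RMAN_TEMPLATE = Template("""run {
$channels
  backup incremental level $level database;
}""")


def disk_channels(paths):
    """One disk channel per target file system, in the given order."""
    return "\n".join(
        f"  allocate channel d{i} device type disk format '{path}/%U';"
        for i, path in enumerate(paths, start=1)
    )


def tape_channels(count):
    """SBT channels; media manager parameters are deliberately left out."""
    return "\n".join(
        f"  allocate channel t{i} device type sbt;"
        for i in range(1, count + 1)
    )


def render(level, channels):
    """Fill the shared template with a backup level and a channel block."""
    return RMAN_TEMPLATE.substitute(level=level, channels=channels)


if __name__ == "__main__":
    # Example paths follow the /backup/dbsXX/DBNAME convention; DBNAME is a placeholder.
    print(render(0, disk_channels(["/backup/dbs01/DBNAME", "/backup/dbs02/DBNAME"])))
    print(render(0, tape_channels(2)))
```

Switching a database between disk and tape then amounts to swapping the channel block, which is why the change is easy to reverse.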


Typical DB architecture

[Diagram labels: RAC nodes 01 to 04; public interface; interconnect; LAN; 10GbE and 1 GbE links; 6Gb/s links; cluster interconnect; mgmt network; private network; 7-mode and C-mode storage; datafiles; archivelogs; backup01, backup02; Media Manager Server IBM TSM]

At least 2 file systems for backup to disk:
• /backup/dbsXX/DBNAME

New C-mode features

• Transparent file system movements:

cluster01::> volume move start -destination-aggregate aggr1_c01n02 -vserver vs1 -volume castorns03 -cutover-window 10

• DNS load balancing inside the cluster

• Automatic virtual IP rebalancing (based on failover groups)

• Access security via “export-policy”, which joins firewall-style rules and different authentication mechanisms: sys, krb5, ntlm

• Global namespace

• Compression and Deduplication

– We strongly rely on compression as the way to satisfy 2.3PB of backup set storage needs using 1.1PB of disk
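The compression requirement implied by these two figures is simple arithmetic. A minimal sketch (the function name is illustrative):

```python
def required_saving(logical_pb: float, physical_pb: float) -> float:
    """Fraction of the logical data that compression must remove to fit on disk."""
    return 1.0 - physical_pb / logical_pb


if __name__ == "__main__":
    # 2.3PB of backup sets stored on 1.1PB of disk.
    print(f"{required_saving(2.3, 1.1):.0%} of the logical volume must be saved")
```

Fitting 2.3PB into 1.1PB therefore needs a little over half of the logical volume to be saved by compression.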

Backup to disk configuration on database servers

• RMAN configuration parameters: minimal change

CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '/backup/dbs01/<DBNAME>/<DBNAME>_%F';

• Global namespace in use: /backup/dbsXX
• Eases management: the mount point stays unchanged as data moves. It is a Netapp C-mode feature (see later)

7-mode: mount -o … priv-controllerIP:/vol/castorns03 /ORA/dbs03/CASTOR
C-mode: mount -o … public-ip-cluster:/backup/dbs01/CASTORNS /backup/dbs01/CASTORNS

/backup/dbs01/<DBNAME> → autobackup controlfile + backupsets
/backup/dbsXX/<DBNAME> → backupsets

Particular cases

• Solution also operational in a Data Guard configuration: full and incremental backups are taken on the standby (more when talking about restores)
• Multiple channels: rman_channels_connect, in order to distribute the backup load
• Plug-in for RO tablespaces backup (ACCLOG: size about 170TB, growth 70TB/year)
– Automatic clean-up in case of tablespace state change
– One backup set per tablespace
• Extension to allow special mount points (ACCLOG)
– rman_mounts_readonly

[Diagram labels: Primary Database; Redo Transport; Active Data Guard for users’ access and for disaster recovery; full + incremental + controlfile; archivelogs + controlfile; username/password@rac-node1, username/password@rac-node2]

Backup to disk performance

ACCLOG full backup, 5TB:
Tape: 34 hours, ~35 MB/s
Disk: 14 hours, ~100MB/s

• Backups to disk run ~50% faster than backups to tape
• Sending backup sets from disk to tape needs optimisation
• Work in progress with TSM support

Backup to Disk space consumption

• Channel order is important → storage management
– Space distribution should follow the planning to avoid imbalance. File systems should grow at the same pace.
– The emptiest volume is always placed at the top of the list
• Automatic size extension
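The "emptiest volume first" rule can be sketched as a simple sort. In the production setup the free space would come from the Netapp management API; here the volumes are hard-coded, hypothetical values, and "emptiest" is assumed to mean "most free space".

```python
from dataclasses import dataclass


@dataclass
class Volume:
    """A backup file system with its size and usage (hypothetical values)."""
    path: str
    size_gb: float
    used_gb: float

    @property
    def free_gb(self) -> float:
        return self.size_gb - self.used_gb


def order_channels(volumes, min_free_gb=0.0):
    """Return volume paths, emptiest first, skipping volumes with too little free space."""
    usable = [v for v in volumes if v.free_gb > min_free_gb]
    return [v.path for v in sorted(usable, key=lambda v: v.free_gb, reverse=True)]


if __name__ == "__main__":
    volumes = [
        Volume("/backup/dbs01/DBNAME", size_gb=10000, used_gb=7000),
        Volume("/backup/dbs02/DBNAME", size_gb=10000, used_gb=4000),
        Volume("/backup/dbs03/DBNAME", size_gb=10000, used_gb=9990),
    ]
    # dbs02 has the most free space, so its channel is allocated first.
    print(order_channels(volumes, min_free_gb=100))
```

Allocating channels in this order steers new backup pieces towards the volume with the most room, which helps the file systems grow at the same pace.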


Recovery platform

• The only reliable proof: run a recovery
• Any change introduced in the backup platform or backup strategy is always validated via test recoveries
• Isolation
– Runs independently of the production database
– Cannot access any other system (database network links)
– No user jobs must run
• Flexible and easy to customize
• Maximize recovery server usage: several recoveries at the same time
– Exports taken after a successful recovery → help in support cases: mainly logical errors
• Open source: http://sourceforge.net/projects/recoveryplat/

Recovery platform (II)

• Introducing a disk buffer greatly improves our recovery testing
• Also tested with Data Guard configurations:
– Data Guard: Oracle support ID 1070039.1

RMAN> set backup files for device type disk to accessible;

• Restores from disk are usually 50% faster
• More recoveries can be run, nowadays about 40 recoveries per week
• No blocking of tape resources that could be used by backups


Backup to disk cluster

• 2xFAS6240 Netapp controllers
• 24xdiskshelf DS4243
• 24x3TB SATA disks each (576 disks)
• raid_dp (raid6) → 1.1 PB usable space split into 8 aggregates ~ 135TB each
• 2xquad core 64bit Intel(R) Xeon(R) CPU E5540 @ 2.53GHz
• 10 Gbps connectivity
• Multipath SAS loops 3 Gbps
• Flash cache 512GB per node

How fast, how compressed

• Compression (datafiles)
– Online compression of datafiles: ~55% saved by compression
• Backup set compression of a 501 GB tablespace of random alphanumeric strings (dbms_random):

Method                              Size (time)     Saved
no-compressed                       501GB           -
basic                               83GB (6h21’)    83%
low                                 116GB (49’)     76.8%
medium                              88GB (07h23’)   82.4%
high                                82GB (11h02’)   83.6%
No-compressed-fs                    459GB (41’)     8.3%
Cron-compression, Netapp 8.1.1      188GB           62%
Inline-compression, Netapp 8.1.1    188GB (46’)     62%

[Chart: RMAN backup to disk* throughput in MB/s (axis 0 to 450) versus number of channels (1 to 3), with series knfs, dnfs, and dnfs + Ontap compression]
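The "percentage saved" values in the compression test follow from the backup size relative to the 501 GB tablespace. A minimal sketch of that calculation, using the sizes reported above (the dictionary keys are shortened labels, and the function name is illustrative):

```python
ORIGINAL_GB = 501  # size of the test tablespace

# Backup set sizes in GB for each method in the compression test.
BACKUP_SIZES_GB = {
    "basic": 83,
    "low": 116,
    "medium": 88,
    "high": 82,
    "no-compressed-fs": 459,
    "cron-compression": 188,
    "inline-compression": 188,
}


def percent_saved(compressed_gb: float, original_gb: float = ORIGINAL_GB) -> float:
    """Percentage of the original size that is saved by the backup method."""
    return 100.0 * (1.0 - compressed_gb / original_gb)


if __name__ == "__main__":
    for method, size_gb in BACKUP_SIZES_GB.items():
        print(f"{method:20s} {size_gb:4d}GB  {percent_saved(size_gb):5.1f}% saved")
```

The computed values agree with the reported percentages to within rounding. Elapsed time is not part of this calculation, although it matters just as much when choosing a method: the low setting and the Netapp inline compression finish in well under an hour, while medium and high take many hours.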

Compression: real values

DB          Used (GB)*  Saved (GB)  % saved by compression
AISDB_PROD  24719       25941       52
CASTORNS    3629        3448        49
CMSSTG      6510        6395        50
CSR         20636       32008       61
ITCORE      16387       23552       60
EDHP        9631        24913       66
LEMONRAC    47104       49152       51

*Space used on controller side
Logical space used: Used + Saved

NAS controllers throughput

[Chart: net_data_recv, disk_data_written and compression ratio]

Deduplication

• When combined with compression, it doesn’t provide good results
– Due to the way compression works: the compression group is 32k, our Oracle block is 8k, the WAFL block is 4k

[Diagram labels: Checksum, 4k, 4k]


• Control files are a different story. Block size of 16k

DB    Type         Location       Size (GB)
PAYP  archives     /backup/dbs01  0.91
PAYP  archives     /backup/dbs02  22.90
PAYP  controlfile  /backup/dbs01  456.92
PAYP  fullinc      /backup/dbs01  68.00
PAYP  fullinc      /backup/dbs02  81.10


Summary

• Backup and recovery testing is critical
• Tape copies are essential, but TSM became a critical point of failure for DB services
• Adding a disk buffer
– Removes TSM criticality
– Reduces DB volume in TSM
– Speeds up backups and restores
• Better response time
• Better resource utilization
• Disk buffer plug-ins were easily integrated in our backup framework
• First system to exploit Ontap C-mode features
– Valuable experience for the future