SEIZE THE DATA. 2015

BIG DATA CONFERENCE 2015

Boston August 10-13

Module Overview

• Backup and Restore

• Copy Vertica Database

• Online Recovery

Backup - Overview

Backup is the process of copying the actual data files to a specified location

• Vertica data and backup files are written once

− Once a file is written Vertica will not update it

• The number of files increases with each backup

• Tuple Mover keeps the number of files under control

− The TM ‘mergeout’ process consolidates smaller ROS containers into larger ones

• To back up, copy Vertica files to stable storage

− Can be direct attached storage, NFS mounts or SAN

Backup – When?

Backup is the process of copying the actual data files to a specified location

• Part of Regular Disaster Recovery Strategy

− Nightly, weekly, depending on business continuity requirements and resources

• After loading or altering a large volume of data

• Before Maintenance Tasks

− Upgrading to another version of Vertica

− Dropping a partition

Backup and Restore – Options

There are several ways to take a Vertica Backup

• Backup and Restore by Database

− Most common backup process

− Backs up the entire database which includes all the schemas and objects within them

• Backup and Restore by Schema

− Multi-tenant database with different backup frequency

− Multi-application cluster with different backup requirements / policies

• Backup and Restore by Table

− Can be used to back up some critical tables

− Restore certain tables for QA / Testing

Vertica Backup Restore – VBR

vbr.py is a Python script located under /opt/vertica/bin

• Use vbr.py with various options to back up and restore data

• Create a configuration file

− vbr.py --setupconfig

− Goes into interactive mode, gathers all parameters and creates the configuration file

• VBR parameters

− Database name, schema name, snapshot name, object names

Vertica Backup Restore – VBR

A few parameters explained

• Snapshot Name – stores all the files under that named directory

• Restore Points – number of incremental backups stored in addition to the full backup

• Node

− Names of nodes in the cluster

− Data is backed up from each node of the cluster

• Backup Directory

− Location where the backup files are stored
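The parameters above end up in an INI-style configuration file. The following is a minimal sketch that writes such a file with Python's standard `configparser`. The section and key names (`[Misc]`, `snapshotName`, `restorePointLimit`, `[Database]`, `dbName`, `[Mapping]`) are assumptions recalled from the Vertica 7.x layout, not taken from these slides, and the database name, host name, and paths are hypothetical. The file produced by `vbr.py --setupconfig` remains the authoritative reference.

```python
import configparser
import io


def build_vbr_config(db_name, snapshot_name, restore_points, node_to_backup_dir):
    """Return the text of an INI-style vbr.py configuration file.

    Section and key names are assumptions based on the Vertica 7.x layout;
    verify them against a file generated by `vbr.py --setupconfig`.
    """
    config = configparser.ConfigParser()
    config.optionxform = str  # keep camelCase keys instead of lowercasing them

    config["Misc"] = {
        "snapshotName": snapshot_name,
        "restorePointLimit": str(restore_points),
    }
    config["Database"] = {"dbName": db_name}
    # One entry per node: node name -> "backup_host:backup_directory"
    config["Mapping"] = dict(node_to_backup_dir)

    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


# Hypothetical two-node cluster backing up to a dedicated host.
example_config = build_vbr_config(
    db_name="exampledb",
    snapshot_name="nightly",
    restore_points=3,
    node_to_backup_dir={
        "v_exampledb_node0001": "backuphost01:/backups/vertica",
        "v_exampledb_node0002": "backuphost01:/backups/vertica",
    },
)
print(example_config)
```

Keeping one mapping entry per node mirrors the slide's point that data is backed up from each node of the cluster.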

VBR preparation

Steps and some prerequisites

• Backup location to be configured on all the nodes

• Verify database is running

• Ensure backup hosts are running if data is backed up to those hosts

− Backup can be done to the same cluster nodes

− Backup can also be done to a dedicated host which has the SAN storage

• Backup Directory Permissions / Contents

Performing a Backup

How to run the vbr.py script

• vbr.py --task backup --config-file <myconfigfile>

− Same command is used for full and incremental backups

• First run does a full backup

− All data files are copied to the sub-directory with the snapshot name

• Subsequent runs are incremental

− Copies files which have changed since the last backup

− Files are only added or deleted, never modified

− Each incremental backup goes into a separate sub-directory with a timestamp

− Each incremental backup also adds those files to the full backup
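Because files are only added or deleted and never modified, an incremental backup can be reasoned about as a comparison of two sets of file names. The sketch below is a conceptual model of that idea only; it is not how vbr.py is implemented, and the function name and file names are hypothetical.

```python
def plan_incremental_backup(previous_backup_files, current_data_files):
    """Compare two sets of file names and return (to_copy, to_remove).

    Files are written once and never modified, so a name present in both
    sets refers to identical content and can be skipped.
    """
    previous = set(previous_backup_files)
    current = set(current_data_files)
    to_copy = sorted(current - previous)    # added since the last backup
    to_remove = sorted(previous - current)  # no longer present in the database
    return to_copy, to_remove


# Hypothetical file names: two old containers were consolidated into a new one.
previous = {"ros_001", "ros_002", "ros_003"}
current = {"ros_003", "ros_004"}
to_copy, to_remove = plan_incremental_backup(previous, current)
print("copy:", to_copy)
print("remove:", to_remove)
```

No content comparison is needed in this model, which is what the write-once property buys: the file name alone decides whether a file has to be copied.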

Performing a Restore

The same vbr.py script is used for restore

• vbr.py --task restore --config-file <myconfigfile>

− The configuration file is the same one used for the backup

• Restore can be specific

− Entire database, specific schema or table depending on the configuration file used

− Vertica copies the files from backup location to the data directory location

• Some key features

− Vertica does not have the concept of transaction logging

− There is no roll forward or roll back of transactions
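Backup and restore use the same script and the same configuration file; only the `--task` value differs. A minimal sketch of a wrapper that builds and runs the command follows. The script location and the two flags come from the slides; the function names and the configuration file name are hypothetical.

```python
import subprocess

VBR_PATH = "/opt/vertica/bin/vbr.py"  # location given on the VBR slide
KNOWN_TASKS = ("backup", "restore")   # the two tasks covered so far


def build_vbr_command(task, config_file):
    """Return the argument list for one vbr.py invocation."""
    if task not in KNOWN_TASKS:
        raise ValueError(f"unsupported task: {task!r}")
    return [VBR_PATH, "--task", task, "--config-file", config_file]


def run_vbr(task, config_file):
    """Run vbr.py; raises subprocess.CalledProcessError on a non-zero exit."""
    return subprocess.run(build_vbr_command(task, config_file), check=True)


# "nightly.ini" is a hypothetical configuration file name.
print(" ".join(build_vbr_command("backup", "nightly.ini")))
print(" ".join(build_vbr_command("restore", "nightly.ini")))
```

Passing the arguments as a list rather than a shell string avoids quoting problems if the configuration file path contains spaces.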

Copy Vertica Database

This option of VBR copies the entire Database (cluster) to a target cluster

• When do we need copycluster?

− Maintain a warm-standby cluster for Disaster Recovery

− Provide an alternative cluster to a different set of users / applications

• Prerequisites

− Source and Target cluster must have same number of nodes

− Database, node names and dbadmin user have to be the same on both sides

− Password-less ssh has to be established between all the nodes on both sides

− Target database has to be shut down before starting the process

• vbr.py --task copycluster --config-file <cfgfile>
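The prerequisites above lend themselves to a pre-flight check before running copycluster. The following is a sketch under stated assumptions: the `ClusterInfo` structure and `copycluster_problems` function are hypothetical, the values would have to be gathered by the operator, and the password-less ssh requirement is not checked because it needs network access.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterInfo:
    """Hypothetical summary of one cluster, filled in by the operator."""
    db_name: str
    node_names: tuple
    admin_user: str
    database_running: bool


def copycluster_problems(source, target):
    """Return a list of violated prerequisites (empty if none were found)."""
    problems = []
    if len(source.node_names) != len(target.node_names):
        problems.append("node counts differ")
    if sorted(source.node_names) != sorted(target.node_names):
        problems.append("node names differ")
    if source.db_name != target.db_name:
        problems.append("database names differ")
    if source.admin_user != target.admin_user:
        problems.append("dbadmin users differ")
    if target.database_running:
        problems.append("target database must be shut down")
    return problems


nodes = ("v_exampledb_node0001", "v_exampledb_node0002")
source = ClusterInfo("exampledb", nodes, "dbadmin", database_running=True)
target = ClusterInfo("exampledb", nodes, "dbadmin", database_running=True)
print(copycluster_problems(source, target))
```

Returning every violated prerequisite at once, rather than stopping at the first, lets the operator fix the target cluster in a single pass.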

Node Recovery

Vertica is a highly available MPP architecture, but nodes may go down…

• Node can recover from failure

− A node can rebuild its data set from other nodes in the cluster if the cluster is K-safe

− In a full recovery the node rebuilds from scratch

• Incremental Recovery

− Node rebuilds from the current persisted state

− To speed up a full recovery, use a prior backup for the given node and perform incremental recovery

• RAID 10 is best practice

Monitor Recovery

• Monitor disk space

− df -h

− SELECT * FROM v_monitor.disk_storage;

• Monitor Recovery

− tail vertica.log
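The two shell-level checks above (`df -h` and `tail vertica.log`) can also be scripted. The sketch below uses only the Python standard library; the function names are hypothetical, and the location of `vertica.log` depends on the installation, so the path has to be supplied by the caller.

```python
import collections
import shutil


def disk_free_percent(path):
    """Return the percentage of free space on the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.free / usage.total


def tail_lines(log_path, count=10):
    """Return the last `count` lines of a text file, like `tail -n count`."""
    with open(log_path, "r", errors="replace") as log_file:
        # A bounded deque keeps only the most recent lines while reading.
        return list(collections.deque(log_file, maxlen=count))


# "." stands in for a Vertica data directory, which varies per installation.
print(f"{disk_free_percent('.'):.1f}% free")
```

Calling `tail_lines` repeatedly on the node's `vertica.log` gives a rough view of recovery progress without holding the whole log in memory.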

QUESTIONS?

Please attend our Q&A with HP Big Data experts today

Marina Ballroom, Lobby level

10:15 am - 10:30 am

12:00 pm - 1:00 pm

2:30 pm - 3:00 pm

4:30 pm - 5:00 pm
