SEIZE THE DATA. 2015


This is a rolling (up to three year) Roadmap and is subject to change without notice.

This document contains forward looking statements regarding future operations, product development, product capabilities and availability dates. This information is subject to substantial uncertainties and is subject to change at any time without prior notification. Statements contained in this document concerning these matters only reflect Hewlett-Packard's predictions and / or expectations as of the date of this document and actual results and future plans of Hewlett-Packard may differ significantly as a result of, among other things, changes in product strategy resulting from technological, internal corporate, market and other changes. This is not a commitment to deliver any material, code or functionality and should not be relied upon in making purchasing decisions.

(3)

This is a rolling (up to three year) Roadmap and is subject to change without notice.

This Roadmap contains HP Confidential Information.

If you have a valid Confidential Disclosure Agreement with HP, disclosure of the Roadmap is subject to that CDA. If not, it is subject to the following terms: for a period of 3 years after the date of disclosure, you may use the Roadmap solely for the purpose of evaluating purchase decisions from HP and use a reasonable standard of care to prevent disclosures. You will not disclose the contents of the Roadmap to any third party unless it becomes publicly known, rightfully received by you from a third party without duty of confidentiality, or disclosed with HP’s prior written approval.

(4)

Avoiding Disasters with Vertica

Backup and Restore

Stephen Walkauskas

John Heffner

(5)
(6)

Transaction logs: Database change history.

Differential backup: Differences since the last full backup.

Incremental backup: Differences since the last backup (incremental or full).

(7)

Transaction logs: Grows too large and causes long recoveries

Differential backup: Grows too large as delta grows from last full backup

Incremental backup: Long recoveries and can’t drop intermediate restore

(8)

[Diagram: the database side shows the catalog, a snapshot of the catalog, and the data; the backup side shows a catalog for Restore Point 0 and a catalog for Restore Point 1.]

Data files in Vertica are write-once, never modified (but like “const” in C++, can be deleted)

(9)
(10)

Some detail on how backup works currently (up to 7.1.x)

Create “durable” snapshot. Serialize catalog. Hard link all referenced storage.

Copy metadata (catalog).

Transport data. Hard link if in prior restore point, else copy.

[Diagram: catalog and catalog snapshot on the database side; the backup holds the catalog for Restore Point 1.]
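The "hard link if in prior restore point, else copy" step above can be sketched in a few lines of Python. This is an illustration only, not Vertica's implementation: the helper name `make_restore_point` and the flat directory of write-once files are assumptions made for the example.

```python
import os
import shutil
import tempfile


def make_restore_point(source_dir, prior_dir, new_dir):
    """Build new_dir from source_dir, hard-linking files already in prior_dir.

    Hypothetical helper: assumes a flat directory of write-once files
    identified by name, so an existing name means identical content.
    """
    os.makedirs(new_dir, exist_ok=True)
    linked, copied = [], []
    for name in sorted(os.listdir(source_dir)):
        target = os.path.join(new_dir, name)
        prior_path = os.path.join(prior_dir, name) if prior_dir else None
        if prior_path and os.path.exists(prior_path):
            os.link(prior_path, target)  # no data moved, but still one filesystem op
            linked.append(name)
        else:
            shutil.copy2(os.path.join(source_dir, name), target)
            copied.append(name)
    return linked, copied


def demo():
    """Take two restore points of a tiny data directory and report what happened."""
    root = tempfile.mkdtemp()
    src = os.path.join(root, "data")
    os.makedirs(src)
    for name in ("ros_1", "ros_2"):
        with open(os.path.join(src, name), "w") as f:
            f.write(name)
    rp0 = os.path.join(root, "restore_point_0")
    first = make_restore_point(src, None, rp0)
    with open(os.path.join(src, "ros_3"), "w") as f:
        f.write("ros_3")
    rp1 = os.path.join(root, "restore_point_1")
    second = make_restore_point(src, rp0, rp1)
    shutil.rmtree(root)
    return first, second
```

Note that the second restore point still touches every file once, even though only one file is copied. That per-file cost is what grows with the number of files.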

(11)

Performance issues with many files

Common case: tables with many sparsely-populated columns and many partitions.

Vertica compresses these well, but creates a lot of small files.

Data copy is incremental, but filesystem operations are not.

Lots of random I/O – most devices handle only hundreds of operations per second.

Backup can hurt query performance.

Two-pronged approach to improving performance

(12)

Pack small ROSes together

Create fewer files

Put index and data into the same file, get rid of containing directory.

Bundle multiple columns into a single file, if ROSes sufficiently small.

Partitions, projections, and local segments still use separate files.

Bundled automatically at storage container creation, no rewriting or appending.

File size significantly larger than filesystem metadata.

[Diagram: the columns clickid, userid, timestamp, objectid, and dbg_info, shown before and after bundling.]
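To make "put index and data into the same file" and "bundle multiple columns into a single file" concrete, here is a toy bundle format in Python. The layout (a 4-byte header length, a JSON index, then the column data) is invented for this sketch and is not Vertica's on-disk format.

```python
import json
import struct


def pack_bundle(columns):
    """Pack {column_name: bytes} into one blob: header length, JSON index, data."""
    index, body, offset = {}, b"", 0
    for name, data in columns.items():
        index[name] = [offset, len(data)]  # where this column lives in the body
        body += data
        offset += len(data)
    header = json.dumps(index).encode("utf-8")
    return struct.pack("<I", len(header)) + header + body


def read_column(bundle, name):
    """Read one column back out of a bundle using the embedded index."""
    (header_len,) = struct.unpack_from("<I", bundle, 0)
    index = json.loads(bundle[4:4 + header_len].decode("utf-8"))
    offset, length = index[name]
    start = 4 + header_len + offset
    return bundle[start:start + length]
```

One bundle replaces several small files plus their containing directory, so a backup has fewer filesystem objects to create, link, or stat.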

(13)

New backup format using “manifests”

Fewer filesystem operations

Create “non-durable” snapshot. In-memory reference counts.

Track backup location content with a “manifest” file.

Copy only storage not in manifest.

Additional benefits:

No hard links – allows more flexibility in backup

[Diagram: catalog and catalog snapshot on the database side; the backup holds the catalogs for Restore Point 0 and Restore Point 1, the objects, and a manifest that is read and updated.]
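The manifest idea above can be sketched as a set difference. The function name `incremental_backup` and the in-memory set standing in for the manifest file are assumptions made for the example, not the product's format.

```python
def incremental_backup(live_files, manifest):
    """Return (files_to_copy, updated_manifest).

    live_files: {file_name: size} referenced by the current snapshot.
    manifest:   set of file names already present in the backup location.
    """
    # Only storage that the manifest does not list has to be transported.
    to_copy = sorted(name for name in live_files if name not in manifest)
    updated = set(manifest) | set(to_copy)
    return to_copy, updated
```

In this model an unchanged file costs a lookup in the manifest rather than a hard link per file, which is where the saving in filesystem operations would come from.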

(14)


Some performance numbers

Backup time (shorter bars are better):

Full backup: 348 (7.1: hard links), improved to 108 (3.2x)

Minimal incremental: 94 (7.1: hard links), improved to 1.1 (82x)

Backups run on test database with large catalog:

Three nodes, 119 schemas x 179 non-uniform tables.

850 GB (compressed) stored in 2.4m ROSes. Serialized catalog 4 GB.

Unbundled: 4.8m files.

Bundled: 250k files.

(17)
(18)

Selective object restore

[Diagram: a backup and a Vertica database, each holding sales, customers, products, and store. In the restore frames, products is absent from the Vertica side until it is restored.]

Backup schema

Restore schema

In the next Vertica release, you can select objects to restore, schemas or tables, from a full backup

Restore products table from schema

(26)

Replicating objects between databases

Replicate schemas and tables between databases

[Diagram: Vertica - Primary DB and Vertica – Secondary DB, located in two data centers (New York and Chicago), each holding sales, customers, products, and store.]

Primary and secondary can continue with normal operations

A consistent snapshot is replicated to the secondary site

Re-sync from primary will bring secondary consistent with the primary

Recover from secondary site during disasters

(36)

Lightweight table/partition copy

[Diagram: the sales table in Vertica.]

Copy partitions ‘2011’ to ‘2012’ from sales to sales2

Storage referenced in place, no hard links, portable across file systems
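"Storage referenced in place" can be pictured as two tables pointing at the same write-once containers, with a reference count deciding when a file may be deleted. The `Storage` class below is a toy model written for this illustration and does not describe Vertica's catalog.

```python
from collections import Counter


class Storage:
    """Toy catalog where tables reference shared, write-once containers."""

    def __init__(self):
        self.tables = {}           # table name -> {partition key: container id}
        self.refcount = Counter()  # container id -> number of references

    def add_partition(self, table, key, container):
        self.tables.setdefault(table, {})[key] = container
        self.refcount[container] += 1

    def copy_partitions(self, source, low, high, target):
        """Reference source partitions in [low, high] from target; no data is copied."""
        for key, container in list(self.tables[source].items()):
            if low <= key <= high:
                self.add_partition(target, key, container)

    def drop_partition(self, table, key):
        """Drop one reference; return True when the container can be deleted."""
        container = self.tables[table].pop(key)
        self.refcount[container] -= 1
        return self.refcount[container] == 0
```

Because nothing depends on filesystem hard links, the same bookkeeping works wherever the files happen to live.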

(40)

Side-by-side restore

In Vertica 7.1.x and earlier, you can only overwrite schemas/tables during restore

[Diagram: the sales table in the backup and in the Vertica database.]

$ SELECT DROP_PARTITION('sales', 1);

OOPS, that’s a mistake, I want that partition! But now I’ve got new data I don’t want to

Now you can restore objects side-by-side without overwriting the original table

Restore “sales” side-by-side

$ SELECT SWAP_PARTITIONS_BETWEEN_TABLES('sales', '1', '1', 'sales_tmp');

(49)
(50)

[Diagram: Vertica (node001) and Vertica (node002) each hold sales, customers, and products; Backup (node001) holds the same tables.]

All data lost on node001

Two choices from which to recover data: buddy or backup?

Choice 1: buddy

Recover “buddy” containers from node002

Next backup is not incremental

Choice 2: backup

Bootstrap recovery with node restore

Proceed with incremental recovery

Recover delta between backup epoch and current epoch

Next backup is incremental
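A simplified way to picture "recover delta between backup epoch and current epoch": containers written at or before the backup epoch come from the node's backup, and only later ones come from the buddy. The function and the single-epoch-per-container model are assumptions for this sketch; real recovery also has to account for deletes.

```python
def plan_node_recovery(containers, backup_epoch):
    """Split a node's containers by where they would be recovered from.

    containers:   {container_name: epoch in which it was written}
    backup_epoch: epoch captured by the node's last backup
    """
    from_backup = sorted(n for n, e in containers.items() if e <= backup_epoch)
    from_buddy = sorted(n for n, e in containers.items() if e > backup_epoch)
    return from_backup, from_buddy
```

In this model the restored files match what the backup location already holds, which fits the slide's point that the next backup can stay incremental.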

(61)
(62)

[Diagram: Vertica (node001) recovering sales, customers, and products from Vertica (node002).]

Recover delta: historical phase

No locks

Recover delta since backup epoch up to latest epoch

Can have multiple historical phases

Starting in 7.1 INSERTS are written to RECOVERING nodes, for example INSERT INTO SALES VALUES (…)

For this reason mostly only deletes are recovered after first historical phase

Recover current: T-lock tables for consistency

* Recover delta from last historical phase to current epoch

[Diagram: T-locks are taken on the tables one after another; an X marks a lock that fails.]

Yikes! Recovery is almost done but it will fail and restart the current phase for all tables if T-lock table times out.

ERROR: Timeout locking table ‘t’

Or

ERROR: Found drop partition event during recovery

Etc

Recover complete: Node transitions to UP

(76)

Recover by table

Better to recover each table independently and localize most recovery errors to a single table

[Diagram: Vertica (node001) recovering sales, customers, and products from Vertica (node002), one table at a time; each table is marked UP as it finishes.]

Recover table’s “buddy” containers

Recover table delta: mainly deletes, as of 7.1 INSERTS go to RECOVERING node

Recover table current: T-lock table for consistency

* Notice: No locks on “sales” to this point

* T-lock compatible with INSERT / COPY but not DELETE or DDL

Recover table complete:

- Table transitions to UP on RECOVERING node

- T-lock released

- Table participates in ALL DML and DDL

- UP segment on RECOVERING node not used for queries

Repeat process for each table

Recover complete: Node transitions to UP

SELECT SUM(…) FROM SALES GROUP BY …;
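The benefit of recovering by table can be shown with a small simulation. Both functions below are illustrative models, not Vertica code; `lock_ok(table, attempt)` is a hypothetical stand-in for "the T-lock was acquired before the timeout".

```python
def recover_node_at_once(tables, lock_ok, max_attempts=10):
    """Current phase over all tables: one lock timeout restarts every table."""
    work = 0
    for attempt in range(1, max_attempts + 1):
        for table in tables:
            if not lock_ok(table, attempt):
                break      # timeout: restart the phase for all tables
            work += 1
        else:
            return work    # every table was locked and recovered
    raise RuntimeError("recovery did not complete")


def recover_by_table(tables, lock_ok, max_attempts=10):
    """Per-table recovery: a lock timeout only retries the affected table."""
    work = 0
    for table in tables:
        for attempt in range(1, max_attempts + 1):
            if lock_ok(table, attempt):
                work += 1
                break      # this table is done; move on to the next one
        else:
            raise RuntimeError("recovery did not complete for " + table)
    return work
```

With three tables and a single timeout on the last one, the all-at-once model performs five table recoveries while the per-table model performs three, because the two tables that already succeeded are not redone.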

(92)

People to corner at the reception

Development

Jing Xu

John Heffner

Tharanga Gameathige

Stephen Walkauskas

Product Management

Ignacio Hwang

QA

Afeso Ologun

George Young

Michelle Qian

Pan Ye

Qinong Liu
