• No results found

Weaving Relations for Cache Performance

N/A
N/A
Protected

Academic year: 2021

Share "Weaving Relations for Cache Performance"

Copied!
32
0
0

Loading.... (view fulltext now)

Full text

(1)

Weaving Relations for Cache Performance

Anastassia Ailamaki David J. DeWitt

Mark D. Hill

Marios Skounakis

Presented by:

VLDB 2001, Rome, Italy

Best Paper Award

(2)

Bottleneck in DBMSs

 Processor speeds increase faster than memory speeds

 DBMSs becoming compute and memory bound

New architectures hide I/O latency

Larger memories

Pipelining/Out of order execution/Non-blocking caches

Complexity of DBMSs

 Major performance bottleneck are the L2 data cache

(3)

Data cache misses

 Requests for data not found in the cache and fetched from the main memory

 Major penalty in comparison to L1 cache misses

 Loading cache with useless data:

– Wastes bandwidth

– Pollutes the cache

– Possibly forces replacement of useful information

– Increases the number of cache misses

(4)

Data placement policies

 NSM (N-ary Storage Model)

– Stores records sequentially within a page (+) Intuitive and Easy to implement

(-) Exhibits poor cache performance

 DSM (Decomposition Storage Model)

– A n-attribute relation is partitioned into n subrelations (+) Exploits spatial locality

(-) Huge reconstruction cost (joining participating subrelations

(5)

New Data Placement Policy

 PAX stands for Partition Attributes Across

 The records are partitioned within each page storing together values of each attribute in

minipages

 This design:

– Exhibits fewer L2 data cache misses

– Executes faster range selection queries and updates

– Executes faster TPC-H queries involving I/Os

– Disregarding the number of participating attributes

(6)

Outline

 Motivation

 Related Work

 PAX

 Experimental results

 Summary

 Discussion

(7)

NSM [1] – Intro

NSM – N-ary Storage Model

Records are stored sequentially

Pointers (offset) to end of page

One offset of each variable length entry

Intra-record locality (attributes of record r together)

Intuitive

Simple to implement

Pollutes cache

memory

(8)

NSM [2] – Record Design

All attributes of a record are stored together

NSM pollutes the cache and wastes bandwidth

FIXED-LENGTH VALUES VARIABLE-LENGTH VALUES HEADER

offsets to variable- length fields null bitmap,

record length, etc

(9)

NSM [3] – Cache Behavior

select name from R where age > 40

CACHE

MAIN MEMORY

1237 RH1

PAGE HEADER 30

Jane RH2 4322 John

45 RH3 Jim 20

RH4

7658 52

• 1563

block 1 30

Jane RH

52 2534 Leon block 4 Jim 20 RH4 block 3 45 RH3 1563 block 2 2534 Leon

Susan

(10)

DSM [1] – Intro

DSM –

Decomposition Storage Model

Store n-attribute table as n-one attribute tables

High cache utilization

But, High

reconstruction cost

(11)

DSM [2] – Cache Behavior

John Jim Suzan Jane 1

PAGE HEADER

3 4

30 20 52

1

PAGE HEADER

3 4

2

MAIN MEMORY

45 5

CACHE block 1

30

1 2 45

43

block 2

20 52

3 4 5

2

$$$

(12)

Outline

 Motivation

 Related Work

 PAX

 Experimental results

 Summary

 Discussion

(13)

PAX [1] – Intro

 PAX – Partition Attributes Across

 Partition data within the page for spatial locality

 Fewer cache misses, low reconstruction cost

 Implementing PAX requires changes only to the page-level data manipulation code

 Can be used interchangeable with other data

layout schemes

(14)

PAX [2] – Comparison with NSM

1237 RH1

PAGE HEADER 30

Jane RH2 4322 John 45

1563

RH3 Jim 20 RH4

7658 Susan 52

PAGE HEADER 1237 4322

1563

7658

Jane John Jim Susan

NSM PAX

(15)

PAX [3] – Data Placement

CACHE 1563

PAGE HEADER 1237 4322 7658

Jane John Jim Suzan

30 52 45 20

block 1

30 52 45 20

select name from R

where age > 40

(16)

PAX [4] – Detailed View

pid

3 2 4 v 4

4322 1237

Jane John

1 1

f

}

} Page Header

attribute

sizes free space

# records

# attributes

F - Minipage presence bits

v-offsets

} } V - Minipage

(17)

Outline

 Motivation

 Related Work

 PAX

 Experimental results

 Summary

 Discussion

(18)

Experimental Setup

 Windows NT4.0

 Pentium II Xeon/MT Workstation

 16KB L1-I, 16KB L1-D, 512 KB L2, 512 MB RAM

 Minimal setup

Memory resident DB

Exclusion of I/O subsystem

 Processor hardware counters for the experiments

(19)

NSM/DSM/PAX Comparison

Elapsed time while number of attributes or selectivity in query increases

DSM becomes unaffordable

0 5 10 15 20 25 30

1% 5% 10% 20% 50% 100%

selectivity seconds

NSM

DSM

PAX

(20)

Execution Time Breakdown

 PAX in comparison with NSM exhibits less:

Data cache misses

L2 Data penalty

L1 Data penalty

Execution time breakdown

300 600 900 1200 1500 1800

Resource Branch Mispred.

Memory Comp.

(21)

Sensitivity Analysis

Sensitivity to projectivity

Sensitivity to number of attributes in relation

As the number of attributes involved in the query increases the performance of the two techniques converges

0 1 2 3 4 5 6

1 2 3 4 5 6 7

projectivity seconds

NSM PAX

0 1 2 3 4 5 6

1 2 3 4 5 6 7

# of attributes in predicate seconds

NSM PAX

(22)

Evaluation with DSS Workloads

Windows NT4.0

Pentium II Xeon/MT Workstation

16KB L1-I, 16KB L1-D, 512 KB L2, 512 MB RAM

Processor hardware counters for the experiments

Implemented on top of Shore Storage Manager

Loaded 100M, 200M, and 500M TPC-H DBs

Ran Queries:

Range Selections with variable parameters (RS)

TPC-H Q1 and Q6

sequential scans

lots of aggregates (sum, avg, count)

grouping/ordering of results

(23)

Bulk-loading/Insertions

Estimate average field sizes

Start inserting records

If a record doesn’t fit,

Reorganize page

(move minipage boundaries)

Adjust average field sizes

50% of reorganizations

accommodate a single record (the last record in the page)

PAX loads a TPC-H database in 2-26% longer than NSM

Elapsed Bulk Load Times

0 500 1000 1500 2000 2500 3000 3500

100 MB 200 MB 500 MB Database Size

time (seconds) NSM PAX DSM

(24)

PAX/NSM Speedup

PAX/NSM Speedup on PII/NT

15%

30%

45% 100 MB 200 MB 500 MB

PAX improves performance up to 42% even with I/O

(25)

Outline

 Motivation

 Related Work

 PAX

 Experimental results

 Summary

 Discussion

(26)

Summary [1]

 DBMSs do not exploit architecture features of modern processors

 PAX offers a new technique for the placement of data

 PAX exhibits significant improvement in the number of L2 data cache misses

 Number of comparisons with NSM and DSM present better performance:

High data cache performance

Faster than NSM for DSS queries

 Orthogonal to other storage decisions

(27)

Summary [2] – Comparison

Criterion NSM DSM PAX

Inter-record spatial locality NO YES YES Low record reconstruction

cost YES NO YES

Implementation SIMPLE NO NO

Performance with large

number of attr. GOOD BAD GOOD

(28)

Outline

 Motivation

 Related Work

 PAX

 Experimental results

 Summary

 Discussion

(29)

Technical Questions

 Why there is no space overhead in comparison with NSM?

 Does the underlying platform affect the performance? (Linux/AMD)

Other Questions?

(30)

Critique

 Difficult Implementation

 No results regarding TPC-C (OLTP) workload presented. – Any data layout policy should exhibit relatively good

performance in both the two types of workloads

 What happens after deletions? Overhead for page re- organization

 What is the behavior with the variable-length attributes? Re- calculation of boundaries.

 Experimental setup very convenient for PAX. Fixed-line

queried attributes, exactly 4 times smaller than cache block

(31)

Points of Discussion

 Has this idea been implemented in any commercial product?

 Work for addressing the L1I cache misses problem? Or it is only task of the compiler?

 How about the new trends in computer

architecture?

(32)

Thank you!!

[email protected]

References

Related documents

En este trabajo hablamos de patronímico como el recurso denominativo que incorpora total o parcialmente, directa o indirectamente (siglas o acrónimos), el nombre

Beijing Yili Fine Chemicals Co., Ltd.. Balçik Köyü Yolu Üzeri GEPOSB Içi 3 Cd. Atatürk Bulvan Milas caddesi No.. Miranda de Ebro-Puentelarra, km.. INYECCIONES PLASTICAS. MECACONTROL

stvarno rabe internetsko bankarstvo nešto manji. Postoji statistika koja egzaktno dokazuje trend rasta korištenja internet bankarstva. To su podaci broja transakcija i

This paper describes several subsistence livelihoods (e.g. rice farming, fishing, and community forestry) that have traditionally been present in Northeast Thailand, a region known

Section number Photographs Page 11 Helm, Charles H., House Franklin County, Missouri Historic Resources of Washington, Missouri.

It is recommended that the number of the original pump is checked to ensure that the correct replacement part

South Asian Clinical Toxicology Research Collaboration, Faculty of Medicine, University of Peradeniya, Peradeniya, Sri Lanka.. Department of Medicine, Faculty of

To test whether immediate changes in CO2 concentration affect the food uptake, 30 copepods from pre-incubation at control seawater pH were placed in three 1 l bottles containing