
(1)

SAS Intelligence Database: The technology behind the new database

Georg Morsing

[email protected]

(2)

New database technology for enterprise intelligence

[Diagram: SAS Intelligence Database compared with an RDBMS]

- Information
  - From large data volumes
  - Very fast
  - To many concurrent users
- Scalability through

(3)

New database technology for enterprise intelligence

[Diagram: SAS Intelligence Database compared with an RDBMS]

- Elimination of the transaction-oriented overhead of an RDBMS
- New database technology: organize data optimally

(4)

New database technology for enterprise intelligence

[Diagram: SAS Intelligence Database compared with an RDBMS]

- Many concurrent
- Small task
- Large task
- Use all hardware to the full

Designed for the task!

[Image label: Bicycle]

(5)

New database technology for enterprise intelligence

[Diagram: SAS Intelligence Database compared with an RDBMS]

Elimination of transaction-oriented overhead
Optimal use of the hardware

- Reduced disk usage: 2-4 times smaller
- Faster response time: up to 40 times faster

(6)

SAS Intelligence Database

A database dedicated to enterprise intelligence

- Fast
- Reliable and robust
  - 24/7, dedicated database
- Scalable
  - Achieve constant response time as data, queries, and users grow
- Low total cost of ownership
  - Reduced hardware requirements
  - Uses hardware optimally

[Chart: response time (y-axis) against more data, queries, and users (x-axis); series for RDBMS and for SAS IDB with 2, 4, and 6 CPUs]

(7)

SAS Intelligence Database: The technology behind the new database

1. Data partitioning
2. Parallel processing
3. Index
4. Connectivity: data and clients
5. Metadata, administration, and SAS® Intelligence Platform

[Diagram: four CPUs, operating system, applications, RAM, and two I/O controllers]

(8)

SAS Intelligence Database: Data partitioning

[Diagram: traditional table versus SAS IDB table; labels: Metadata, Data, Descriptor, Data, Data, System, Table]

(9)

SAS Intelligence Database: Data partitioning

[Diagram: a SAS IDB table (metadata and data) shown with the hardware: four CPUs, operating system, three applications, RAM, and two I/O controllers]


(11)

SAS Intelligence Database: Data partitioning with a cluster database

[Diagram: ETL processes run in parallel. Four load processes load Data A, Data B, Data C, and Data D into the SAS IDB tables Fact A, Fact B, Fact C, and Fact D. Next step: run the cluster program.]

(12)

SAS Intelligence Database: Data partitioning with a cluster database

[Diagram: ETL processes run in parallel. Four load processes load Data A to Data D into Fact A to Fact D. The cluster program is running and presents them as one cluster SAS IDB table.]

    /* Bind existing member tables into one cluster table */
    proc spdo library=mylib;
       cluster create ThirdQuarter2003
          mem=Oct2006
          mem=Nov2006
          mem=Dec2006;
    quit;

(13)

SAS Intelligence Database: Data partitioning with a cluster database

[Diagram: the cluster SAS IDB table, made up of Fact A, Fact B, Fact C, and Fact D and loaded by four parallel ETL load processes from Data A to Data D. Next step: run the un-cluster program.]

(14)

SAS Intelligence Database: Data partitioning with a cluster database

[Diagram: the un-cluster program is running; Fact A to Fact D are again separate SAS IDB tables, loaded by parallel ETL load processes.]

    proc spdo library=mylib;
       Uncluster ThirdQuarter2003;
    quit;

(15)

SAS Intelligence Database: Data partitioning with a cluster database

- Time-based cluster database

[Diagram: a SAS IDB table (data and metadata) beside a cluster SAS IDB table, in which cluster metadata ties together Table1 to Table8]

(16)

SAS Intelligence Database: Data partitioning with a cluster database

Lightning-fast update and cleanup

- Time-based cluster SAS IDB database
- Prepare new SAS IDB tables
- Run the un-cluster program
- Run the cluster program

[Callout: new SAS IDB tables do not affect the running SAS IDB database.]
[Callout: only new cluster metadata has to be created.]
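
The update cycle on this slide (prepare new tables, un-cluster, cluster again) can be pictured with a small Python model. This is only a sketch under stated assumptions: `Library`, `cluster_create`, `cluster_undo`, and the table names are hypothetical and do not correspond to any SAS API. The point it illustrates is that clustering only rewrites a list of member names, so no data is copied.

```python
from dataclasses import dataclass, field


@dataclass
class Library:
    """Toy stand-in for a database library (hypothetical, not a SAS API)."""
    tables: dict = field(default_factory=dict)    # table name -> list of rows
    clusters: dict = field(default_factory=dict)  # cluster name -> member table names

    def cluster_create(self, name, members):
        # Only metadata is written: member tables are referenced, not copied.
        missing = [m for m in members if m not in self.tables]
        if missing:
            raise KeyError(f"unknown member tables: {missing}")
        self.clusters[name] = list(members)

    def cluster_undo(self, name):
        # Dropping the cluster metadata leaves the member tables untouched.
        return self.clusters.pop(name)

    def read(self, name):
        # Reading a cluster walks its member tables in order.
        for member in self.clusters[name]:
            yield from self.tables[member]


lib = Library()
lib.tables.update({"oct": [1, 2], "nov": [3], "dec": [4, 5]})
lib.cluster_create("quarter", ["oct", "nov", "dec"])

# Refresh: load a corrected November table on the side, then swap the metadata.
lib.tables["nov_v2"] = [30, 31]
members = lib.cluster_undo("quarter")
members[members.index("nov")] = "nov_v2"
lib.cluster_create("quarter", members)

print(list(lib.read("quarter")))
```

In this model the old `nov` table still exists after the swap, which mirrors the cleanup step: it can be dropped later without touching the cluster.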

(17)

SAS Intelligence Database: Data partitioning with a dynamic cluster database

- Prepare new tables
- Run cluster add

[Diagram: cluster metadata with monthly member tables Jan 2003 to Dec 2004, and new monthly tables Jan 2005 to Apr 2005]

(18)

SAS Intelligence Database: Data partitioning with a dynamic cluster database

[Diagram: new monthly tables Jan 2005 to Jun 2005; cluster metadata with member tables Jan 2003 to Dec 2004; and cluster metadata with member tables Jan 2003 to Aug 2004 plus Jan 2005 to May 2005]
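
A dynamic cluster grows by attaching prepared tables to the existing member list. The Python sketch below models that single step; `cluster_add` is a hypothetical helper that represents the metadata change only, and the month names simply reuse the labels from the diagram.

```python
def cluster_add(cluster_members, prepared_tables):
    """Append prepared member-table names to a cluster's member list.

    Hypothetical helper: it models only the metadata change, not a SAS API.
    """
    duplicates = set(cluster_members) & set(prepared_tables)
    if duplicates:
        raise ValueError(f"already in cluster: {sorted(duplicates)}")
    return cluster_members + prepared_tables


months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Existing cluster: one member table per month, Jan 2003 to Dec 2004.
cluster = [f"{m} {y}" for y in (2003, 2004) for m in months]

# New tables are prepared outside the cluster, then attached in one step.
prepared = [f"{m} 2005" for m in months[:4]]
cluster = cluster_add(cluster, prepared)

print(len(cluster), cluster[0], cluster[-1])
```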

(19)

SAS Intelligence Database: The technology behind the new database

1. Data partitioning
   - Cluster database
2. Parallel processing
3. Index
4. Connectivity: data and clients
5. Metadata, administration, and SAS® Intelligence Platform

[Diagram: four CPUs, operating system, applications, RAM, and two I/O controllers]


(21)

SAS Intelligence Database: Parallel processing

Optimal hardware utilization

- Built-in parallel program logic:
  - WHERE processing
  - Sorting
  - GROUP BY
  - Table join
  - Multi-index builds and updates

(22)

SAS Intelligence Database: Parallel processing

[Diagram: Threads 1 to 4 each read partial data files (numbered 1 to 8), apply WHERE, KEEP, SORT, SUMMARIZE, GROUP BY, and so on, and produce partial results, which are then combined into the final result]
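
The pattern in this diagram (partial data files, partial results, then a combining step) can be sketched in Python. The data, the filter, and the function name `partial_result` are illustrative assumptions; Python threads are used here only to show the structure and are not a statement about how the database schedules its work.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Eight partial data files (illustrative rows: (region, amount)).
partitions = [
    [("north", 10), ("south", 5)],
    [("north", 7)],
    [("east", 3), ("south", 8)],
    [("east", 1)],
    [("north", 2), ("east", 9)],
    [("south", 4)],
    [("north", 6)],
    [("east", 5), ("south", 1)],
]


def partial_result(rows):
    """WHERE amount >= 3, then GROUP BY region with SUM(amount), for one file."""
    totals = Counter()
    for region, amount in rows:
        if amount >= 3:
            totals[region] += amount
    return totals


# Four worker threads, as on the slide; each task handles one whole partition.
with ThreadPoolExecutor(max_workers=4) as pool:
    partials = list(pool.map(partial_result, partitions))

# Combine the partial results into the final answer.
result = sum(partials, Counter())
print(dict(result))
```

Because each partial result is computed from one file only, the combining step is a plain sum and needs no access to the underlying rows.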

(23)

SAS Intelligence Database: Parallel processing of a table join

[Diagram: the partitioned data files of Table A and Table B are sorted in parallel by Threads 1 to 4. The threaded sort on Table A gives A1 to A4 and the threaded sort on Table B gives B1 to B4. With Table A and Table B sorted, the parts are paired as A1-B1, A2-B2, A3-B3, and A4-B4.]
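
One common way to join two tables once both are sorted is a sort-merge join. The Python sketch below assumes that technique for illustration; the slide does not name the join algorithm, and `threaded_sort`, `merge_join`, and the sample rows are hypothetical.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def first_field(row):
    return row[0]


def threaded_sort(partitions, key):
    """Sort each partition in its own task, then merge the sorted runs."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        runs = list(pool.map(lambda part: sorted(part, key=key), partitions))
    return list(heapq.merge(*runs, key=key))


def merge_join(left, right, key):
    """Inner join of two inputs that are already sorted on `key`."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = key(left[i]), key(right[j])
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Pair the current left row with every right row sharing the key.
            k = j
            while k < len(right) and key(right[k]) == lk:
                out.append((left[i], right[k]))
                k += 1
            i += 1
    return out


# Partitioned data files of two tables (illustrative rows: (key, label)).
table_a = [[(3, "a3"), (1, "a1")], [(2, "a2")], [(4, "a4")], [(2, "a2b")]]
table_b = [[(2, "b2")], [(4, "b4"), (1, "b1")], [(5, "b5")], [(2, "b2b")]]

sorted_a = threaded_sort(table_a, first_field)
sorted_b = threaded_sort(table_b, first_field)
joined = merge_join(sorted_a, sorted_b, first_field)
print(joined)
```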

(24)

SAS Intelligence Database: Parallel processing of index creation

[Diagram: partitioned data files, then parallel creation of the index, then the finished index]

(25)

SAS Intelligence Database: Advanced hybrid index technology

[Diagram: traditional table versus SAS IDB table; labels: Metadata, Data, Descriptor, Data, Index, Metadata]

- Parallel index evaluation
- Multiple threads evaluate the WHERE clause simultaneously
- Index metadata
- Index segments

(26)

SAS Intelligence Database: Advanced hybrid index technology

- Parallel index evaluation
- Multiple threads evaluate the WHERE clause simultaneously
- Index metadata and segments

    where column_a in ('A','B');

[Diagram: an index on column_a with index metadata and Segments 1 to 8; Segments 2, 3, 4, and 5 are shown again for this WHERE clause]
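
The idea of index metadata pointing at the segments worth reading can be sketched in Python. Everything here is an assumption made for illustration: the segment contents, the `index_metadata` layout, and the function `where_in` are hypothetical, chosen so that the values 'A' and 'B' occur only in Segments 2 to 5 as in the diagram.

```python
from concurrent.futures import ThreadPoolExecutor

# Illustrative index on column_a: each segment maps a value to row ids.
segments = {
    1: {"C": [0, 1]},
    2: {"A": [2], "C": [3]},
    3: {"B": [4, 5]},
    4: {"A": [6], "B": [7]},
    5: {"A": [8, 9]},
    6: {"D": [10]},
    7: {"C": [11], "D": [12]},
    8: {"E": [13]},
}

# Index metadata: for each value, the segments in which it occurs.
index_metadata = {}
for seg_id, entries in segments.items():
    for value in entries:
        index_metadata.setdefault(value, set()).add(seg_id)


def where_in(values):
    """Evaluate `column_a in values`, touching only the relevant segments."""
    wanted = set(values)
    relevant = sorted(set().union(*(index_metadata.get(v, set()) for v in wanted)))

    def scan(seg_id):
        return [rid for v, rids in segments[seg_id].items() if v in wanted for rid in rids]

    # Each relevant segment is evaluated by its own task.
    with ThreadPoolExecutor(max_workers=4) as pool:
        hits = list(pool.map(scan, relevant))
    return relevant, sorted(rid for part in hits for rid in part)


relevant, row_ids = where_in(["A", "B"])
print(relevant, row_ids)
```

Segments 1, 6, 7, and 8 are never opened in this sketch, which is the saving the index metadata is meant to provide.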

(27)

SAS Intelligence Database: Advanced hybrid index technology

- Cached index metadata
  - MIN, MAX, COUNT, COUNT DISTINCT, NMISS, RANGE
- Avoid a full table scan

    select count(*) from table;

"Index statistics table"

[Diagram: a SAS IDB table; labels: Metadata, Data, Metadata, Index, and an X mark]
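
Answering aggregate queries from cached statistics instead of scanning the table can be sketched as follows. The class `IndexStats`, its field names, and `build_stats` are hypothetical; the sketch assumes the statistics are computed once when the index is built and reused afterwards.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexStats:
    """Statistics kept alongside an index (illustrative field names)."""
    count: int       # non-missing values
    nmiss: int       # missing values
    distinct: int
    minimum: float
    maximum: float

    @property
    def value_range(self):
        return self.maximum - self.minimum

    @property
    def row_count(self):
        # Equivalent of count(*): all rows, missing or not.
        return self.count + self.nmiss


def build_stats(column):
    """One pass over the column when the index is built or updated."""
    present = [v for v in column if v is not None]
    return IndexStats(
        count=len(present),
        nmiss=len(column) - len(present),
        distinct=len(set(present)),
        minimum=min(present),
        maximum=max(present),
    )


column = [4.0, None, 9.5, 4.0, 1.5, None, 7.0]
stats = build_stats(column)  # cached; the queries below never touch `column`

print(stats.row_count, stats.count, stats.nmiss, stats.distinct,
      stats.minimum, stats.maximum, stats.value_range)
```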

(28)

SAS Intelligence Database: Access to data from any client

SAS® client

- LIBNAME statement
  - Opens access from all SAS clients
  - Transparent, like any other database

    libname mylib sasspds "class"
       server=localhost.5200
       user='student' passwd='Metadata0';

Data type: SAS Intelligence Database

(29)

SAS Intelligence Database: Access to data from any client

SAS® client

- SQL pass-through facility
  - SQL is executed on the server, not on the client
  - SQL is executed inside SAS Intelligence Database

    proc sql;
       select *
          from connection to sasspds
             ( . . . sasspds code . . . );

(30)

SAS Intelligence Database: Access to data from any client

Other clients:

- ODBC
  - Windows client
- JDBC
  - Java program with a web interface
- htmlSQL
  - Web application

(31)

SAS® Intelligence Platform: Role-based applications

[Diagram: the roles information consumer, advanced user / analyst, business administrator, data integration expert, and technical administrator, shown with a web server, server tier, metadata, data warehouse, and external data sources. Callouts: "Creates and updates SAS IDB"; "User management, logging, integration". Further label: SAS Intelligence]

(32)

SAS® Intelligence Platform

SAS® Management Console: Administration

- Administration of all resources
  - Servers
  - Databases
  - Users, groups, roles
  - Reports
  - SAS® Stored Processes
  - SAS® licenses
  - Job scheduling

(33)

SAS Intelligence Database: The technology behind the new database

1. Data partitioning
2. Parallel processing
3. Index
4. Connectivity: data and clients
5. Metadata, administration, and SAS® Intelligence Platform

[Diagram: four CPUs, operating system, applications, RAM, and two I/O controllers]
