Programming Techniques for SAS In-Memory Analytics with Hadoop

Academic year: 2021

Copyright © SAS Institute Inc. All rights reserved.

What Is Hadoop?

The Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models.

Core Hadoop Modules

Core Hadoop modules include the following:

• HDFS (Hadoop Distributed File System): A file system that distributes large files across the Hadoop cluster of computers.

• Hadoop YARN: A framework for job scheduling and cluster resource management.

• Hadoop MapReduce: A YARN-based system for parallel processing of large data sets.
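The MapReduce programming model named above can be illustrated with a minimal single-process sketch in Python. This is not the Hadoop API and does not run on a cluster; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only to show the map, shuffle, and reduce steps on a word-count task.

```python
from collections import defaultdict

def map_phase(line):
    # Map: emit one (key, value) pair per word in the input line.
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    # Shuffle: group all emitted values by key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # Reduce: combine the values collected for one key.
    return key, sum(values)

def word_count(lines):
    # Run the three phases in sequence on an in-memory list of lines.
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

counts = word_count(["big data", "big cluster"])
```

In a real cluster, the map and reduce calls would run in parallel on different nodes, with YARN scheduling the tasks; here everything runs in one process for clarity.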

HDFS: Hadoop Distributed File System

Using HDFS Commands and Files

This HDFS command moves a local file into the HDFS cluster:
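As a hedged illustration only: the standard Hadoop shell offers `hdfs dfs -put` (copy a local file into HDFS) and `hdfs dfs -moveFromLocal` (copy, then delete the local source). Which of these the slide used is an assumption. The Python sketch below builds the argument list for either form; the helper names and the example paths are hypothetical.

```python
import subprocess
from typing import List

def hdfs_upload_command(local_path: str, hdfs_dir: str, move: bool = False) -> List[str]:
    # Build the argv list for the Hadoop file system shell.
    # -put copies the file; -moveFromLocal also removes the local copy.
    action = "-moveFromLocal" if move else "-put"
    return ["hdfs", "dfs", action, local_path, hdfs_dir]

def run_upload(local_path: str, hdfs_dir: str, move: bool = False) -> None:
    # Execute the command; requires a configured Hadoop client on PATH.
    subprocess.run(hdfs_upload_command(local_path, hdfs_dir, move), check=True)

# Hypothetical paths, shown without executing anything:
cmd = hdfs_upload_command("sales.csv", "/user/student")
```

Calling `run_upload("sales.csv", "/user/student")` would execute the command on a machine where the `hdfs` client is installed and configured for the cluster.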

Base SAS Interfaces for Hadoop

Tool: FILENAME statement
Purpose: Enables the DATA step to read and write HDFS data files.

Tool: PROC HADOOP
Purpose:

• Copy or move files between SAS and Hadoop.

• Execute Hadoop file system commands to manage files and directories.

SAS/ACCESS Interface to Hadoop

Tool: SQL pass-through
Purpose:

• Submit HiveQL queries and other HiveQL statements from SAS directly to Hive for Hive processing.

• Query results are returned to SAS.

Tool: LIBNAME statement for Hadoop
Purpose:

• Use the SAS programming language to access Hive tables as SAS data sets.

Additional SAS Technologies for Hadoop

Hadoop is also one of the file storage systems that SAS uses for SAS In-Memory Analytics product solutions.

• SAS High-Performance Analytics products

• SAS Visual Analytics

• SAS Visual Statistics

Base SAS: FILENAME for Hadoop and PROC HADOOP

SAS/ACCESS: SQL Pass-Through and LIBNAME Statement

Diagram labels: SAS metadata server, SAS workspace server

SAS In-Memory Analytics Architecture for Hadoop

SAS In-Memory Interfaces for Hadoop

Interface: High-Performance Analytics Procedures
Purpose: Perform complex analytical computations on Hadoop tables within the data nodes of the Hadoop distribution via the SAS procedure language. HPDS2 allows for manipulation of the data structure (column derivation).
Product: SAS High-Performance Analytics Solutions

Interface: SAS Visual Analytics and SAS Visual Statistics
Purpose: Web interfaces to generate graphical visualizations of data distributions, relationships, and analytical reports on Hadoop tables that are pre-loaded into memory within the data nodes of the Hadoop distribution.
Product: SAS Visual Analytics and SAS Visual Statistics

SAS In-Memory Interfaces for Hadoop

Interface: High-Performance Analytics Procedures
Purpose: Perform complex analytical computations on Hadoop tables within the data nodes of the Hadoop distribution via the SAS procedure language.
Product: SAS High-Performance Analytics Solutions

Interface: Web browser
Purpose: Web interfaces to generate graphical visualizations of data distributions, relationships, and analytical reports on Hadoop tables that are pre-loaded into memory within the data nodes of the Hadoop distribution.
Product: SAS Visual Analytics and SAS Visual Statistics

Interface: PROC IMSTAT, PROC LASR, and several other procedures and global statements
Purpose: A programming interface to perform complex analytical calculations on Hadoop tables that are pre-loaded into memory within the data nodes of the Hadoop distribution.
Product: SAS In-Memory Statistics

In-Memory Analytics

Diagram labels: SAS Client, SAS Metadata Server, SAS Workspace Server, SAS In-Memory Analytics Root Node, SAS In-Memory Analytics Worker Node (three), Hadoop NameNode, Hive, Hadoop DataNode 1, Hadoop DataNode 2, Hadoop DataNode 3

A SAS process in the root node

SAS processes in each HDFS data node execute in parallel.
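The root-and-worker pattern can be pictured with a small Python sketch. It is a generic illustration of parallel partial computation, not how SAS implements its in-memory engine: threads stand in for worker nodes, in-memory lists stand in for the data held on each data node, and the names `worker_partial_sum` and `root_mean` are hypothetical. It assumes at least one non-empty partition.

```python
from concurrent.futures import ThreadPoolExecutor

def worker_partial_sum(partition):
    # Each worker summarizes only the rows stored on its own node.
    return sum(partition), len(partition)

def root_mean(partitions):
    # The root hands one partition to each worker, then combines the partial results.
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partials = list(pool.map(worker_partial_sum, partitions))
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count

# Three partitions, standing in for the data on three data nodes.
mean = root_mean([[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]])
```

The point of the design is that only small partial results (a sum and a count per node) travel back to the root, while the bulk of the data stays where it is stored.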

SAS Technologies That Use In-Memory Analytics Grids

These SAS High-Performance Analytics products use a SAS High-Performance grid:

• Statistics

• Data Mining

• Text Mining

• Econometrics

• Forecasting

• Optimization

These products use a SAS LASR Analytic Server grid:

• Visual Analytics

• Visual Statistics

• In-Memory Statistics
