
Visual Data mining SAS/SPECTRAVIEW Software


Academic year: 2021


SAS/SPECTRAVIEW Software

Annie Postic / Bengt Bengtsson

SAS Institute

Welcome

Visual Data mining

SAS/SPECTRAVIEW and Data Mining

Introduction

• SAS/SPECTRAVIEW software
  – Advanced Visualization Technology!
• Data Mining
  – Turning data into profits!

Introduction

• Business Challenges
• Data Mining Solution
• Importance of Data Visualization
• Advanced Visualization Technology
  – SAS/SPECTRAVIEW software
• Business Example

The Challenges

• Turn large quantities of data into meaningful information
• Turn information into profits
• Gain a competitive advantage!

Business Drivers

• Customer Retention
  – It is 10 times more expensive to acquire new customers than to keep the customers we currently have.
• Profiling/Segmentation
  – What are the traits of our most profitable customers?

Business Drivers

• Cross-Selling
  – How can I sell additional products/services to customers based on what they have already purchased?
• Fraud Detection
  – What are the characteristics of a fraudulent transaction?

The SAS Solution

• The Data Mining Solution
  – SAS Institute defines data mining as:
    «The process of selecting, exploring, and modeling large amounts of data to uncover previously unknown patterns for a business advantage»

The SAS Solution

• The Data Mining Solution
  – These data stockpiles mainly contain customer data, but the data's hidden value, the potential to predict business trends and customer behavior, has largely gone untapped.

Data Mining Process

• The Data Mining Process: SEMMA
  – Sample - extract portion of data
  – Explore - search for patterns/trends
  – Modify - reduce # of variables
  – Model - analyze data
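The four steps listed on this slide can be illustrated with a small Python sketch. It is only an illustration under stated assumptions, not SAS code: the record fields (`age`, `sex`, `stay`), the 50-year age split, and the function name `semma_sketch` are all hypothetical, and the "model" is nothing more than a mean per age band.

```python
import random
import statistics


def semma_sketch(rows, sample_size, seed=0):
    """Walk through Sample, Explore, Modify and Model on a list of dicts.

    Each row is assumed (hypothetically) to have 'age', 'sex' and 'stay' keys.
    """
    rng = random.Random(seed)

    # Sample: extract a portion of the data.
    sample = rng.sample(rows, min(sample_size, len(rows)))

    # Explore: compute a simple summary to look for patterns.
    summary = {
        "n": len(sample),
        "mean_stay": statistics.mean(r["stay"] for r in sample),
    }

    # Modify: reduce the number of variables (drop 'sex', coarsen 'age').
    modified = [
        {"band": "50+" if r["age"] >= 50 else "<50", "stay": r["stay"]}
        for r in sample
    ]

    # Model: a deliberately trivial model, the mean stay per age band.
    by_band = {}
    for r in modified:
        by_band.setdefault(r["band"], []).append(r["stay"])
    model = {band: statistics.mean(stays) for band, stays in by_band.items()}

    return summary, model


rows = [
    {"age": 30, "sex": "F", "stay": 8},
    {"age": 60, "sex": "M", "stay": 12},
    {"age": 70, "sex": "F", "stay": 14},
    {"age": 20, "sex": "M", "stay": 6},
]
summary, model = semma_sketch(rows, sample_size=4)
```

With a sample size equal to the number of rows, every record is used, so the result does not depend on the random seed.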


Data Exploration

• Data Visualization Software...
  ...is one of the most versatile tools for data mining exploration. It enables you to visually interpret complex patterns in multidimensional data. By viewing data summarized in multiple graphical forms and dimensions, you can uncover trends and spot outliers intuitively and immediately.

Data Exploration

• Data Visualization Software...
  ...In the data mining process, visualization tools help you explore data before modeling and verify the results of other data mining techniques. Visualization tools are particularly useful for detecting patterns found in only small areas of the overall data.

Data Exploration

• SAS/SPECTRAVIEW software
  – Advanced Visualization Technology
  – Interactive Data Exploration
  – 3D Animation and Color Coding
  – Integrated component of the SAS System

Data Exploration

• SAS/SPECTRAVIEW software
  – Explore up to 5 variables at one time using...
    • Cutting planes
    • Point Clouds
    • Volume Rendering

Data Exploration

• SAS/SPECTRAVIEW 6.12 Enhancements
  – Categorization - easy to read in data
  – Visual Subsetting - easy to capture data
  – 3D Probe - easy to pin-point values
  – Navigation Tools - easy to manipulate data

Example

• Health sector
• Business Problem
  – High costs for lengthy hospital stays and difficulty allocating beds
• Business Solution
  – Better understand the length of stay to be able to predict the number of occupied beds

Example

• The Process
  – Examine characteristics of lengthy hospital stays
• The Tool
  – Explore data using SAS/SPECTRAVIEW

Example

• Patient Data from a Hospital
  – Characteristics
    • Hospital Length of Stay
    • Country, City of residence, Age, Origin, Sex etc
    • 90,000+ observations
    • Modified data for confidentiality reasons

Example

• Examine Data
  – Response Variable
    • Average Length of Stay
  – Independent Variables
    • Age
    • Origin
    • Country
    • City of residence

Example

• Color Coding - Response Variable
  – Average Length of Stay
    • > 20 days as Yellow
    • 10 - 20 days as Red
    • < 10 days as Green
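The color coding on this slide amounts to averaging the response variable per cell and binning the result. The sketch below shows one way this could look in Python; it is not how SAS/SPECTRAVIEW implements it. The record layout `(country, age_group, days)`, the function names, and the choice to put exactly 10 and exactly 20 days in the Red class are assumptions, since the slide does not say how the boundaries are handled.

```python
from collections import defaultdict


def stay_color(avg_days):
    """Map an average length of stay to the slide's three color classes."""
    if avg_days > 20:
        return "yellow"
    if avg_days >= 10:  # assumption: 10 and 20 themselves count as Red
        return "red"
    return "green"


def color_cells(records):
    """Average the stay per (country, age_group) cell, then color each cell."""
    totals = defaultdict(lambda: [0.0, 0])  # cell -> [sum of days, count]
    for country, group, days in records:
        cell = totals[(country, group)]
        cell[0] += days
        cell[1] += 1
    return {key: stay_color(total / count) for key, (total, count) in totals.items()}


# Made-up records for illustration only.
records = [
    ("France", "30-39", 25),
    ("France", "30-39", 21),
    ("Italy", "0-11", 5),
    ("Italy", "0-11", 15),
]
colors = color_cells(records)
```

Each key of `colors` is one cell of the display; the value is the color that cell would receive under the thresholds above.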

[Figure: Average Length of Stay, All Countries, All Ages, by Origin. Origin = Europe. Axes: Countries, Age Groups (0-11 to 95 +), AVG Stay.]

[Figure: Average Length of Stay, All Countries, All Ages, by Origin. Origin = Africa. Axes: Countries, Age Groups (0-11 to 95 +), AVG Stay.]

Business Case

• Narrow in on individual countries
• Color Coding - Response Variable
  – Average Length of Stay
    • > 10 days as Red
    • < 10 days as Green
• See if age group is a key attribute in our modeling process

[Figures: Average Length of Stay, All Origins, All Ages, by Countries. Axes: Origin, Age Groups (0-11 to 95 +), AVG Stay.]

Findings

• Identify some exceptionally long stays for young people coming from certain countries
• Long stays occur in the higher age groups, but American people behave differently
  – African people stay longer and are younger than European people, whatever their country of residence
  – Very few Americans living in France go to hospital
  – No long stays for Americans older than 70 years
  – European people between 30 and 40 years old coming from France have exceptionally long stays

Findings

• Standard statistical analysis
  – Very similar lengths

                      France           Italy            Switzerland
>= 50 years of age    N=712            N=408            N=4473
                      Mean=12.3 days   Mean=13.4 days   Mean=13.4 days
< 50 years of age     N=408            N=871            N=5034
                      Mean=9 days      Mean=8 days      Mean=8 days
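To compare the countries on a single number, the per-age-group means on this slide can be pooled into one mean per country, weighting each group by its N. The short Python sketch below does this. It assumes the figures are assigned to France, Italy and Switzerland in the order they appear on the slide, which is a reading of the flattened layout rather than something the source states explicitly; the function name `pooled_mean` is likewise illustrative.

```python
def pooled_mean(groups):
    """Combine per-group (n, mean) pairs into one overall weighted mean."""
    total_n = sum(n for n, _ in groups)
    return sum(n * mean for n, mean in groups) / total_n


# (N, mean days) for ">= 50 years" and "< 50 years", as read from the slide.
# The column-to-country assignment is an assumption.
table = {
    "France": [(712, 12.3), (408, 9.0)],
    "Italy": [(408, 13.4), (871, 8.0)],
    "Switzerland": [(4473, 13.4), (5034, 8.0)],
}
overall = {country: pooled_mean(groups) for country, groups in table.items()}
```

The pooled value depends on each country's age mix as well as on its group means, so it complements the per-group comparison rather than replacing it.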

Conclusion

• Extract data to Model and continue with the Data Mining Process
  – Handle Americans separately
  – Use decision trees to find other determining characteristics (e.g. Medical History, Family background, ...)
  – Model the length of stay using the influential characteristics
  – Assess these characteristics for our length forecasting process

Further Analysis

• Other Visualization methods
  – Point Clouds
  – Isosurfaces
  – Cutting Planes

Further Analysis

• Point cloud
  – PC sales studies
    • By date, store and brand

[Figure: point cloud. Labeled axes: BRAND, STORE.]


Further Analysis

• Volume
  – Pollution study
    • By longitude, latitude, level and time period

Further Analysis

• Isosurface
  – Pollution study
    • By longitude, latitude, level and time period

Conclusion

• Visualization of the data helps us to
  – Better understand data
  – Spot patterns and trends not evident in just the numbers
  – Discover new relationships
  – Save time analyzing your data
• Reveals the subset of attributes likely to be most productive in the modeling phase of the data mining process
• Intuitive tools for the business professional

Thank you for your attention
