SAS/SPECTRAVIEW Software
Annie Postic / Bengt Bengtsson
SAS Institute
Welcome
Visual Data Mining
• SAS/SPECTRAVIEW software
– Advanced Visualization Technology!
• Data Mining
– Turning data into profits!
SAS/SPECTRAVIEW and Data Mining
Introduction
• Business Challenges
• Data Mining Solution
• Importance of Data Visualization
• Advanced Visualization Technology
– SAS/SPECTRAVIEW software
• Business Example
The Challenges
• Turn large quantities of data into meaningful information
• Turn information into profits
• Gain a competitive advantage!
Business Drivers
• Customer Retention
– Acquiring a new customer is 10 times more expensive than keeping a current one.
• Profiling/Segmentation
– What are the traits of our most profitable customers?
Business Drivers
• Cross-Selling
– How can I sell additional products/services to customers based on what they have already purchased?
• Fraud Detection
– What are the characteristics of a fraudulent transaction?
The SAS Solution
• The Data Mining Solution
– SAS Institute defines data mining as:
«The process of selecting, exploring, and modeling large amounts of data to uncover previously unknown patterns for a business advantage»
The SAS Solution
• The Data Mining Solution
– These data stockpiles mainly contain customer data, but the data's hidden value, the potential to predict business trends and customer behavior, has largely gone untapped.
The Data Mining Process
SEMMA
– Sample - extract a portion of the data
– Explore - search for patterns/trends
– Modify - reduce the number of variables
– Model - analyze data
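As an illustration of the Sample step listed above, here is a minimal Python sketch of drawing a reproducible random subset from a larger table. The function name `draw_sample`, the 10% fraction and the integer record ids are assumptions made for this example only; they are not part of SAS/SPECTRAVIEW or of the SAS System.

```python
import random


def draw_sample(records, fraction, seed=0):
    """Return a simple random sample holding roughly `fraction` of the records."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)  # fixed seed so the same sample can be drawn again
    # Keep at least one record, but never ask for more than are available.
    size = min(len(records), max(1, round(len(records) * fraction)))
    return rng.sample(records, size)


# Hypothetical data: 1,000 record ids standing in for a large table.
records = list(range(1000))
sample = draw_sample(records, fraction=0.1)
print(len(sample), "records sampled out of", len(records))
```

Fixing the seed is a design choice for this sketch: it lets the Explore and Model steps be rerun on exactly the same subset.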
Data Exploration
• Data Visualization Software...
...is one of the most versatile tools for data mining exploration. It enables you to visually interpret complex patterns in multidimensional data. By viewing data summarized in multiple graphical forms and dimensions, you can uncover trends and spot outliers intuitively and immediately.
Data Exploration
• Data Visualization Software...
...In the data mining process, visualization tools help you explore data before modeling, and verify the results of other data mining techniques. Visualization tools are particularly useful for detecting patterns found in only small areas of the overall data.
Data Exploration
• SAS/SPECTRAVIEW software
– Advanced Visualization Technology
– Interactive Data Exploration
– 3D Animation and Color Coding
– Integrated component of the SAS System
Data Exploration
• SAS/SPECTRAVIEW software
– Explore up to 5 variables at one time using...
• Cutting planes
• Point Clouds
• Volume Rendering
Data Exploration
• SAS/SPECTRAVIEW 6.12 Enhancements
– Categorization - easy to read in data
– Visual Subsetting - easy to capture data
– 3D Probe - easy to pin-point values
– Navigation Tools - easy to manipulate data
Example
• Health sector
• Business Problem
– High costs for lengthy hospital stays and difficulty allocating beds
• Business Solution
– Better understand the length of stay in order to predict the number of occupied beds
Example
• The Process
– Examine the characteristics of lengthy hospital stays
• The Tool
– Explore data using SAS/SPECTRAVIEW
Example
• Patient Data from a Hospital
– Characteristics
• Hospital Length of Stay
• Country, City of residence, Age, Origin, Sex, etc.
• 90,000+ observations
• Modified data for confidentiality reasons
Example
• Examine Data
– Response Variable
• Average Length of Stay
– Independent Variables
• Age
• Origin
• Country
• City of residence
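The response variable described above, average length of stay broken down by the independent variables, can be sketched in Python as a simple grouped mean. The patient records, the field order and the helper name `average_stay` are hypothetical and made up for this illustration; they are not taken from the hospital data set.

```python
from collections import defaultdict

# Hypothetical patient records: (age_group, origin, country, length_of_stay_days).
patients = [
    ("30-39", "Europe", "France", 12),
    ("30-39", "Europe", "France", 18),
    ("30-39", "Africa", "France", 9),
    ("70-79", "Europe", "Italy", 14),
    ("70-79", "Europe", "Italy", 10),
]


def average_stay(records, key_fields):
    """Average length of stay per combination of the chosen fields.

    key_fields holds positions in each record, e.g. (0, 1) for age group and origin.
    The length of stay is assumed to be the last element of each record.
    """
    totals = defaultdict(lambda: [0.0, 0])  # key -> [sum of days, number of stays]
    for record in records:
        key = tuple(record[i] for i in key_fields)
        totals[key][0] += record[-1]
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


by_age_and_origin = average_stay(patients, key_fields=(0, 1))
for key, mean_days in sorted(by_age_and_origin.items()):
    print(key, round(mean_days, 1))
```

Passing a different `key_fields` tuple, for example `(2,)` for country alone, gives the other breakdowns without changing the function.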
Example
• Color Coding - Response Variable
– Average Length of Stay
• > 20 days as Yellow
• 10 - 20 days as Red
• < 10 days as Green
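The color coding above amounts to a threshold rule on the average length of stay. A minimal Python sketch follows; the function name `stay_color` is hypothetical, and treating exactly 10 and exactly 20 days as Red is an assumption based on reading "10 - 20 days" as an inclusive range.

```python
def stay_color(avg_days):
    """Map an average length of stay (in days) to the slide's color code."""
    if avg_days > 20:
        return "Yellow"
    if avg_days >= 10:  # assumption: both 10 and 20 days fall in the Red band
        return "Red"
    return "Green"


for days in (4.5, 10, 15.2, 20, 27):
    print(days, stay_color(days))
```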
[Figure: Average Length of Stay, All Countries, All Ages, by Origin (Origin = Europe). Axes: Countries, Age Groups (0-11 to 95 +), AVG Stay.]
[Figure: Average Length of Stay, All Countries, All Ages, by Origin (Origin = Africa). Axes: Countries, Age Groups (0-11 to 95 +), AVG Stay.]
Business Case
• Narrow in on individual countries
• Color Coding - Response Variable
– Average Length of Stay
• > 10 days as Red
• < 10 days as Green
• See if age group is a key attribute in our modeling process
[Figures: Average Length of Stay, All Origins, All Ages, by Country. Axes: Origin, Age Groups (0-11 to 95 +), AVG Stay.]
Findings
• Identified some exceptionally long stays for young people coming from certain countries
• Long stays occur in the higher age groups, but Americans behave differently
– Africans stay longer and are younger than Europeans, whatever their country of residence
– Very few Americans living in France go to the hospital
– No long stays for Americans older than 70 years
– Europeans between 30 and 40 years old coming from France have exceptionally long stays
Findings
• Standard statistical analysis
→ Very similar lengths of stay
Findings
                      France           Italy            Switzerland
>= 50 years of age    N=712            N=408            N=4473
                      Mean=12.3 days   Mean=13.4 days   Mean=13.4 days
< 50 years of age     N=408            N=871            N=5034
                      Mean=9 days      Mean=8 days      Mean=8 days
Conclusion
• Extract data to Model and continue with the Data Mining Process
– Handle Americans separately
– Use decision trees to find other determining characteristics (e.g. medical history, family background, ...)
– Model the length of stay using the influential characteristics
– Assess these characteristics for our length-of-stay forecasting process
Further Analysis
• Other Visualization methods
– Point Clouds
– Isosurfaces
– Cutting Planes
Further Analysis
• Point cloud
– PC sales studies
• By date, store and brand
[Figure: point cloud of PC sales. Axis labels: BRAND, STORE.]
Further Analysis
• Volume
– Pollution study
• By longitude, latitude, level and time period
Further Analysis
• Isosurface
– Pollution study
• By longitude, latitude, level and time period
Conclusion
• Visualization of the data helps us to
– Better understand data
– Spot patterns and trends not evident in just the numbers
– Discover new relationships
– Save time analyzing our data
• Reveals the subset of attributes likely to be most productive in the modeling phase of the data mining process
• Intuitive tools for the business professional