TDWI Best Practice BI & DW Predictive Analytics & Data Mining
Course Length : 9am to 5pm, 2 consecutive days
2012 Dates : Sydney: July 30 & 31; Melbourne: August 2 & 3; Canberra: August 6 & 7
Venue & Cost : Visit C3 Education at www.c3businesssolutions.com
Inclusions : Morning tea, lunch & afternoon tea each day; course workbook & presentation notes
Course Outline
This two-day hands-on methodical workshop offers a comprehensive project-level orientation to:
1. Predictive analytics solutions -- from project assessment and preparation to industry standard process, case demonstrations, pragmatic exercises, and model lifecycle management.
2. Data mining methods and process at the tactical level.
Day 1: Predictive analytics. Low-risk strategies for high-impact projects
If you are looking for an intensive, vendor-neutral and non-promotional introduction to data mining best practices and an approach to predictive analytics that is critical to modeling success, then this course is designed for you. Those in attendance will actively step through the industry-standard process for data mining and realize why an advanced degree in statistics, mathematics, or computer science is not required to establish a productive internal predictive analytics practice. Live working sessions reveal real-world obstacles and breakthroughs from which to interpret, learn, and apply.
You Will Learn
Process, principles and terminology for predictive analytics
Who is utilizing predictive analytics, and why
Common project pitfalls and how to avoid them
Project performance and maintenance issues
How to define business objectives for a decision-support system
Hands-on exposure to the natural messiness of data mining
How to get started
Day 2: Data mining methods & techniques. Data preparation, model-building and evaluation
Attendees will observe demonstrations of machine learning methods and computer-guided analytical techniques for extracting and interpreting complex patterns and relationships from large volumes of data. If you desire an intensive functional orientation to data mining concepts, tools, techniques, and supporting methods, this session is designed for you.
This vendor-neutral course broadly covers data-driven information discovery techniques and model-building tactics free of bias to any particular modeling tool or method. Popular open source and commercial packages are leveraged to illustrate methods, but not to showcase the tools.
You Will Learn
The data mining process and general implementation
How to prepare raw data and benefit from visualization
Key data mining methods and how they compare
How to validate models and assess their value
Data mining product selection
Solution integration, ongoing performance, and maintenance
Where to begin and how to obtain resources and support
Ideal for
IT professionals who wish to expand their business intelligence skills
IT/IS executives and managers: CIOs, CKOs, CTOs, technical directors
Project leaders who must extract value from their data
Line of business executives and functional managers, analysts and forecasters
Decision support system architects
Technology planners who survey emerging technologies in order to prioritize corporate investment
Consultants requiring competency in data mining and related emerging information technologies
Presenter
Thomas A. (Tony) Rathburn
Senior Consultant, The Modeling Agency
Tony Rathburn has more than 20 years of experience in the business utilization of predictive analytics technologies. Mr. Rathburn taught MIS and statistics while an instructor in the College of Business at Kent State University. He also served as vice president of applied technologies for NeuralWare, Incorporated, a neural network tools and consulting company. Mr. Rathburn is a senior consultant with The Modeling Agency, a Pennsylvania company that provides guidance and results for those who are data rich, yet information poor.
Registration
Please register your interest on the Education page to secure your place and receive date confirmation notifications.
About TDWI
TDWI, a division of 1105 Media, is the premier provider of in-depth, high-quality education and research in the business intelligence and data warehousing industry. Starting in 1995 with a single conference, TDWI is now a comprehensive resource for industry information and professional development opportunities. TDWI sponsors and promotes quarterly World Conferences, topical seminars, onsite education, a worldwide Membership program, business intelligence certification, resourceful publications, industry news, an in-depth research program, and a comprehensive website, www.tdwi.org.
Course Detail: TDWI Best Practice BI & DW Predictive Analytics & Data Mining
Day 1: Predictive analytics. Low-risk strategies for high-impact projects
Core Concepts
Beyond Traditional Statistics
o ‘Assumptions’ of Traditional Statistics
o Shift your thinking…
Behaviours of Interest
Goal of Modeling
Modeling Human Behaviour
Components of Mathematical Models
Uses of Formulas
Winning at the ‘game’ we call business
Attributes of a game
Project Success Survey
Predictive Analytics ROI Survey
Predictive Analytics Business Goals
Analytic Goals
Why Predictive Analytics?
Lab 1: Introduction
Types of Models
Response
Risk
Attrition
Activation
Cross-Sell and Up-Sell
Profile Analysis
Segmentation
Net Present Value
Lifetime Value
Why Predictive Analytics?
Low-Risk / High-ROI Project Design
Low-Risk / High-ROI Projects
Phased Development Cycle
Positive Impact Behaviour Modeling
Negative Impact Behaviour Modeling
Conflict Resolution Modeling
Ranking Across the Continuum
Dimensionality Enhancement
Refining Precision
Forecasting
Lab 2: Opportunity Conceptualization
The CRISP-DM Process Model
o Business Understanding
o Relationship Solution Space
o Determine Modeling Objectives
o Data Understanding
o Data Preparation
o Modeling
o Validation
o Implementation
A Sampling of Commercial Data Mining Software Products
Predictive Analytics is Analysis… not Engineering
Business Understanding
Project Team
Performance Metrics
o Determine Business Objectives
o Conversion: Objectives to Metrics
o Handling Multiple Metrics
o Lift and Gains Chart Interpretation
o Custom Performance Charts
o Enhancing Performance with Threshold Evaluation
o Calculating the Current Baseline: Uplift Analysis
Lab 3: Performance Metrics
Modeling Objectives
o The Case for Classification
o Prioritize the Dependent Variable
o Precision Requirements
o Training for what “should be done…” not what “was done”
o Confirming Compatibility
o Defining Modeling Objectives
o Resource Availability
Experimental Design
o How Much Data is Needed to Develop a Model?
o Training Data for Classification
o Training Data for Prediction
o How Many Variables?
o Purpose of Experimental Design
o Experimental Design: Statistics vs. Predictive Analytics
o Data Sets Used
o Type of Data Distribution
Lab 4: Experimental Design & Data Sandbox Construction
Data Understanding
Data Set Determination
Availability
Requirements Planning
Data Quality Issues
o Data Errors
o Outliers
o Missing Data
Data Types: Behavioural Characteristics
o Behavioural Data
o Psychographic Data
Data Types: Mathematical Characteristics
o Qualitative Variables
  Categorical Data
  Nominal Data
o Quantitative Variables
  Ordinal Data
  Interval Data
  Continuous Data
Lab 5: Data Understanding
Data Preparation
Data Representation Expectations
o Natural Values
o Binning
o Bin Boundary Determination
o Open Ended Ranges
o Collapsed Sets
o 1ofN Representations
o Thermometer Representation
o Bipolar Representation
o Fuzzy Boundaries
o Multiple Boundary Strategies
o Controlling Error
Data Transformation Expectations
o Conversion to Linear
o Converting the Shape of the Distribution
o Ratios
o Roll-ups
o Domain Specific Transformations
Data Resource Consumption Considerations
Data Extraction for Replicability
Lab 6: Data Preparation
Modeling
Matching Techniques to the Project Goals
o Classification Modeling Techniques
o Forecasting Modeling Techniques
Variable Selection
Candidate Model Evaluation
Lab 7: Model Development
Validation & Evaluation
Lab 8: Model Evaluation & Validation
Deployment
End User Interface
Model Run Cycle
Summary and Next Steps
Formal Project Assessment
o Business Understanding
o Data Understanding
o Report of Findings & Recommendations
Resources
Day 2: Data mining methods & techniques. Data preparation, model-building and evaluation
Introduction
Why build predictive models?
Why use advanced technologies?
Why project definition is critical
Why standard statistical analysis is not enough
Shifting our focus
Data Understanding and Preparation
The Data Sandbox
o The opportunity development space
o Supporting phased development efforts
o Capturing extract detail
o Documenting representation and transformation options
o Data sets developed
o Training
o Test
o Validation
Data Types – Content attributes
o Demographic variables
o Psychographic variables
o Behavioral variables
o RFM variables
Data Types – Analytic Attributes
o Qualitative variables
  Categorical variables
  Nominal variables
o Quantitative variables
  Ordinal variables
  Interval variables
  Continuous variables
Data Errors
o Identification of data errors o Treatment of data errors
Outliers
o Identification of outliers
o Treatment of outliers
o How PA differs from traditional statistics
Traditional statistics works
o Ensemble solution development
o Multiple models for different areas of the solution space
o What works where
Basic statistics – the ‘in general’ perspective (demo)
Data transformation strategies (demo)
Model Development & Evaluation
Solution Types
Supervised training techniques
o Classification
  Single tail models
  Two-tailed models
  Ranking across the continuum
o Forecasting models
o Related behaviour modeling
Unsupervised training techniques
o Segmentation and clustering
Candidate Model Development
Technique Overview
Linear regression
Logistic regression
Decision trees
Clustering & segmentation
Neural networks
Evaluating Performance
Analytic metric evaluation
Business metric evaluation
Model Selection
Model Validation
Phase 1 – Positive Benefit Model (demo)
Phase 2 – Negative impact model (demo)
False positive reduction (demo)
Summary and Next Steps
The Complementary Strategic Course