
How To Perform Predictive Analysis On Your Web Analytics Data In R 2.5


A GACP and GTMCP company

FREE Webinar by

June 19th, 2013

How to perform predictive analysis on your web analytics tool data

Before we start...

Q&A


Our speakers

Carolina Araripe
Inbound Marketing Strategist @Tatvic
http://linkd.in/YazvVn

Amar Gondaliya
Data Model Engineer @Tatvic
http://linkd.in/16cpDQI

Kushan Shah
Web Analyst @Tatvic
http://linkd.in/18rfFfV

Talking about Analytics…

Analytics

Descriptive: What has happened?
Prescriptive: What should happen?
Predictive: Predicts the outcome or future

In other words…

“Technology that learns from experience (data) to predict the future behavior of individuals in order to drive better decisions.”

Source: Siegel, E. (2013) “Predictive Analytics. The power to predict who will click, buy, lie or die.”


Outline of this webinar

Predictive Analytics

Tool: R
Data: Google Analytics
Model: Logistic Regression

Introduction to R

What
  Open source statistical computing language, widely used by organizations to solve business problems.

Why
  Easy to integrate
  Data frame
  Pre developed packages

How to get started
  Download and install R
  Choose and download a user-friendly GUI: RStudio

Applications
  Data Analysis
  Data Visualization
  Statistical Tests
  Predictive Model
  Forecasting
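To make the "Data frame" point concrete, here is a minimal sketch of a data frame in base R. The column names and figures are made up for illustration and do not come from the webinar.

```r
# Hypothetical daily web-analytics figures; all values are invented.
visits <- data.frame(
  date         = as.Date(c("2013-06-01", "2013-06-02", "2013-06-03")),
  sessions     = c(1200, 950, 1430),
  transactions = c(36, 19, 50)
)

# Columns are vectors, so derived metrics are a one-line operation.
visits$conversion_rate <- visits$transactions / visits$sessions

# Quick descriptive summary of the new column.
summary(visits$conversion_rate)
```

Each column keeps its own type (date, numeric), which is what lets modelling functions such as lm() and glm() take a data frame directly.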


R Packages

Categories of Packages
  Data Extraction
  Time Series
  Machine Learning
  Data Visualization

For this webinar

RGoogleAnalytics
  Usage: To extract Google Analytics data into R
  Contributors: Michael Pearmain, Nick Mihailovski, Amar Gondaliya and Vignesh Prajapati

ggplot2
  Usage: Build plots and charts


Google Analytics data

Extracting your GA data into R

Participants
  User performing data extraction
  Google OAuth2 Authorization Server
  Google Analytics API

Flow
  Access Token Request
  Access Token Response
  Call API for list of profiles
  Call API for


Business Problem

Projected Growth of Retail eCommerce in US

US Retail eCommerce Sales 2011-2016 (in billion $)

  Year    Sales
  2011    $194.70
  2012    $225.50
  2013    $258.90
  2014    $296.70
  2015    $338.90
  2016    $384.90


Business Problem

Product return

“Returns are on the rise, up 19% from 2007. For every US$1 spent on merchandise, 9¢ are returned.”

“Average return rate for ecommerce retailers varies from 3-12%.”

Source: Time Magazine, Sept. 4th, 2012

Product Return Impact (per day)

  Average Return Rate         9 %         7 %
  Average Order Value         $100        $100
  Orders Per Day              500         500
  Total Income                $50,000     $50,000
  Loss due to returns         $4,500      $3,500
  Revenue post loss           $45,500     $46,500
  Increase in Revenue/day     ---         $1,000

Increase in revenue with recovered returns in the long run
  Month (x30): $30,000
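The per-day figures above can be reproduced with a few lines of R. This is a sketch of the slide's arithmetic only; the function name return_impact is hypothetical, and the inputs are the slide's own numbers ($100 average order value, 500 orders per day, 9 % versus 7 % return rate).

```r
# Revenue impact of product returns for one day of orders.
# return_impact() is an illustrative helper, not part of any package.
return_impact <- function(return_rate, order_value = 100, orders_per_day = 500) {
  total_income <- order_value * orders_per_day
  loss         <- total_income * return_rate
  c(total_income      = total_income,
    loss              = loss,
    revenue_post_loss = total_income - loss)
}

at_9_pct <- return_impact(0.09)
at_7_pct <- return_impact(0.07)

# Extra revenue per day from cutting the return rate by two points,
# and the same gain over a 30-day month.
daily_gain   <- at_7_pct[["revenue_post_loss"]] - at_9_pct[["revenue_post_loss"]]
monthly_gain <- daily_gain * 30
```

With these inputs the daily gain works out to $1,000 and the 30-day gain to $30,000, matching the table.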


Transactional Data

Pre Purchase Data: Browsing behavior up to shopping cart

In Purchase Data: Purchase behavior from shopping cart to thank you page


Loading Input Data

Introducing Model Variables

Model Creation

Model Performance

Applying Model to Test Data

Supervised Learning

Generates a function that maps inputs (labeled data) to desired outputs (e.g.: Spam Detection)

Labels are right answers from historical data
e.g.: Spam Detection
Input Data: Contains emails marked Spam/No Spam

[Diagram: Training Data (Variables, Labels) -> Machine Learning Algorithm -> Predictive Model; Test Data (Variables) -> Predictive Model -> Predicted Outcome (labels)]


Going beyond algorithms: using domain knowledge to add new variables to the model

E.g.: Products purchased as gifts are less likely to be returned
  Create a new variable with binary values: 1 = product purchased as gift, 0 = otherwise

Products purchased in the holiday season are more likely to be returned
  Based on purchase date, create a new variable with binary values: 1 = product purchased in Nov-Dec, 0 = otherwise
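The two derived variables described above could be built roughly as follows. This is a sketch under assumptions: the data frame orders, its columns (purchase_date, gift_wrap) and the sample rows are hypothetical, since the webinar's dataset is not shown.

```r
# Hypothetical transaction data; column names and values are invented.
orders <- data.frame(
  order_id      = 1:4,
  purchase_date = as.Date(c("2012-11-23", "2013-02-14", "2012-12-20", "2013-06-05")),
  gift_wrap     = c("yes", "no", "no", "yes"),
  stringsAsFactors = FALSE
)

# 1 if the product was purchased as a gift, 0 otherwise.
orders$is_gift <- as.integer(orders$gift_wrap == "yes")

# 1 if the purchase month is November or December, 0 otherwise.
purchase_month <- as.integer(format(orders$purchase_date, "%m"))
orders$is_holiday_season <- as.integer(purchase_month %in% c(11, 12))
```

Both flags are plain 0/1 integers, so they can be passed to a model formula without further encoding.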

Predictor/Response Variables

[Figure: scatter plot of Price of House ($), the response variable, against Size of House (sq ft), the predictor variable]
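As a small illustration of the predictor/response idea, the sketch below fits a straight line to simulated house data. The data, the coefficients used to generate it and the variable names are all assumptions made for the example; the slide's actual data points are not available.

```r
set.seed(42)

# Simulated houses: size is the predictor, price is the response.
size_sqft <- runif(50, min = 500, max = 5000)
price     <- 50000 + 140 * size_sqft + rnorm(50, sd = 20000)
houses    <- data.frame(size_sqft, price)

# The formula reads "response ~ predictor".
fit <- lm(price ~ size_sqft, data = houses)

coef(fit)                                             # intercept and slope
predict(fit, newdata = data.frame(size_sqft = 2000))  # price estimate for 2,000 sq ft
```

The same "response ~ predictor" formula syntax carries over to glm(), which the model-creation step uses for a binary response.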


glm (formula, family, data)

Formula
  Response ~ Predictor. This argument specifies which variables are the independent (predictor) variables and which variable is the dependent (response) variable.

Family
  Binomial. Since the output variable (product return) is defined as a binary value, 0 or 1, we use the binomial family.

Data
  Train data set. This data set contains values of all 18 variables (i.e. values of both the dependent and the independent variables are given). This dataset is also called labeled data.
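A call with these three arguments might look like the sketch below. Everything except glm() and its formula, family and data arguments is an assumption: the training data is simulated, only three hypothetical predictors are used instead of the webinar's 18 variables, and the coefficients used to generate the labels are invented.

```r
set.seed(1)
n <- 500

# Simulated labeled training data with hypothetical predictors.
train <- data.frame(
  is_gift           = rbinom(n, 1, 0.3),
  is_holiday_season = rbinom(n, 1, 0.2),
  order_value       = round(runif(n, 20, 300), 2)
)

# Simulated label: 1 = product returned, 0 = not returned.
log_odds <- -2 - 1.0 * train$is_gift +
  1.2 * train$is_holiday_season + 0.005 * train$order_value
train$returned <- rbinom(n, 1, plogis(log_odds))

# Logistic regression: binary response, binomial family, labeled data.
model <- glm(returned ~ is_gift + is_holiday_season + order_value,
             family = binomial, data = train)

summary(model)$coefficients
```

The coefficient table reports each predictor's estimated effect on the log-odds of a return, which is a reasonable first check before applying the model to test data.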



Call customer before shipping

[Figure: Number of Transactions by Probability of Product Returns, split into two groups: probability of product return > 60% and probability of product return ≤ 60%]
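Applying a fitted model to test data and flagging orders above the 60 % cut-off could be sketched as follows. The helper make_orders, the simulated data and the two predictors are hypothetical; only the 60 % threshold and the "call customer before shipping" action come from the slide.

```r
set.seed(2)

# Illustrative generator for simulated orders with a 0/1 return label.
make_orders <- function(n) {
  d <- data.frame(
    is_gift           = rbinom(n, 1, 0.3),
    is_holiday_season = rbinom(n, 1, 0.2)
  )
  d$returned <- rbinom(n, 1, plogis(-1.5 - 1 * d$is_gift + 2.5 * d$is_holiday_season))
  d
}

train <- make_orders(1000)
test  <- make_orders(200)

model <- glm(returned ~ is_gift + is_holiday_season,
             family = binomial, data = train)

# type = "response" returns probabilities rather than log-odds.
test$p_return <- predict(model, newdata = test, type = "response")

# Flag transactions whose predicted return probability exceeds 60 %.
test$call_before_shipping <- test$p_return > 0.60
table(call_before_shipping = test$call_before_shipping)
```

The threshold is a business choice: lowering it flags more orders for a call, raising it flags fewer.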


ggplot2

  Geometric Shapes
  Scales and Coordinate Systems
  Plot Annotations
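The three building blocks listed for ggplot2 can be seen together in one short plot. This is a sketch assuming the ggplot2 package is installed; the data frame scores and its simulated probabilities are hypothetical stand-ins for model output.

```r
library(ggplot2)

set.seed(3)

# Simulated predicted return probabilities, one per transaction.
scores <- data.frame(p_return = runif(300))
scores$group <- ifelse(scores$p_return > 0.60, "> 60%", "<= 60%")

p <- ggplot(scores, aes(x = p_return, fill = group)) +
  # Geometric shape: bars counting transactions per probability bin.
  geom_histogram(binwidth = 0.05, boundary = 0) +
  # Scale and coordinate system: percent labels, x axis limited to 0..1.
  scale_x_continuous(labels = function(x) paste0(100 * x, "%")) +
  coord_cartesian(xlim = c(0, 1)) +
  # Annotations: threshold line, title and axis labels.
  geom_vline(xintercept = 0.60, linetype = "dashed") +
  labs(title = "Probability of Product Returns",
       x = "Predicted probability of return",
       y = "Number of transactions",
       fill = "Probability of product return")

p
```

Layers are added with +, so a geom, a scale or an annotation can be swapped without touching the rest of the plot.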


Thank you!

Carolina Araripe

carolina@tatvic.com +91 7600-515-354

References

http://www.emarketer.com/Article/Retail-Ecommerce-Set-Keep-Strong-Pace-Through-2017/1009836
