
(1)

A real world example of using predictive analytics in large corporations for forecasting, budgeting and planning

By Terry Simmonds

(2)

Much of Terry’s career has been involved with developing data information stores, business forecasting, analytics and business modelling, undertaking internal stakeholder management, training, coaching staff on the use of analytics, presenting and consulting.

Recently, in Australia’s largest telecommunications company, he directed the rollout of a national forecasting and analytics platform for 140+ users. In February this year, he presented on how Predictive Analytics is used in Telstra at the Certified Practicing Accountants (CPA) forecasting conference in Sydney, Melbourne and Brisbane. He has developed and presented many courses on forecasting, analytics and project management over his 30-year career.

He is a financial mathematician/data scientist by trade with a Bachelor of Science degree (1988) from the University of Queensland and postgraduate studies in actuarial mathematics, investments and management.

Terry has Coaching for Performance and Front Line Management qualifications and he is currently finalising his Australian national accreditation in Workplace Training and Assessment.

www.EndureDSandBI.com

Terry Simmonds – Principal Consultant and Founder

Endure Data Science and Business Intelligence

(3)

2. Introduction
3. Table of Contents
4. Acknowledgement and Thanks
5. Predictive Analytics @ Work – Summary
6. What is Predictive Analytics?
7. Why use Predictive Analytics?
8. Part 1 – Predictive Analytics – General
9. Data Collection in General
10. Knowledge Base
11. Predictive Models - General
12. Part 2 – Predictive Analytics @ Work
13. Model Example
14. Customer Adds Statistical Models
15. So which one to choose?
16. Model – sample error distributions
17. Selected Customer Sales Forecast
18. Collaboration
19. Final Collaborated Forecast
20. Prediction System Features
21. Connect with us

(4)

Thank you to all the people who inspired this e-book. Thanks for your help whether it was in person or via your presentations, books, videos, interviews, tweets and blogs!

Teresa Simmonds, Martin Baker, John Turner, Evan Stubbs, Lynley Vinton, Tony Ward, Joseph Panagiotakis, Adam Franklin, Toby Jenkins, Tony Simmonds, Jesus Christ, Gary Cokins, Burton Wu, Richard Branson, Peter Edwards, David Williams, Mike Loukides, Charles Nyce, International Institute of Forecasters

Many other folk who over the years have provided their insight and support both professionally and personally.

Please Share

• © 2012 by Endure Data Science and Business Intelligence.
• This e-book is free and licensed under the Creative Commons License, Attribution 3.0: http://creativecommons.org/licenses/by/3.0/
• If you find this e-book useful please feel free to blog about it, tweet it, email it to a friend and otherwise share it with the world. The author just asks that you don’t alter, transform, or build upon it without prior consent.

Version 1.0

• Please feel free to forward any of your content ideas for this document at www.EndureDSandBI.com

Full disclosure

• Affiliate links may have been used in this e-book, which means if you click through and buy something, we are likely to earn a small commission. Plus we may have

(5)

Predictive analytics assists with developing forecasts for business decision making.

It will NOT work well, however, when used in a slavishly dictatorial way or as a mindless black-box approach. It works well when used to augment judgement-based and other relevant modelling and forecasting approaches to deliver a collaborated view which is seen as more accurate and reliable than any one view on its own.

When challenging seasoned executive “gut feels”, talk to the method and the numbers rather than the opinions. This generates respect and objectivity.

(6)

“Predictive Analytics is a broad term describing a variety of statistical and analytical techniques used to develop models that predict future events or behaviours.” Charles Nyce

Predictive Analytics incorporates a range of activities including visualizing data, developing assumptions and data models, overlaying modelling theory and mathematics, then estimating/predicting future outcomes.

At its core it relies on capturing relationships between the explanatory variables and the predicted variable from past data points, and exploiting those relationships to predict future outcomes.

Some techniques include

• Mathematical modelling of many varied and different types to determine true drivers;
• Data mining;
• Game theory;
• Time series analysis and forecasting;
• Just to name a few.

(7)

Some of the “why’s” for using Predictive Analytics include

1. How will future profits be generated?
2. How will customer purchasing patterns drive future infrastructure demand?
3. How, where and when will your customers take up your new product(s)?
4. What is the likely future customer churn behaviour?
5. What is the future company financial position for market disclosure?
6. I am sure you can think of many more …

So how does this all work in practice for your organisation?

Part 1 of this eBook outlines a general predictive analytics approach while Part 2 provides a more concrete example for those interested in a little more detail.

Of course, if you are looking for how your organisation can implement a specific predictive analytics or data science road map then do not hesitate to contact us at www.EndureDSandBI.com.

Please forward any ideas you have regarding the content of future iterations of this eBook at

(8)

4 general steps

1. Define the problem or hypothesis or objective
2. Identify and collect relevant data and information
3. Analyse data, develop knowledge base, design predictive models
4. Utilise predictive models to determine most likely outcome(s)

• Information/Data store
• Knowledge base
• Predictive models
• Tools
• Processes
• People

When dealing with large volumes of predictive analytics a broad process is required.

Defining the problem will drive all aspects of your data collection through to predictive analytics modelling.

The next few pages will outline the general approach from gathering information to producing predictive modelling and bringing all together with the people and the processes.

(9)

When developing the information data stores and collecting data in general, you will need to consider your data needs for the following areas.

Big ticket information needs

1. How big is the market currently?
2. What has happened in the market historically?
3. What competitors are out there and what are their market shares, product offerings, etc?
4. Much more ...

More product centric information

1. What is the historical financial General Ledger (GL) information for the product?
2. What additional “driver” based information is available?
• Sales
• Channels to market
• Types of customers buying and using the products
• Customer usage patterns
• What is the social media “chatter” & other “big data” sources ...

What external factors influenced the historical results?

1. What marketing & pricing campaigns were implemented?
2. What investments were made in the network?
3. Much more ...

(10)

Now comes the fun part

Once you have collected sufficient data you can turn your attention to turning the facts and data into knowledge, i.e. identify plausible relationships between explanatory and predicted variables; assess the quality of the information; determine the extent to which the information at hand can be used to predict future outcomes.

Utilise all this information to develop a picture of

• The market
• The customer base
• The delivery side of things, i.e. the channels to market, infrastructure requirements, product offerings, etc
• The significant drivers that influence profit
• The supply side and demand side price elasticity models where applicable
• Appropriate econometric models
• How the data looks, i.e. graph some data – “a picture paints a thousand words”
• The beginnings of predictive models based on what is now known

After all this you may need to go back and get more data if there is not enough information to develop suitable models.

(11)

Once the knowledge base is established, the next step is to generate the predictive models.

A statistical predictive model takes the form

Y = f(x) + e

This is the very general predictive modelling structure used for statistical analysis and ultimately it can take many forms.

• Y is often called the dependent variable and it is generally the thing you want to predict such as profit, revenue, phone sales, etc.
• f(x) is the “model” that builds on the knowledge and assumptions about the driver “x” variables. The key is to get both the drivers “x” and the relationships between them and Y correct.
• e is the error, risk, unknown or stochastic term, generally assumed to have a normal distribution with mean 0 and standard deviation s.
• e can be time dependent, as in time series forecasting, or time independent, as in multiple regression, econometric models and data mining models.

For each month (t), the model might be

Y(t) = Customer Sales(t) * {Average Customer Initial purchase Yield(t) + ½ Average Customer months spend(t)} + e

The full year revenue forecast for sales would therefore be the sum of Y(t) for each month of the year.
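As a rough illustration of the monthly formula, here is a minimal Python sketch. The class and function names and all the input figures are hypothetical, and the error term e is left out, so the result is only the deterministic f(x) part of each Y(t).

```python
from dataclasses import dataclass


@dataclass
class MonthDrivers:
    """Hypothetical driver values for one month t."""
    customer_sales: float          # customers sold in the month
    initial_purchase_yield: float  # average initial purchase per customer
    monthly_spend: float           # average customer monthly spend


def expected_revenue(m: MonthDrivers) -> float:
    """f(x) for one month: Sales * (Initial Yield + 1/2 * Monthly Spend)."""
    return m.customer_sales * (m.initial_purchase_yield + 0.5 * m.monthly_spend)


def full_year_revenue(months) -> float:
    """Full year forecast: the sum of the monthly values."""
    return sum(expected_revenue(m) for m in months)


# Made-up flat drivers for all 12 months, purely for illustration.
year = [MonthDrivers(100, 50.0, 20.0) for _ in range(12)]
print(expected_revenue(year[0]))  # 100 * (50 + 10) = 6000.0
print(full_year_revenue(year))    # 12 * 6000 = 72000.0
```

In practice each driver would be its own forecast rather than a flat assumption.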

The triangle that makes it all happen

While you can get the tools (the data and IT systems, etc) all established, you are not likely to gain full benefit from your predictive analytics without having the right People doing the right Processes with the right Tools on an ongoing basis. This is where a Predictive Analytics Road Map adds a great deal of value.
(12)

In this case, utilise your organisation’s General Ledger (GL). This can provide many thousands, if not hundreds of thousands, of financial and non-financial accounts. The data elements which might exist in your GL (financial and physical) systems include financial and non-financial account codes, organisation and customer codes and Time/Date attributes.

The atomic level data will need to be aggregated into multi-level product, measure and organisation/customer grouping hierarchies to begin with (using standard OLAP data architecture and functionality), all in line with organisational structures and strategic priorities.

Also establish a series of measure hierarchies providing the physical and yield driver elements for reports, models and forecast calculations.

Establish version control for audit and comparison purposes.

This approach provides the level of detail needed to make appropriate operational business decisions, which is not available with top down, market driven or judgement based approaches.

Ensure aggregation is consistent across the organisation, otherwise collaboration cannot occur and the effort to provide a robust and defendable forecast or predictive analytical process fails.
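To make the idea of consistent aggregation concrete, the following is a small sketch in plain Python rather than an OLAP tool. The account codes, product names, hierarchy and amounts are all invented for illustration. Every atomic record is rolled up through one shared product hierarchy, so the group totals always reconcile to the overall total.

```python
from collections import defaultdict

# Hypothetical atomic GL records: (account_code, product, month, amount)
RECORDS = [
    ("4001", "Mobile Prepaid", "2012-01", 120.0),
    ("4001", "Mobile Postpaid", "2012-01", 300.0),
    ("4002", "Fixed Broadband", "2012-01", 200.0),
    ("4001", "Mobile Prepaid", "2012-02", 130.0),
]

# Hypothetical two-level product hierarchy: leaf product -> parent group
PRODUCT_PARENT = {
    "Mobile Prepaid": "Mobile",
    "Mobile Postpaid": "Mobile",
    "Fixed Broadband": "Fixed",
}


def roll_up(records, parent_of):
    """Aggregate atomic amounts to product, product group and total level."""
    totals = defaultdict(float)
    for _account, product, month, amount in records:
        totals[(product, month)] += amount             # leaf level
        totals[(parent_of[product], month)] += amount  # group level
        totals[("All Products", month)] += amount      # top level
    return dict(totals)


totals = roll_up(RECORDS, PRODUCT_PARENT)
print(totals[("Mobile", "2012-01")])        # 420.0
print(totals[("All Products", "2012-01")])  # 620.0
```

Because every team reads from the same mapping, a forecast built at group level can be compared directly with one built at product level.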

The next page gives an example of a mathematical driver model which enables predictive analytics to be used to drive the outcome.

Challenge if you accept: Use predictive analytics to forecast detailed product revenue for the rest of this year and then for the next 3 years

Warning: Take care working with aggregated

(13)

[Figure: driver model diagram. Outputs: Customer net Growth for Period; Gross Revenue. Callout: “Predict this as an example”.]

This simple model utilises the customer sales and cancellations to determine net customer growth as well as combining Sales Yields to generate a value for gross revenues.

Predictive Analytics can be applied to the Relevant Drivers of the model (the grey boxes) or they can be applied to the Customer net growth and Gross Revenue themselves.
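The driver model can be sketched in a few lines of Python. The function names and all the figures are hypothetical, and treating gross revenue as sales multiplied by a sales yield is an assumption about how the diagram combines its drivers, not a formula stated in the e-book.

```python
def net_growth(sales, cancellations):
    """Customer net growth for a period: sales less cancellations."""
    return sales - cancellations


def gross_revenue(sales, sales_yield):
    """Assumed form: customer sales multiplied by the average sales yield."""
    return sales * sales_yield


# Made-up driver forecasts for three months (the "grey boxes").
sales = [100, 120, 90]
cancellations = [30, 40, 50]
yields = [50.0, 50.0, 55.0]

growth = [net_growth(s, c) for s, c in zip(sales, cancellations)]
revenue = [gross_revenue(s, y) for s, y in zip(sales, yields)]
print(growth)   # [70, 80, 40]
print(revenue)  # [5000.0, 6000.0, 4950.0]
```

Forecasting the drivers and letting the model calculate the outputs keeps the assumptions visible, whereas forecasting net growth or gross revenue directly is quicker but hides them.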

(14)

[Figure: two time series forecasts of monthly customer adds (dummy data used).
• Additive Winters (exponential smoothing), Mean Absolute Percentage Error (MAPE) 5.76
• Autoregressive Integrated Moving Average (ARIMA), (p,d,q) = (2,0,0), (P,D,Q) = (1,0,0), no intercept, MAPE 7.71]

What on Earth are these?

These graphs show a time series forecast for the customer adds (or sales) driver for the model on the previous page. The information is monthly sales from 2007, which is then forecast using two different types of time series forecasting model.

For the purpose of this eBook we won’t go deeply into the models themselves other than to say they are a type of predictive analytics model used to forecast future values. In this case they are being used to forecast the customer monthly sales.

• The black dots are the historical data.
• The vertical dotted blue line is the forecast date.
• The red line is the model which has been “fitted” to the historical data and then used to forecast into future months.
• The pink shaded areas are the 95% Confidence Interval generated by the different models.

Generally, the lower the MAPE the smaller the difference between the values of the fitted model and the actual black dots (actual historical data). The MAPE is a statistical measure that aids in determining if the model is close to lining up with the historical results.
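For readers who want to see how a MAPE figure is produced, here is a minimal sketch using the standard definition (the mean of the absolute percentage errors). The history and fitted values are invented and are not the data behind the graphs.

```python
def mape(actuals, fitted):
    """Mean Absolute Percentage Error, expressed in percent."""
    if len(actuals) != len(fitted) or not actuals:
        raise ValueError("actuals and fitted must be non-empty and the same length")
    if any(a == 0 for a in actuals):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - f) / a) for a, f in zip(actuals, fitted))
    return 100.0 * total / len(actuals)


# Made-up history and two made-up fitted models.
history = [100.0, 200.0, 400.0]
model_a = [110.0, 190.0, 400.0]
model_b = [120.0, 180.0, 360.0]

print(round(mape(history, model_a), 2))  # 5.0
print(round(mape(history, model_b), 2))  # 13.33
```

Here model_a tracks the history more closely, so its MAPE is lower.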

(15)

The black-box approach would be to accept the smallest MAPE, i.e. the Additive Winters model.

However

Business knowledge may drive an alternate outcome
• Do you need to capture seasonality?
• Do you need to capture short, medium or long term trends?

Work to incorporate historical events in modelling, e.g. step functions in June 2010 and January 2011. If you want to capture past events then you will need to use ARIMA type models. This is not the one with the smallest MAPE currently.

An alternate approach is to consider other statistics such as “White Noise” and the distribution of errors (are they normally distributed, etc).

As a starting point, go back to predictive analytics theory and work with Y = f(x) + e

Revisit Y = f(x) + e

• Y is the predicted variable
• f(x) is the function of the explanatory variables x
• e is the error we get when we haven’t got the right function f() in place, or we may have missed an explanatory variable, or we may not have captured the correlation between the variables completely.
• The power of statistics is to get the function and explanatory variables to a position where the error has an approximately normal distribution, an average value of 0 and a known variance.

While this only applies to the actual results (we don’t know the actual outcomes as yet for future forecasts), the theory is that if we get the errors distributed normally around an average value of 0, we have a statistically ‘good’ model.
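A few rough checks on the errors can be sketched with the Python standard library. The function names and sample errors are hypothetical, and these are informal indicators rather than formal statistical tests. A mean near 0 suggests little bias, skewness near 0 suggests a symmetric distribution, and a lag-1 autocorrelation near 0 is what white noise would tend to show.

```python
from statistics import mean, pstdev


def residuals(actuals, fitted):
    """Errors e = actual - fitted for each historical period."""
    return [a - f for a, f in zip(actuals, fitted)]


def skewness(errors):
    """Population skewness: near 0 for a symmetric distribution."""
    m, s = mean(errors), pstdev(errors)
    return sum(((e - m) / s) ** 3 for e in errors) / len(errors)


def lag1_autocorrelation(errors):
    """Correlation between each error and the one before it."""
    m = mean(errors)
    num = sum((errors[i] - m) * (errors[i - 1] - m) for i in range(1, len(errors)))
    den = sum((e - m) ** 2 for e in errors)
    return num / den


def summarise(errors):
    return {
        "mean": mean(errors),
        "std": pstdev(errors),
        "skew": skewness(errors),
        "lag1": lag1_autocorrelation(errors),
    }


# Made-up errors, purely for illustration.
sample_errors = [2.0, -1.0, 0.0, 1.0, -2.0]
print(summarise(sample_errors))
```

With only a handful of points these numbers are noisy, so they are best read alongside the graphs rather than instead of them.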


Take care you don’t exceed 4:1 ratio of historical data to forecast

(16)

[Figure: histograms of forecast errors for two models.
• Additive Winters (exponential smoothing), Mean Absolute Percentage Error (MAPE) 5.76
• Autoregressive Integrated Moving Average (ARIMA), (p,d,q) = (2,0,0), (P,D,Q) = (1,0,0), no intercept, MAPE 7.71]

These are the error distributions for the same models as shown on page 14.

What on Earth are these?

These graphs are histograms of the errors of the prediction models for customer sales.

• The pink boxes are the histogram bars
• The red line is the estimated normal distribution of the results
• The blue line is the “smoothed” distribution of the errors
• The closer the red and blue lines come together, the more “normally” distributed the error results are.

You can also see that each graph has a peak around 0, which shows that on average the prediction models do come close to the historical data.

Why is it important to have the peak at 0?

It means that the model is pretty good at guessing the past, which is a good indicator that it will be OK for estimating possible future values.
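The comparison between histogram bars and a fitted normal curve can be approximated numerically. This sketch assumes Python 3.8 or later for statistics.NormalDist. The bin edges and errors are invented, and the function name is hypothetical.

```python
from statistics import NormalDist, mean, pstdev


def histogram_vs_normal(errors, edges):
    """For each bin [low, high), return the observed count and the count
    expected under a normal distribution with the same mean and spread."""
    dist = NormalDist(mean(errors), pstdev(errors))
    n = len(errors)
    rows = []
    for low, high in zip(edges, edges[1:]):
        observed = sum(1 for e in errors if low <= e < high)
        expected = n * (dist.cdf(high) - dist.cdf(low))
        rows.append((low, high, observed, expected))
    return rows


# Made-up errors, peaked around 0.
errors = [-3.0, -1.0, -1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 3.0]
edges = [-4.5, -1.5, 1.5, 4.5]

for low, high, observed, expected in histogram_vs_normal(errors, edges):
    print(f"[{low:5.1f}, {high:5.1f})  observed={observed}  expected={expected:.2f}")
```

The closer the observed counts sit to the expected ones, the more “normal” the errors look, which is the numeric version of the red and blue lines lying on top of each other.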

So What!

You can see by these two graphs that the model with the lowest MAPE – the Winters Additive Model - does

(17)

There can be a degree of “professional judgement” in using Predictive Analytics models.

In practice, the most appropriate statistical forecasts are selected based on a combination of factors including

• Business knowledge
• Statistical goodness of fit
• Forecast quality controls

For completeness and to finish this example, the Predictive Analytics model results for Customer Sales can be loaded into the model on page 13 along with the predicted values for the other drivers, letting the model calculate the resulting gross revenue.

[Figure: selected customer sales forecast. ARIMA, (p,d,q) = (2,0,0), (P,D,Q) = (1,0,0), no intercept, MAPE 7.71. Dummy data used.]

Challenge: what happens when you have to do this for thousands of predict

(18)

International Institute of Forecasters research

It has been demonstrated many times by the international forecasting community (www.iif.com) that combining a number of independent and different forecasting and prediction techniques/models provides a more accurate and reliable forecast than any one view on its own.

What might be a simple way of collaborating different views?

• A simple average: 1/3 of the (a Top Down Market view + a Bottom Up detailed view + an independent statistical predictive analytics view)
• Alternatively, throughout the process an agreed position can be reached utilising inputs from the 3 major techniques.
• Or it could be as complex as using a historical scoring process based on the level of bias of previous predictions from various sources.
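The first and third options can be sketched as follows. The function names and figures are hypothetical, and weighting each view by the inverse of its historical absolute bias is only one possible scoring scheme, assumed here for illustration.

```python
def equal_weight(top_down, bottom_up, statistical):
    """Simple average of the three views."""
    return (top_down + bottom_up + statistical) / 3


def bias_weighted(forecasts, past_abs_bias):
    """Weight each view by the inverse of its historical absolute bias
    (an assumed scoring scheme: less biased sources get more weight)."""
    if any(past_abs_bias[k] <= 0 for k in forecasts):
        raise ValueError("historical bias must be positive for every source")
    inverse = {k: 1.0 / past_abs_bias[k] for k in forecasts}
    total = sum(inverse.values())
    return sum(forecasts[k] * inverse[k] / total for k in forecasts)


# Made-up views of next year's revenue and made-up historical bias levels.
views = {"top_down": 900.0, "bottom_up": 1200.0, "statistical": 1050.0}
bias = {"top_down": 0.10, "bottom_up": 0.05, "statistical": 0.05}

print(equal_weight(**views))                  # 1050.0
print(round(bias_weighted(views, bias), 2))   # 1080.0
```

In this example the top down view has been twice as biased historically, so it receives half the weight of each of the other two views.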

The following process has been used in practice to gain a collaborated final position.

(19)

1. Assess current predictive analytics views
2. Summarise statistical predictive analytics forecasts
3. Evaluate in context with past, current and proposed business initiatives and external factors where relevant
4. Modify forecasts to accommodate business initiatives
5. Provide draft position using these forecasts
6. Present to subject matter experts and capture their feedback; modify explanatory variable forecasts where appropriate
7. Update final views to reflect modified forecasts and assumptions and consolidate results for total organisational outlook

(20)

One last thought before we finish

If you are looking to undertake a large number of predictive analytic modelling tasks on a regular (say monthly) basis, then unless you have an army of highly skilled data scientists and analysts, you will need to implement a platform that can assist in completing the work.

Some attributes the platform needs to have

• Strong linkages between financial, customer and operational level data for predictive analytics and forecasting purposes.
• Regular and timely automated, consistent, managed data extractions.
• Access to powerful trajectory and statistical based forecasting and analytical applications via MS Excel and custom Graphic User Interfaces (GUIs).
• Access to sophisticated large volume (millions of records) data management and web hosting Dashboard and Business Reporting applications.
• Seamless integration with Microsoft Office Excel, Word and PowerPoint.
• Genuine multiuser, workflow management application for as many concurrent users as needed.
• Utilises OnLine Analytical Processing (OLAP) functionality for ease of dissecting data information stores and building analytic driver models.

So how did you go with the challenge?

(21)

www.EndureDSandBI.com

We’d love to hear from you.

If you would like a hand with @work predictive analytics, business navigation or putting together your Data Science road map then please get in contact with us at Endure DS & BI.

You can phone Australia on +61 409 389 060, write to us at [email protected] or follow Terry on LinkedIn.

Feel free to share this eBook.

We wish you all the best with implementing your predictive analytics needs @work.

Terry Simmonds

Principal Consultant and Founder Endure Data Science and Business Intelligence

