(1)

Grameen Foundation’s Savings Seminar

Data Analytics: Answering business questions with data

Oct 22nd, 2013

(2)

Speakers

Tanaya Kilara, Financial Sector Analyst at CGAP

Jacobo Menajovsky, Senior Data Analyst at Grameen Foundation

(3)

The Role of Data

Grameen Foundation Savings Seminar October 22, 2013

(4)

Warm-up Quiz

How long does it take Google to get 2 million queries?

How much do consumers spend on web shopping in an hour?

How many emails are sent in a minute?

(5)

More Data with Every Passing Day

Big Data, Analytics, Modelling, Data Mining

(6)

Significantly Better Analytical Capacity

(7)

Implications

More Data, Capacity to Analyze

Gleaning Customer Insights

Fitting Products to Needs

Managing Risk

Designing Customer Experience

Optimizing Channel

(8)

Challenges in Financial Inclusion

Banks

Have customer data, need to build analytical capacity

MFIs

Need to build systems to capture and analyze data

Telcos

Have the capacity, need to use it to generate insights relevant to financial services

(9)

Asking the Right Questions

What is the problem I am looking to solve?

What types of data do I need to answer my question?

How do I get the mix of data right (quant vs qual, internal vs external)?

Data gives me the ‘how’. What methods can answer the ‘why’?

(10)

Advancing financial access for the world’s poor

(11)

Agenda

• Some guiding principles for doing analytics

• Data is everywhere. Why?

• Applied statistics 101: concepts and the most common problems and mistakes

• Using, mixing, benchmarking, visualizing and testing data to support decisions and respond to business questions

(12)

A few guiding principles

• Not all products are created equal.

• Not all customers have the same needs.

• Discovering customer profiles and usage patterns can support product and service (re)design.

• Understanding big trends and patterns in the portfolio can help organizations drive change and make decisions.

(13)
(14)
(15)
(16)

Data is everywhere

• Start small

• Think of data as signs and indicators, not as numbers in an Excel file

• All of us are using and modelling data all the time to make even the simplest decisions

• Put your questions first and then go to the data

• Don’t overcomplicate things, but be careful because it is really easy to “lie to yourself” with statistics

(17)
(18)

Statistical lies? Are you sure?

• The average annual salary of a Lakeside School graduate is around 2,000,000 per year.

(19)

What a class!
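The point of this slide can be sketched in a few lines: an average can be dominated by a single extreme value. The salary list below is invented for illustration (it is not the seminar's data) and is chosen only so that the mean lands on the 2,000,000 figure quoted above.

```python
from statistics import mean, median

# Hypothetical class of 100 graduates: 99 ordinary salaries plus one
# extreme outlier. All figures are invented for illustration.
salaries = [80_000] * 99 + [192_080_000]

average = mean(salaries)    # pulled far upward by the single outlier
typical = median(salaries)  # the middle value, unaffected by the outlier

print(f"mean:   {average:,.0f}")
print(f"median: {typical:,.0f}")
```

Reporting the median next to the mean is a simple way to notice when an average is not describing the typical case.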

(20)
(21)

How many households below the poverty line does your organization reach?

Find out with the Progress out of Poverty Index® (PPI®)

What is the PPI?

• A poverty measurement tool for organizations with a mission to serve the poor

• 10 easy-to-answer questions and a scoring system

• Provides the likelihood that the survey respondent’s household is living below the poverty line

• Country-specific; there are PPIs for 45 countries

Why use the PPI?

With the PPI, your organization can:

To download the PPI and learn more, visit:

(22)

PPI as a segmentation tool

Survey for the Philippines

Segmentation: Family size, Schooling, Educational level, Employment

For the complete survey and look-up tables, go to: progressoutofpoverty.org
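As a rough sketch of how a scorecard plus look-up table works, the snippet below sums the points for a household's answers and maps the total to a poverty likelihood. Everything in it is an assumption made for illustration: the three questions, the point values, the score bands and the likelihoods are invented and are not taken from any real PPI scorecard, which has 10 country-specific questions.

```python
from bisect import bisect_right

# Hypothetical scorecard: question -> answer -> points (invented values).
SCORECARD = {
    "family_size": {"6 or more": 0, "3 to 5": 10, "1 or 2": 25},
    "schooling": {"not all children in school": 0, "all children in school": 5},
    "employment": {"no salaried member": 0, "one or more salaried members": 8},
}

# Hypothetical look-up table: lower bound of each score band and the
# likelihood (%) that a household in that band is below the poverty line.
BAND_STARTS = [0, 10, 20, 30]
LIKELIHOODS = [85.0, 60.0, 35.0, 10.0]


def ppi_score(answers):
    """Sum the points for one household's answers."""
    return sum(SCORECARD[question][answer] for question, answer in answers.items())


def poverty_likelihood(score):
    """Find the band the score falls into and return its likelihood."""
    band = bisect_right(BAND_STARTS, score) - 1
    return LIKELIHOODS[band]


households = [
    {"family_size": "3 to 5", "schooling": "all children in school",
     "employment": "no salaried member"},
    {"family_size": "1 or 2", "schooling": "all children in school",
     "employment": "one or more salaried members"},
]

likelihoods = [poverty_likelihood(ppi_score(h)) for h in households]
# Estimated poverty rate of the group: the average of the likelihoods.
portfolio_rate = sum(likelihoods) / len(likelihoods)
print(likelihoods, portfolio_rate)
```

Averaging the per-household likelihoods gives an estimated poverty rate for a group, which is what makes the score usable for segmentation and for outreach benchmarking.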

(23)

About the data we used

• From partners and public sources

• Financial, demographic and poverty data

• Transactional level

• Customer level

• Aggregated level

• Data comes in different formats, dirty and dispersed

(24)

What are we doing with the data?

• Measuring poverty outreach and benchmarking against national figures.

• Tracking main trends like product performance, penetration, uptake, and dormancy levels.

• Discovering behavioral patterns and interactions in the data.

• Running models to discover main drivers of certain events.

(25)

Partners’ overview and poverty outreach benchmarking

India: Cashpor

100K+ active savers

Rs. 232 (US$3.50) average savings balances

<1% PAR 30

96% of Cashpor’s customers are living below the $2 line

Philippines: CARD Bank

750K+ active savers

Php 2900 (US$65) average savings balances

<3% PAR 30

48% of CARD Bank’s customers are living below the $2.50 line

(26)

Scaling up savings - Some initial questions (CARD Bank)

• What did the savings business look like when the project started (and after)?

a) What was their product offering and cross selling product penetration?

b) What was CARD’s strategy for scaling up savings?

I. Customer base expansion?

II. Product deepening and cross selling?

(27)

Product penetration mapping at CARD Bank

Before and after

a) Before

I. 300K accounts

II. 97% monoproduct, only 2.5% cross sold into just one savings product

b) After

I. 750K accounts

II. 84% monoproduct, 15% cross sold into 4 different savings products targeting 4 different customer segments

Kids savings, Convenient access, Increased returns, Regular savings
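A monoproduct versus cross-sold split like the one above could be computed from a customer-to-products mapping along these lines. This is a minimal sketch: the customer IDs, product names and holdings are invented, and it counts customers, whereas the slide reports accounts.

```python
from collections import Counter

# Hypothetical holdings: customer id -> set of savings products held.
holdings = {
    "c1": {"regular"},
    "c2": {"regular"},
    "c3": {"regular", "kids"},
    "c4": {"regular"},
    "c5": {"regular", "kids", "convenient_access"},
}


def penetration(holdings):
    """Share of customers with exactly one product vs. more than one."""
    total = len(holdings)
    mono = sum(1 for products in holdings.values() if len(products) == 1)
    cross = sum(1 for products in holdings.values() if len(products) > 1)
    return {
        "monoproduct_pct": 100 * mono / total,
        "cross_sold_pct": 100 * cross / total,
    }


# Number of customers holding each product.
per_product = Counter(p for products in holdings.values() for p in products)

summary = penetration(holdings)
print(summary)
print(per_product.most_common())
```

Running the same function on a snapshot taken before and after a project gives the kind of before/after comparison shown on the slide.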

(28)

A few business and social questions we wanted to answer with data

(29)

Which should be the main target segment when introducing a new savings product at CARD Bank, and when?

Cross sold profiling and customer lifecycle analysis

[Figure: Average savings by tenure (in years) and poverty level. Labels: PPI, Profile data. Annotation: Much higher cross sell penetration]

(30)

Is it possible to launch an aggressive customer expansion strategy without affecting poverty outreach?

(31)

Is ATM technology a barrier for the poorest customers?

Transactional savings volume by channel and poverty level

(32)

Is it possible that transactional fees had an effect on saving behaviors at Cashpor?

How much are they saving? (average amount)

Pay as you go

Yearly fee: Unlimited transactions

Last 12 months of activity

N=64,841 N=21,731

(33)

“Hypotheses can be rejected or supported, never proven”

Putting your data to the test

• Why is it important to test hypotheses and assumptions?

• What are the data and tools required to do so?

• What are the most common methods?

Your questions and data will help you identify which tests you should apply.

• Use correlations to look at whether changes in one variable are accompanied by changes in another variable.

• Use the chi-square test to look at whether actual data differ from a random distribution.

(34)

Is tenure correlated with the historic total number of loans disbursed?

Correlation

Correlation refers to any of a broad class of statistical relationships involving dependence. Dependence refers to any statistical relationship between two random variables or two sets of data.

[Figure: Number of loans disbursed vs. tenure (length as a customer in months)]

Pearson’s correlation = .789, R² = 62%
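For a single predictor, R² is simply the square of Pearson's correlation, which is how .789 and 62% relate. The sketch below computes Pearson's r from scratch on invented tenure and loan counts (not the seminar's data) and checks the slide's arithmetic.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical customers: tenure in months and total loans disbursed.
tenure_months = [6, 12, 18, 24, 36, 48, 60]
loans_disbursed = [1, 1, 2, 2, 4, 3, 6]

r = pearson_r(tenure_months, loans_disbursed)
r_squared = r ** 2  # share of variance explained by a straight-line fit
print(f"r = {r:.3f}, R^2 = {r_squared:.0%}")

# The slide's own figures: .789 squared is about 62%.
slide_r_squared = 0.789 ** 2
print(f"{slide_r_squared:.0%}")
```

A high r says the two variables move together; it does not by itself say which one drives the other.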

(35)


(36)

Is tenure correlated with the historic total number of loans disbursed?

Correlation

[Figure: Number of loans disbursed vs. tenure (length as a customer in months), with above average loan takers and below average loan takers marked]

Pearson’s correlation = .789, R² = 62%

(37)

Hypothesis: Are women in my portfolio poorer than men?

Chi-Square test

• The Chi-Square test tests a null hypothesis stating that the frequency distribution of certain events observed in a sample is consistent with a particular theoretical distribution.

(38)

Hypothesis: Are women in my portfolio poorer than men?

Chi-Square test

Pearson's Chi-Square: 0.0000000063

Hypothesis supported
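A test of this kind can be sketched for a 2x2 table of gender by poverty status. The counts below are invented, and the function is a plain Pearson chi-square without continuity correction; the seminar's actual tool and settings are not stated in the slides.

```python
from math import erfc, sqrt


def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value for a 2x2 table (1 degree of freedom)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if gender and poverty were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: rows = women, men; columns = below line, above line.
observed = [[300, 200],
            [100, 150]]

chi2, p_value = chi_square_2x2(observed)
print(f"chi2 = {chi2:.2f}, p = {p_value:.2g}")
```

A small p-value rejects the null hypothesis that poverty status is unrelated to gender; which group is poorer still has to be read from the proportions in the table.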

(39)

Is there a significant difference in declared assets across poverty segments?

T-tests

T-tests can be used to compare two groups or treatments.

(40)

Is there a significant difference in declared assets across poverty segments?

T-tests

Hypothesis supported
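One way to compare two segments is Welch's t-test, which does not assume equal variances. The sketch below is an assumption about the variant (the slides do not say which t-test was used), the asset values are invented, and the p-value uses a normal approximation that is only reasonable for large samples.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic and its approximate degrees of freedom."""
    var_a = variance(group_a) / len(group_a)
    var_b = variance(group_b) / len(group_b)
    t = (mean(group_a) - mean(group_b)) / sqrt(var_a + var_b)
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (len(group_a) - 1) + var_b ** 2 / (len(group_b) - 1)
    )
    return t, df


def approx_two_sided_p(t):
    """Two-sided p-value from the normal approximation (large samples only)."""
    return 2 * (1 - NormalDist().cdf(abs(t)))


# Hypothetical declared assets for two poverty segments.
poorer_segment = [10, 12, 11, 13, 12, 11]
less_poor_segment = [20, 22, 21, 23, 22, 21]

t, df = welch_t(poorer_segment, less_poor_segment)
p_value = approx_two_sided_p(t)
print(f"t = {t:.2f}, df = {df:.1f}, p = {p_value:.3g}")
```

With small groups like these, the p-value should come from the t distribution with `df` degrees of freedom rather than the normal approximation used here.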

(41)

Closing remarks

Why is data analytics becoming critical

(42)
