How To Use Big Data To Help Your Business


(1)

SRA Summer Event 2013

26 June 2013

Can the use of “big data” eliminate the need for yet another traditional Census in 2021?

Keith Dugmore

(2)

Daily Mail

2 March 2011

Page 2

(3)

Agenda

Users’ needs: the value of Census-type information

What was done in 2011 (and since 1801). Time for a change?

Is there a better way? Can we learn from other countries? ONS’s “Beyond 2011” project.

Which Big Data files of government administrative and commercial customer records would be most valuable?

Opportunities, limitations, and barriers to be overcome

& my thanks and acknowledgement to:

– Barry Leventhal, Peter Furness, and Corrine Moy for using material from our joint paper in the IJMR Vol. 53 Issue 5

(4)
(5)

The Census of Population (e.g. 2011)

A unique data source

– Compulsory, 100% count of population and households (achieves 94%, plus imputation)

– Many questions / topics, including age, health, ethnicity, language, religion, qualifications, occupation, travel to work, housing tenure, household size, car ownership, etc.

– Counts of not only residents, but also workplace populations

– All four countries of the UK

Products

– Detailed statistics for very small Output Areas

– Geographical data too – OA boundaries, and a postcode / OA directory

– + Microdata files

– & all are free

(6)

Why is Census data so important to commercial companies?

Decisions, decisions…

– What areas are best for our new branches?

– What should we offer in each outlet?

– Where should we advertise?

– Who are our best customers, and prospects?

– Which areas & people should we survey?

Investments of £00s of millions to be targeted every year

The Census provides a unique range of topics, small area statistics, & consistent and often UK-wide coverage to companies such as…

(7)
(8)

Sainsbury’s estate since release of 2001 Census data

[Maps: Sainsbury’s main stores and convenience stores, March 2003 and March 2010]

(9)

User sectors – the organisations

Commercial – DUG as the tip of the iceberg of 2.3 million businesses

Other sectors have similar needs (seeking to target services to the public efficiently)

• Central government

• Local government

• Health Service

• Charities

Or have similar interests in society

• Academics – teaching and research

(10)

Analysis, and the need for Government data generally

Analyses

Local areas

Profiling individuals

Designing surveys

Data – with national coverage

Statistics

– Census-type counts for very small areas

– Sample surveys

Map data

– Background, point locations, road network, boundaries, postcode look-ups

Lists

– Big files of individual addresses & sometimes people

[Figure: map of the Bristol area showing roads (A38, A4, A37, A4174, A4320, A3029, M5, M4, M32, M49) and labelled locations: 341 Bedminster, 404 Bristol Cribbs Causeway, 509 Bristol Galleries, 689 Bristol Imperial Park, 577 Emersons Green]

[Chart: Weekly Household Income by OAC Super Groups (Average and groups 1 to 7), axis 0 to 900; plotted values in extraction order: 638.87, 523.91, 647.13, 754.63, 774.70, 387.13, 671.81, 609.63]

(11)

Census data collection

(what, still largely

(12)

A Traditional Census

Winding back 50 years to 1961……… no fundamental change

Household forms (paper)

Delivery

Collection

Some innovation: post out (2001 & 2011); post back (2011); online option 2011

(13)
(14)

But what do they do in other countries?

(15)
(16)

Big Data – Administrative and Customer files

(17)

Linkage Of Administrative Sources

– Customer Information System (DWP+HMRC)

– Patient Register

– Electoral Roll

– Higher Education (HESA)

– School Census

Linked result: Resident Population
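To make the idea of linking administrative sources concrete, here is a minimal sketch in Python. It is an illustration only, not ONS's method: the field names (`name`, `dob`, `postcode`), the match key, the rule that a person counts once they appear in at least two sources, and all the records are made-up assumptions.

```python
from collections import defaultdict

def match_key(record):
    """Build a crude match key from hypothetical name, date-of-birth and postcode fields."""
    name = "".join(record["name"].lower().split())
    postcode = record["postcode"].replace(" ", "").upper()
    return (name, record["dob"], postcode)

def link_sources(sources, min_sources=2):
    """Return {match_key: set of source names} for people seen in at least min_sources sources."""
    seen = defaultdict(set)
    for source_name, records in sources.items():
        for record in records:
            seen[match_key(record)].add(source_name)
    return {key: names for key, names in seen.items() if len(names) >= min_sources}

# Made-up records; real files would differ in content, fields and quality.
sources = {
    "patient_register": [
        {"name": "Ann Lee", "dob": "1980-01-02", "postcode": "ZZ1 1AA"},
        {"name": "Bob Ray", "dob": "1975-06-30", "postcode": "ZZ2 2BB"},
        {"name": "Cat Orr", "dob": "1990-03-15", "postcode": "ZZ3 3CC"},
    ],
    "customer_information_system": [
        {"name": "ann lee", "dob": "1980-01-02", "postcode": "zz11aa"},
    ],
    "school_census": [
        {"name": "Ann  Lee", "dob": "1980-01-02", "postcode": "ZZ1 1AA"},
        {"name": "Bob Ray", "dob": "1975-06-30", "postcode": "ZZ2 2BB"},
    ],
}

linked = link_sources(sources)
resident_estimate = len(linked)  # people confirmed by two or more sources
```

Exact-key matching like this breaks down as soon as names are misspelt or addresses are out of date, which is why data matching is usually the hard part of any linkage exercise.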

(18)

Big Data –

(19)

“Information collected by commercial companies”

A report for ONS in 2009. You can find it at:

http://www.esrc.ac.uk/_images/UKDF0710-%20Keith%20Dugmore_tcm8-8500.pdf

ONS’s interests – population estimates, 2011 Census planning, & Beyond 2011

The UK population, as customers

What information is collected? It varies greatly by sector and product (& we must remember that commercial companies that resell information – VARs such as Experian, Equifax, CACI, etc. – are a different species)

(20)

The information collected – its weaknesses, & strengths

Weaknesses:

– Not representative samples or subsets of the population as a whole: biases by region, and demographics, and the impact of marketing campaigns

– Updating of records, especially addresses and demographics, can also be patchy

– [Big government files from DWP, HMRC, Health, Education must be the first targets]

Strengths:

– Large stocks of customers (often >10 million)

– Large flows of new customers

– Timeliness

– Very detailed data on customer behaviour – transactions (+ debt & fraud)

– Insight / Intelligence from partial counts (cf. accredited “National Statistics”)

And pooling of records within sectors to build near 100% coverage can be very powerful
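The point about pooling can be illustrated with a small sketch: each company alone covers only part of the population, but the union of their customer files covers much more. Everything here is hypothetical, including the supplier names, the ten-person population, and above all the assumption that customers carry a shared identifier, which real files rarely do.

```python
def pooled_coverage(company_customers, population_ids):
    """Share of the population that appears in at least one company's file."""
    pooled = set()
    for customer_ids in company_customers.values():
        pooled |= set(customer_ids)
    return len(pooled & set(population_ids)) / len(population_ids)

# Made-up population of ten people and three overlapping customer files.
population_ids = range(1, 11)
company_customers = {
    "supplier_a": [1, 2, 3, 4],
    "supplier_b": [4, 5, 6, 7],
    "supplier_c": [7, 8, 9],
}

# Coverage of the largest single file versus the pooled files.
single_best = max(len(ids) for ids in company_customers.values()) / len(population_ids)
pooled = pooled_coverage(company_customers, population_ids)
```

In this toy case the largest single file covers 40% of the population, while the pooled files cover 90%.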

(21)

Information collected – some headlines for 6 sectors

Retail

– Huge range of products

– Superstores & local shops, but also online, catalogue, etc.

– Major companies often have 10-15 million customers

– Limited demographics collected at time of application

– Loyalty cards track spending in great detail

(22)

Information collected – Financial Services

Financial Services

– Wide range of products, e.g. current account, mortgage, savings, loans

– Various sales channels – branches, ATMs, online, post, etc.

– Often >10 million customers; aim to create a customer (cf. product) view

– Detailed demographics collected for some products, e.g. mortgage

– Current accounts & credit cards track spending in great detail

– Pooling of databases is well established, e.g. mortgages, savings, credit, fraud

(23)

Information collected – Leisure

Leisure

– Whitbread as an example

• Restaurants

• Costa Coffee

• Hotels

– Millions of customers, but we don’t provide leisure companies with much information about ourselves

(24)

Information collected – Energy

Electricity (& Gas)

– Electricity has 100% coverage, gas c.80%

– Coverage across the UK is patchy / regional

– Minimal demographic information

– Lots of effort put into maintaining address / meter files

– Good data on fraud & debt

– Pooling of databases well established – meter list used by ONS for 2011 Census to identify multi-occupied addresses; DECC statistics on energy consumption

(25)

Information collected – Water

Water

– Each water company has its own territory (NB)

– Many properties still billed according to rateable value (cf. metered)

– Lots of effort put into maintaining address files

– Minimal demographic information

– Good data on debt

(26)

Information collected – Telecoms

Telecoms

– Mobile telephone & broadband have major players, each with >15 million customers

– Mobiles – Post Pay (monthly contract – application form)

– Mobiles – Pre Pay (little information collected)

– Address information – only basic PAF for c.50% on contract

– Transaction information – stunning: full detail of every call, inc. location

(27)

Beyond 2011 – commercial data workshop

(28)

Daily Mail

2 March 2011

Page 2

(29)

Daily Mail

2 March 2011

Page 1

(30)

Beyond 2011 – commercial data workshop

ONS, January 2011: 11 B2C companies + 2 resellers

Headline conclusions

– Sharing of individual records may be prevented by various factors, such as reputational risk, or limitations due to the Data Protection Act

– Achieving a single customer view can be very difficult due to data matching problems

– Computing power is much less of an issue

– Companies would seek to help ONS if they could do so – aiming to minimise risk, effort & cost

– Anonymised records or aggregate statistics may provide mechanisms
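As a rough illustration of the aggregate-statistics route, the sketch below turns individual customer records into counts per small area, so that only the counts, not the records, would leave the company. The field names, the area codes and the records are invented for the example, and no disclosure control is applied to small counts.

```python
from collections import Counter

def counts_by_area(records, area_field="output_area"):
    """Count records per area; customer identifiers are dropped in the process."""
    return Counter(record[area_field] for record in records)

# Made-up customer records with hypothetical area codes.
customer_records = [
    {"customer_id": 101, "output_area": "OA_0001"},
    {"customer_id": 102, "output_area": "OA_0001"},
    {"customer_id": 103, "output_area": "OA_0002"},
]

area_counts = counts_by_area(customer_records)
```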

(31)
(32)

Opportunities, limitations, and barriers to be overcome

(33)

Cost profile (real terms)

[Chart: Cost against time (2011, 2021, 2031, 2041) for the Census and for the Alternative method; the chart carries a “???” label]

(34)

Statistical benefit profile

[Chart: Benefit against time (2011, 2021, 2031, 2041) for the Census and for the Alternative method, with areas labelled loss and gain]

(35)
(36)

Can Big Data meet users’ needs?

Frequency – e.g. annual

Geography – Output Areas

Topics?

– Omissions (e.g. Language)?

– Additions (e.g. Income); also proxies?

– Accuracy?

– Change / instability?

– All UK?

Multivariate analysis?

– Not just complex tables: even simple 2-way

Data sharing
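The concern about even simple 2-way tables can be sketched as follows: a cross-tabulation needs both attributes on the same record, whereas separate sources can supply only the row and column totals. The fields (`tenure`, `car`) and the records are made up for illustration.

```python
from collections import Counter

def two_way_table(records, row_field, col_field):
    """Cross-tabulate two attributes held on the same record."""
    return Counter((record[row_field], record[col_field]) for record in records)

def margin(table, axis):
    """Collapse a two-way table to one-way totals (axis 0 = rows, axis 1 = columns)."""
    totals = Counter()
    for key, count in table.items():
        totals[key[axis]] += count
    return totals

# Made-up household records carrying both attributes.
households = [
    {"tenure": "owned", "car": "yes"},
    {"tenure": "owned", "car": "yes"},
    {"tenure": "rented", "car": "yes"},
    {"tenure": "rented", "car": "no"},
]

table = two_way_table(households, "tenure", "car")
tenure_totals = margin(table, 0)
car_totals = margin(table, 1)
```

The totals can always be derived from the table, but the table cannot be rebuilt from the totals alone, which is why linked records or data sharing matter for this kind of output.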

(37)

Keith Dugmore

Demographic Decisions Ltd.

Tel: (0044) 020 7834 0966

Email: [email protected]

Web: www.demographic.co.uk
