
What Managers Need to Know about Data Science. Annie Flippo


(1)

What Managers Need to Know about Data Science

(2)

Outline  

What is data science

Industry trends

What is data

The Optimal Data Scientist

The Optimal Manager

Topics in Data Science

(3)

Who  am  I?  

Annie Flippo: Data Scientist, Software Engineer, Product Manager, Database Developer

(4)
(5)

Usage of Data Science

Finance: fraud detection, score buying habits, calculate risks

Insurance: inspect driving habits, assess risks, determine premiums

(6)

Usage of Data Science

Biometrics: wearable devices to monitor and improve health

Digital Marketing: recommender systems, audience segmentation, retargeting, churn prediction

(7)

Usage of Data Science

Retail: Walmart launches competition to solve business problems and to recruit talent

Online: Netflix launched $1 million prize to improve recommendation system

(8)

Usage of Data Science

Healthcare: Heritage Network launched a competition to predict the probability of hospitalization of patients.

Scientific: National Data Science Bowl to predict ocean health: one plankton at a time

(9)

Why Should YOU Care?

According to McKinsey [1] (2011), Big Data: The next frontier for innovation, competition, and productivity.

“By 2018, the United States alone could face a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions”

(10)

Why Should YOU Care?

According to Forbes [2] (Oct 2015), The Hunt For Unicorn Data Scientists Lifts Salaries For All Data Analytics Professionals

• Experienced data scientists are paid more than $200k per year

• Median salary for data scientist increased from $115,250 to $125,000 in one year

• Managers managing large teams can expect a median

(11)

Because it’s a growing and exciting field with high

(12)

Explosion of Data Science

Why now?

• Storage cost has decreased dramatically

• Computing power has increased exponentially

• People are carrying smartphones, mini supercomputers in their pockets

• Perfect intersection of data availability and

(13)

Massive  amount  of  data

(14)

What  is  data?

(15)

What  is  data?

(16)

What  is  data?

Or,  structured  data  from  databases…  

…  what  to  do  with  all  this  data?  

(17)

The Data Scientist

Can wrangle data from many sources or formats

(18)

The Data Scientist

do deep data explorations … and perform

(19)

DS Skills Inferred by Job Openings

• Ph.D. in math, statistics, engineering or physical science (Is it really required?)

• Has 5+ years in programming experience in Java, Scala, Python, R, SQL, MapReduce, etc.

• Has 5+ years experience in most of the Apache Open Source Technologies (e.g. Hadoop, Spark, Hive, Pig, Kafka, etc)*

• Tell a story like a novelist (coherently and beautifully)

(20)

The Optimal Data Scientist

Is a person with deep statistical and machine learning knowledge, extensive software engineering skills and well-versed in business strategy!

(21)

The Optimal Data Scientist – Take 2

Personality Traits [3]

• Compulsive

• Propulsive laziness

• Drive to create and learn

• Irritable determination

• Insensitivity to pain (hmm…)

• Integrity

(22)

The Optimal DS Manager

• Former data scientist (good to have but not necessary; that’s just asking for another unicorn!)

• Actually interested in managing people

• Thirst to learn

• Apt in managing different projects

• Patient and diplomatic to manage a diverse group of data scientists and business owners

• Understand when to go with an 80/20

(23)

Data Scientists: The Challenge of Managing Stubbornly Autonomous Experts [4]

“I noticed … that data scientists, but also statisticians and top coders, often have difficulties accepting orders from managers who don’t have technical skills

(24)

Journey to become a DS Manager

Nate Silver on Finding a Mentor, Teaching Yourself Statistics, and Not Settling in Your Career [5]

• Find a Mentor (Yes, even if you’re already a senior manager)

• Teach Yourself (online resources, MOOCs)

• Understand the life-cycle of a data-driven project

(25)

Why Just Do It?

Why do I need to learn about data science and manage data projects?

“I have [insert # of years] years of experience in [insert my industry]. I’m comfortable and successful

(26)
(27)

Data  Sources

(28)
(29)
(30)
(31)

Machine  Learning

(32)

Machine  Learning

(33)

Data  Science  Concepts

Predictive Analytics

Classification

(34)
(35)

Topics in Cloud Computing

(36)

Your  Job:  Provide  Guidance

Tell us a data story … about your business

Do you understand the outcome?

What is your recommendation to the business?

(37)

Getting Started: Locally

Meetups

• LA R users group

• LA Machine Learning

• LA Data Warehouse, BI & Analytics

• LA Big Data Users Group

Conferences:

• datascience.la

(38)

Getting Started: Podcasts

(39)
(40)

Good  Places  to  Start

Data Science for Business

by Foster Provost & Tom Fawcett

(41)

Good  Places  to  Start

Doing Data Science

by Rachel Schutt & Cathy O’Neil (mathbabe.org)

Free at

(42)

Good  Places  to  Start

The Art of Data Science

by Roger Peng & Elizabeth Matsui

 

(43)

Get  Kids  Started

(44)

Thank  You!

Annie  Flippo   @ACflippo  

(45)

References

1. http://www.mckinsey.com/insights/business_technology/big_data_the_next_frontier_for_innovation

2. http://www.forbes.com/sites/gilpress/2015/10/09/the-hunt-for-unicorn-data-scientists-lifts-salaries-for-all-data-analytics-professionals/

3. http://cdn.oreillystatic.com/en/assets/1/event/119/Data%20Science%20Bootcamp%20Presentation.pdf

4. http://www.ibmbigdatahub.com/blog/data-scientists-challenge-managing-stubbornly-autonomous-experts

5. https://hbr.org/2013/09/nate-silver-on-finding-a-mentor-teaching-yourself-statistics-and-not-settling-in-your-career/

6. http://www.nytimes.com/2012/06/26/technology/in-a-big-network-of-computers-evidence-of-machine-learning.html

7. http://research.google.com/archive/unsupervised_icml2012.html
