What Managers
Need to Know
about Data Science
Outline
• What is data science
• Industry trends
• What is data
• The Optimal Data Scientist
• The Optimal Manager
• Topics in Data Science
Who am I?
Annie Flippo: Data Scientist, Software Engineer, Product Manager, Database Developer
Usage
of Data Science
Finance: fraud detection, score buying habits, calculate risks
Insurance: inspect driving habits, assess risks, determine premiums
Usage
of Data Science
Biometrics: wearable devices to monitor and improve health
Digital Marketing: recommender systems, audience segmentation, retargeting, churn prediction
Usage
of Data Science
Retail: Walmart launches a competition to solve business problems and to recruit talent
Online: Netflix launched a $1 million prize to improve its recommendation system
Usage
of Data Science
Healthcare: Heritage Network launched a competition to
predict the probability of hospitalization of patients.
Scientific: National Data Science Bowl to predict ocean health: one plankton at a time
Why Should
YOU
Care?
According to McKinsey¹ (2011), Big Data:
The next frontier for innovation, competition, and productivity.
“By 2018, the United States alone could face a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions”
Why Should
YOU
Care?
According to Forbes² (Oct 2015), The Hunt
For Unicorn Data Scientists Lifts Salaries For All Data Analytics Professionals
• Experienced data scientists are paid more than $200k
per year
• Median salary for data scientists increased from
$115,250 to $125,000 in one year
• Managers managing large teams can expect a median
Because it’s a growing and exciting field with high
Explosion of Data Science
Why now?
• Storage cost has decreased dramatically
• Computing power has increased exponentially
• People are carrying smartphones, mini
supercomputers in their pockets
• Perfect intersection of data availability and
Massive amount of data
What is data?
Or, structured data from databases…
… what to do with all this data?
The Data Scientist
Can wrangle data from many sources or formats
The Data Scientist
do deep data explorations …
and perform
DS Skills Inferred by Job Openings
• Ph.D. in math, statistics, engineering or
physical science (Is it really required?)
• Has 5+ years of programming experience in
Java, Scala, Python, R, SQL, MapReduce, etc.
• Has 5+ years of experience in most of the
Apache Open Source Technologies (e.g. Hadoop, Spark, Hive, Pig, Kafka, etc.)*
• Tell a story like a novelist (coherently and
beautifully)
The Optimal Data Scientist
Is a person with deep statistical and machine learning knowledge, extensive software
engineering skills and well-versed in business strategy!
The Optimal Data Scientist – Take 2
Personality Traits³
• Compulsive
• Propulsive laziness
• Drive to create and learn
• Irritable determination
• Insensitivity to pain (hmm…)
• Integrity
The Optimal DS Manager
• Former data scientist (good to have but not
necessary; that’s just asking for another unicorn!)
• Actually interested in managing people
• Thirst to learn
• Adept at managing different projects
• Patient and diplomatic enough to manage a diverse
group of data scientists and business owners
• Understand when to go with an 80/20
Data Scientists: The Challenge of Managing
Stubbornly Autonomous Experts⁴
“I noticed … that data
scientists, but also statisticians and top coders, often have
difficulties accepting orders from managers who don’t have technical skills
Journey to become a DS Manager
Nate Silver on Finding a Mentor, Teaching
Yourself Statistics, and Not Settling in Your
Career⁵
• Find a Mentor (Yes, even if you’re already a
senior manager)
• Teach Yourself (online resources, MOOCs)
• Understand the life-cycle of a data-driven
project
Why Just Do It?
Why do I need to learn about data science
and manage data projects?
“I have [insert # of years] years of
experience in [insert my industry].
I’m comfortable and successful
Data Sources
Machine Learning
Data Science Concepts
Predictive Analytics
Classification
Topics in Cloud Computing
Your Job: Provide Guidance
Tell us a data story
… about your business
Do you understand the outcome?
What is your
recommendation to the business?
Getting Started: Locally
Meetups
• LA R users group
• LA Machine Learning
• LA Data Warehouse, BI & Analytics
• LA Big Data Users Group
Conferences:
• datascience.la
Getting Started: Podcasts
Good Places to Start
Data Science for Business
by Foster Provost & Tom Fawcett
Good Places to Start
Doing Data Science
by Rachel Schutt & Cathy O’Neil (mathbabe.org)
Free at
Good Places to Start
The Art of Data Science
by Roger Peng & Elizabeth Matsui
Get Kids Started
Thank You!
Annie Flippo @ACflippo