BIG DATA MISTAKES REPORT. 100 First Hits on Google. Analyzed, Summarized and Narrated!


Copyright © 2013. 100FirstHits.com

www.100FirstHits.com

BIG DATA MISTAKES REPORT 2013

100 First Hits™

The Big 3

1. Lack of Competence
2. Lack of Goals
3. Lack of Strategy and Corporate Support

The Big 3: 53,7% of the total score

Scores (share of the total score for the remaining categories)

Poor Data Quality: 11,4%
Seek Data Perfection: 9,9%
Aiming too high: 5,9%
Lack of Change Management: 5,1%
Seeking cause over correlation: 4,9%
Lack of Data Relevance: 4,5%
Lack of Governance: 2,4%
Select the right visualizations: 1,5%
Lack of Permission: 0,8%

Lack of Competence

Big data platforms are relatively new. Most organizations don't have trained teams in place to use the platforms successfully.

Understanding what Big Data technologies are good for is crucial for success. Traditional warehouse solutions could still be a better fit depending on the application.

Combining Big and Small Data is key to success. Big Data tells what, based on what happened in the past. Small data can tell why and explain the past, so that we can predict a changing future based on a better understanding of what is causing this change.

–Albert Fitzgerald

Your team needs to understand how to create training models, how to identify the best predictors from a wide range of independent variables, and so on.

Lack of Goals

Many organizations fall into the trap of collecting and analyzing data purely for the sake of doing it. The problem is without a clear business objective, the data is unlikely to yield any clear benefits or useful business intelligence.

–Nelson Estrada

If you don’t ask the right questions, the data will ultimately be useless to you. Examples of good questions to ask are:

-Who will buy?
-Who will leave / churn?
-What will break?
-Who will pre-pay?
-What will sell?
-Etc.

Another common mistake is to attempt to set goals that the data does not support.

Lack of Strategy & Corporate Support

A lot of companies consider Big Data a “side-job”. Going down that road, you might as well stop doing Big Data at all.

What comes out of Big Data analysis often affects both corporate and operational strategies, and the decisions to make can be hard and risky.

After all, Big Data is all about unlocking new forms of value, and in the end, it is still up to humans to decide what to go after.

TOP 3: 53,7% of the total score

[Bar chart: sum of score per category, ranging from 1450 (Lack of Competence) to 42 (Lack of Permission).]

Seek Data Perfection

Big data beats sampling, hands down. In the past everyone relied on small data sets, or “samples.” But you needn't settle for samples. Now it’s about using as much data as you can get your hands on, which lights the way to new insights never before available.

-Daniel Kehrer

The benefits of using vastly more data of variable quality outweigh the costs of using smaller amounts of very exact data.

-Cukier and Mayer-Schoenberger

What we see here are two different approaches to data quality, each with its own area of application: correlation vs. predictive modeling.

Poor Data Quality

Shit in. Shit out. Failing to understand the quality of existing data in legacy systems is a huge pitfall, and companies often don't spend enough time on it.

Many data scientists spend most of their time preparing data to ensure that results are not skewed or subject to confirmation bias.

-Eric A. King

Contradiction?

[Bar chart: average score per category, on a scale from 0 to 100.]

Aiming Too High

How do you eat an elephant? -One bite at a time! Don’t try to import, load and link all your data at once. That is costly and time-consuming.

Too many companies start out with expensive and high-risk big-data initiatives. Big-bang implementations are rarely a path to success.

-Shira Ovide

Lack of Change Management

Many companies are afraid of running their business on data. People are mourning the loss of creativity and common sense. With proper change management you can instead heighten the need for creativity and common sense.

Seeking Cause Over Correlation

Sometimes it is enough to understand what happens, and not bother too much over why it happens. Google, for instance, correlated certain search queries with geographic locations, and by doing so was able to track flu outbreaks much faster than authorities worldwide.

About the Author

Daniel Garplid, born in 1976

• Founder of the internet research company 100FirstHits.com
  http://www.100firsthits.com
• Founder of the real-time business intelligence company Manifact AB
  http://www.manifact.com

Contact and Feedback

E. [email protected]
Ph. +46 735 10 2770
http://twitter.com/danielgarplid
http://www.linkedin.com/in/garplid

Interested in being contacted for:

-Feedback
-Talks and Trainings
-Consultancy
-Business deals

Methodology

I searched for big data mistakes on Google. I then opened the first link, recorded every header/bullet, and took notes on what each one was about. The first header/bullet got a score of 100, the second got a score of 99, and so on until I had collected 100 headers/bullets. I then categorized the data and summed the score for each category. Search engine used in this research: Google
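The scoring described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the tooling actually used for the report: the function name `score_headers` and the five-item example list are hypothetical, and the real input would be the 100 categorized headers/bullets in the order they were found.

```python
from collections import defaultdict


def score_headers(categories):
    """Score an ordered list of category labels, one label per header/bullet.

    The first header scores len(categories), the next one point less, and so
    on down to 1 (that is 100, 99, ..., 1 when 100 headers were collected).
    Returns one row per category, sorted by summed score, highest first.
    """
    n = len(categories)
    totals = defaultdict(int)
    counts = defaultdict(int)
    for position, category in enumerate(categories):
        totals[category] += n - position
        counts[category] += 1

    grand_total = sum(totals.values())
    rows = []
    for category in sorted(totals, key=totals.get, reverse=True):
        rows.append({
            "category": category,
            "sum": totals[category],
            "share": totals[category] / grand_total,
            "count": counts[category],
            "average": totals[category] / counts[category],
        })
    return rows


# Hypothetical input: five headers, already categorized, in the order found.
example = ["Competence", "Goals", "Competence", "Quality", "Goals"]
rows = score_headers(example)
for row in rows:
    print(row)
```

With 100 headers the scores 100 + 99 + ... + 1 always add up to 5050, which is the grand total reported in the Scores table.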

THE LAST FOUR

Lack of Data Relevance: 228
Lack of Governance: 121
Select the right visualizations: 74
Lack of Permission: 42

Scores

Categories                              Sum of Score  % of total  Count of Header  Average of Score
Lack of Competence                      1450          28,7%       27               54
Lack of Goals                           655           13,0%       14               47
Lack of Strategy and Corporate support  608           12,0%       12               51
Poor Data Quality                       575           11,4%       13               44
Seek Data Perfection                    499           9,9%        6                83
Aiming too high                         297           5,9%        8                37
Lack of Change Management               256           5,1%        4                64
Seeking cause over correlation          245           4,9%        3                82
Lack of Data Relevance                  228           4,5%        5                46
Lack of Governance                      121           2,4%        5                24
Select the right visualizations         74            1,5%        1                74
Lack of Permission                      42            0,8%        2                21
Grand Total                             5050          100,0%      100              51
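As a quick check, the headline percentages can be recomputed from the Sum of Score column. The sketch below copies the sums from the table; the variable names and the `share` helper are illustrative assumptions, and Python prints decimal points where the report uses decimal commas.

```python
# Sum of score per category, copied from the Scores table.
sums = {
    "Lack of Competence": 1450,
    "Lack of Goals": 655,
    "Lack of Strategy and Corporate support": 608,
    "Poor Data Quality": 575,
    "Seek Data Perfection": 499,
    "Aiming too high": 297,
    "Lack of Change Management": 256,
    "Seeking cause over correlation": 245,
    "Lack of Data Relevance": 228,
    "Lack of Governance": 121,
    "Select the right visualizations": 74,
    "Lack of Permission": 42,
}

total = sum(sums.values())


def share(value, total):
    """Return value as a percentage of total, rounded to one decimal."""
    return round(100 * value / total, 1)


# Rank the sums from highest to lowest and add up the leading categories.
ranked = sorted(sums.values(), reverse=True)
top3 = share(sum(ranked[:3]), total)
top5 = share(sum(ranked[:5]), total)

print(total, top3, top5)  # prints: 5050 53.7 75.0
```

The results match the figures quoted in the report: 53,7% for the top three categories and 75% for the top five.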

“Big Data” Mistakes

Lack of Data Relevance

Just because big data allows you to use huge data sets does not mean you should include all of your data in an analysis.

Lack of Governance

Who has the rights to create, approve, edit, or remove data from the system?

Select the right visualizations

You will dilute the value of even the best statistical models if you don’t choose the right type of visualizations.

Lack of Permission

More and more data is collected from consumers without their explicit knowledge or permission.

Receive information about updated and newly released research by subscribing to our mailing list. No SPAM, and we will never share or sell your email address to third parties. Sign up at:

http://www.100FirstHits.com/BigData.html

THE BIG 5: 75% of the total score

Important Message!

In previous research releases we included the raw data at the end of every report. The response from our audience is that this is unnecessary.

However, if you are interested in the raw data, we will happily send it over to you upon request. Just send an email to:

[email protected]

Best regards,
