Research Report

The Convergence of Big Data Processing and Integrated Infrastructure

By Evan Quinn, Senior Principal Analyst and Bill Lundell, Senior Research Analyst

With Brian Babineau, Vice President of Research and Analyst Services


Introduction

Research Objectives

In order to assess current data analytics and processing trends, as well as plans for the next 12-18 months, ESG recently surveyed 399 North American IT and business professionals representing midmarket (100 to 999 employees) and enterprise-class (1,000 employees or more) organizations. Respondents were familiar with their organization’s current data analytics environment and processes, as well as forward-looking strategies involving the infrastructure and platforms necessary to support data analytics initiatives.

The survey was designed to answer the following questions:

• How important is the enhancement of data analytics capabilities relative to all of an organization’s business and IT priorities?

• What is associated with the term big data?

• What are the trends for current usage and planned adoption of MapReduce framework technology?

• What is the size of the largest data set upon which an organization conducts data analytics activities?

• How many unique sources do organizations integrate as part of their largest data sets?

• How frequently do organizations update their largest data set?

• What kind of tools do organizations use to integrate the data sources populating their largest data sets?

• What sources and data types comprise organizations’ largest data sets?

• What data analytics and/or processing challenges do organizations face with respect to their largest data sets?

• What types of data analytics platforms have organizations deployed to support their largest data sets? What benefits have they derived from these platforms?

• What types of data analytics platforms do organizations anticipate deploying in support of their fastest growing data sets? What requirements are driving these changes?

• Are the sources populating organizations’ largest data sets geographically dispersed? What challenges does this present?

• What are the must-have data management features/functionality for data analytics platforms and infrastructure?

• What kind of storage technologies do organizations use to support their data analytics and processing activities? Which are most pervasive and how will this change going forward?

• How much downtime can organizations tolerate when it comes to their data analytics platforms? What data protection technologies do they have in place to support these requirements?

Survey participants represented a wide range of industries including manufacturing, financial services, communications and media, health care, and retail. For more details, please see the Research Methodology and Respondent Demographics sections of this report.


Research Methodology

To gather data for this report, ESG conducted a comprehensive online survey of IT professionals from private- and public-sector organizations in North America (United States and Canada) between March 5, 2012 and March 12, 2012. To qualify for this survey, respondents were required to be IT or business professionals personally responsible for their organization’s data analytics and processing environment, including the software/applications and/or the underlying platforms and systems. All respondents were provided an incentive to complete the survey in the form of cash awards and/or cash equivalents.

After filtering out unqualified respondents, removing duplicate responses, and screening the remaining completed responses (on a number of criteria) for data integrity, we were left with a final total sample of 399 IT and business professionals.

Please see the Respondent Demographics section of this report for more information on these respondents. Note: Totals in figures and tables throughout this report may not add up to 100% due to rounding.
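The rounding note above can be illustrated with a minimal sketch. The counts below are hypothetical and are not survey data; the point is only that shares rounded to whole percentages need not sum to exactly 100.

```python
# Hypothetical counts (NOT survey data): three equally sized groups.
counts = {"Group A": 1, "Group B": 1, "Group C": 1}
total = sum(counts.values())

# Round each share to the nearest whole percent, as a chart label would.
rounded = {name: round(100 * n / total) for name, n in counts.items()}

print(rounded)                 # each share is displayed as 33
print(sum(rounded.values()))   # the displayed shares sum to 99, not 100
```

The same effect can push a displayed total above 100% when several shares round up.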


Respondent Demographics

The data presented in this report is based on a survey of 399 qualified respondents. The figures below detail the demographics of the respondent base, including individual respondents’ current job responsibility, technology responsibility, and job function, as well as the respondent organizations’ total number of employees, primary industry, and annual revenue.

Respondents by Data Analytics Job Responsibility

The breakdown of current job responsibility within an organization among survey respondents is shown in Figure 1.

Figure 1. Survey Respondents, by Data Analytics Job Responsibility

Which of the following best describes your current responsibility with respect to your organization’s data analytics and processing environment? (Percent of respondents, N=399)

• IT Operations – my primary responsibility includes supporting the underlying data analytics and processing infrastructure, 50%

• IT Application Development & Support – my primary responsibilities include the support and maintenance of data analytics and processing software, 28%

• Line-of-business support (non-IT) – my responsibilities include data analytics and processing support for the business, 22%

Source: Enterprise Strategy Group, 2012.

Respondents by Technology Responsibility

IT operations respondents’ primary area of technology responsibility is shown in Figure 2.

Figure 2. Survey Respondents, by Technology Responsibility

Which of the following would you consider to be your primary area of technology responsibility? (Percent of respondents, N=199)

• IT operations, 29%

• Applications/database

• IT architecture/planning

• General IT, 16%

• Data protection, 6%

• Servers, 5%

• Storage / SAN, 2%

• Other, 1%


Respondents by Job Function

The primary job function among survey respondents responsible for their organization’s application environment is shown in Figure 3.

Figure 3. Survey Respondents, by Job Function

Which of the following best describes your primary job function? (Percent of respondents, N=200)

• Business manager, 41%

• Business analyst, 15%

• Applications/database, 13%

• Data analyst, 11%

• Data warehouse/business intelligence, 5%

• Data scientist, 3%

• Reports administrator, 3%

• Other, 10%

Source: Enterprise Strategy Group, 2012.

Respondents by Number of Employees

The number of employees in respondents’ organizations is shown in Figure 4.

Figure 4. Survey Respondents, by Number of Employees

How many total employees does your organization have worldwide? (Percent of respondents, N=399)

• 100 to 249, 18%

• 250 to 499, 17%

• 500 to 999, 11%

• 1,000 to 2,499, 9%

• 2,500 to 4,999, 12%

• 5,000 to 9,999, 8%

• 10,000 to 19,999, 8%

• 20,000 or more, 18%

Source: Enterprise Strategy Group, 2012.


Respondents by Industry

Respondents were asked to identify their organization’s primary industry. In total, ESG received completed, qualified responses from individuals in 20 distinct vertical industries, plus an “Other” category. Respondents were then grouped into the broader categories shown in Figure 5.

Figure 5. Survey Respondents, by Industry

What is your organization’s primary industry? (Percent of respondents, N=399)

• Manufacturing, 18%

• Financial (banking, securities, insurance), 13%

• Government (Federal/National, State/Province/Local), 13%

• Communications & Media, 10%

• Business Services (accounting, consulting, legal, etc.), 9%

• Retail/Wholesale, 8%

• Health Care, 6%

• Other, 25%

Source: Enterprise Strategy Group, 2012.

Respondents by Annual Revenue

The annual revenue of respondents’ organizations is shown in Figure 6.

Figure 6. Survey Respondents, by Annual Revenue

What is your organization’s total annual revenue ($US)? (Percent of respondents, N=399)

• Less than $50 million, 20%

• $50 million to $99 million, 15%

• $100 million to $499 million, 15%

• $1 billion to $4.999 billion, 12%

• $5 billion to $9.999 billion, 7%

• $10 billion to $19.999 billion, 5%

• $20 billion or more, 10%

• Not applicable (e.g., public sector, non-profit), 9%


Contents

List of Figures

List of Tables

Executive Summary

Report Conclusions

Introduction

Research Objectives

Research Findings

The Increasing Importance of Analytics – Thank You, Big Data

The Impact of Big Data on Analytics

Big Data Analytics Platforms

Security Considerations for Big Data

Data Analytics Storage and IT Infrastructure Requirements

Increasing Interest in Hadoop MapReduce Framework Technology

Conclusion

Research Implications for Technology Vendors

Research Implications for IT Professionals

Research Methodology

Respondent Demographics

Respondents by Data Analytics Job Responsibility

Respondents by Technology Responsibility

Respondents by Job Function

Respondents by Number of Employees

Respondents by Industry


List of Figures

Figure 1. Importance of Enhancing Data Processing and Analytics Activities

Figure 2. Meaning of the Term Big Data

Figure 3. Size of Largest Data Set for Data Analytics and Processing Functions

Figure 4. Number of Data Sources Integrated to Support Data Analytics Activities on Largest Data Set

Figure 5. Number of Data Sources Integrated to Support Data Analytics Activities on Largest Data Set, by Company Size

Figure 6. Update Frequency of Largest Data Set

Figure 7. Primary Method of Integrating Data Sources in Largest Data Set

Figure 8. Primary Method of Integrating Data Sources in Largest Data Set, by Largest Data Set Update Frequency

Figure 9. Sources Responsible for Populating Largest Data Set

Figure 10. Types of Data in Largest Data Set

Figure 11. Types of Data Processing and Analytics Activities Conducted on Largest Data Set

Figure 12. Data Processing and/or Analytics Challenges with Largest Data Set

Figure 13. Data Processing and Analytics Platforms Currently Deployed to Support Largest Data Set

Figure 14. Key Benefits Organizations Have Derived from Data Analytics Platforms

Figure 15. Plans to Deploy New Data Analytics Platform to Support Fastest Growing Data Set

Figure 16. Data Analytics Platform Organizations Plan to Deploy to Support Fastest Growing Data Set

Figure 17. Requirements Driving Organizations to Evaluate New Data Analytics Solutions for Fastest Growing Data Set

Figure 18. Geographic Dispersion of Largest Data Set

Figure 19. Challenges of a Geographically Dispersed Data Set

Figure 20. Importance of Features/Functionality in Considering Data Analytics Infrastructure and Platforms

Figure 21. Disk-based Storage Used to Support Data Analytics and Processing Activities

Figure 22. Percent of Total Volume of Data Analytics/Processing Activity Stored on Disk-based Storage

Figure 23. Challenges Scaling Storage Environment to Support Data Analytics and/or Processing Activities

Figure 24. Infrastructure for Data Analytics and Processing Activities

Figure 25. Amount of Downtime Data Analytics Platforms Can Tolerate

Figure 26. Data Protection / Availability Technologies Currently Deployed to Support Data Analytics Platforms

Figure 27. Interest in MapReduce Technology

Figure 28. Interest in MapReduce Technology, by Company Size

Figure 29. Interest in MapReduce Technology, by Size of Largest Data Set

Figure 30. Survey Respondents, by Data Analytics Job Responsibility

Figure 31. Survey Respondents, by Technology Responsibility

Figure 32. Survey Respondents, by Job Function

Figure 33. Survey Respondents, by Number of Employees

Figure 34. Survey Respondents, by Industry


List of Tables

Table 1. Size of Largest Data Set for Data Analytics and Processing Functions, by Company Size

Table 2. Sources Responsible for Populating Largest Data Set, by Company Size

Table 3. Data Processing and/or Analytics Challenges with Largest Data Set, by Role

Table 4. Geographic Dispersion of Largest Data Set, by Company Size

Table 5. Challenges of a Geographically Dispersed Data Set, by Company Size

Table 6. Disk-based Storage Used to Support Data Analytics and Processing Activities, by Role and Size of Largest Data Set

Table 7. Percent of Total Volume of Data Analytics/Processing Activity Stored on SAN-based Storage, by Size of Largest Data Set

All trademark names are property of their respective companies. Information contained in this publication has been obtained by sources The Enterprise Strategy Group (ESG) considers to be reliable but is not warranted by ESG. This publication may contain opinions of ESG, which are subject to change from time to time. This publication is copyrighted by The Enterprise Strategy Group, Inc. Any reproduction or redistribution of this publication, in whole or in part, whether in hard-copy format, electronically, or otherwise to persons not authorized to receive it, without the express consent of The Enterprise Strategy Group, Inc., is in violation of U.S. copyright law and will be subject to an action for civil damages and, if applicable, criminal prosecution. Should
