Meeting the Big Data Challenge: Get Close, Get Connected

Created by CITO Research. Sponsored by Equinix.

Introduction
The Untold Story of Big Data
Use Cases for Big Data Analytics: Your Network Matters
Use Case 1: Moving the Data to the Best Architecture
Use Case 2: Collecting Data Across the Internet
Use Case 3: Bursting Into the Public Cloud When Data Gets Too Big
Use Case 4: Time-Sensitive Big Data Applications

Executive Summary

In boardrooms around the world, C-suite executives are clamoring to join the big data party. And for good reason: the value that can be derived from analyzing big data is well known. Cost savings, productivity gains and compelling customer insights are but a few of the benefits of big data in the enterprise.

There is a hidden challenge to analyzing big data, one that is often ignored: the fluid movement of data from one place to another. In order to collect, store and analyze big data, it has to move, and it has to move fast.

When it comes to the movement of big data, there is an important component that can make or break success: the network. This CITO Research white paper explains the big data network challenge and offers solutions to overcome it so that you can derive value faster.

Introduction

Big data is changing everything. What is big data? It’s data so big that your current infrastructure and tools can’t handle it. Excel breaks down. Databases break down. Data warehouses can’t handle the volume. That’s big data.

It’s no secret that companies around the world are doing big data deep dives to find out more about their customers and to drive sales. Facebook is now matching its advertising partners’ loyalty card data with individual Facebook profiles using email addresses and phone numbers. Considering that there are close to 900 million monthly active users on the social network, that’s a lot of data to crunch. The value of learning how to use big data for competitive advantage cannot be overstated. Companies that leverage big data to better understand and target their customers and transform their business processes have outpaced, or will quickly outpace, those that don’t.

If you are in the process of determining how you plan to use big data in your company, this is an important paper to read. This paper provides information about some practical aspects of working with big data that you probably haven’t considered.

The Untold Story of Big Data

CITO Research has found an important big data story that is not being reported. If you think of an overall architectural solution for computing as a three-legged stool, what you find is that two of the three legs are solid. The first leg, hardware, has advanced over the last 10 years. So has the second leg, software, with software as a service, apps, and big data toolsets such as Hadoop and Amazon RedShift crunching data at scale.

There’s one area where advances are conspicuously absent, and that is the network. We are more reliant on networks than ever, yet variations in network performance across the Internet are extreme, and our three-legged stool is lopsided as a result. To illustrate, Figure 1 shows the variability in moving a mere 12 TB of data, hardly big data, from one cloud provider to another. The transfer took one provider 4 hours and another a full 7 days.

Figure 1: Wild Variations in Moving Data from Cloud to Cloud. Bar chart of the estimated minimum hours to transfer 12 TB; vertical axis: number of hours (data source: Nasuni).
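To put the two extremes quoted above in perspective, the following minimal Python sketch converts 12 TB in 4 hours and 12 TB in 7 days into average throughput. It assumes decimal terabytes (10^12 bytes), which the source does not specify, and the function name effective_gbps is purely illustrative.

```python
# Average throughput implied by moving 12 TB in a given number of hours.
# Assumption: decimal units (1 TB = 10**12 bytes); the paper does not say which.

TB = 10**12  # bytes per terabyte (decimal convention, an assumption)


def effective_gbps(terabytes: float, hours: float) -> float:
    """Average throughput in gigabits per second for a completed transfer."""
    bits = terabytes * TB * 8
    seconds = hours * 3600
    return bits / seconds / 10**9


fastest = effective_gbps(12, 4)        # the 4-hour transfer
slowest = effective_gbps(12, 7 * 24)   # the 7-day (168-hour) transfer

print(f"4 hours: {fastest:.2f} Gbps")
print(f"7 days:  {slowest:.3f} Gbps")
print(f"ratio:   {fastest / slowest:.0f}x")
```

Under these assumptions the fast path sustains several gigabits per second while the slow path averages well under one, a gap of roughly 42 times for the same payload.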

If you want to gain competitive advantage in using big data, you need to consider where big data is coming from, where it will go, and how you plan to analyze it. To be successful with big data, barriers within the infrastructure must be overcome to achieve high throughput and lower latency. Big data analytics projects have different network requirements, which vary based on use case.

Use Cases for Big Data Analytics: Your Network Matters

There are many use cases for big data analytics; this paper addresses four of them:

1. Moving big data to the most effective architecture to process it

2. Collecting big data from multiple sources (e.g., social networks) across the Internet for analysis

3. Using the cloud’s elastic capability when data is too big for your infrastructure, requiring you to move the data to the public cloud

4. Offering a big data application that depends on (or is differentiated by) speed

Use Case 1: Moving the Data to the Best Architecture

Big data comes with a build-or-buy decision, and the cloud makes this choice more interesting (and potentially mind-boggling) than ever. Should we set up a Hadoop cluster or rent one?

Whether you’re working with a company such as GoodData, which offers a complete analytics infrastructure in the cloud, or moving your data into Amazon RedShift’s online data warehouse to do your crunching there, or taking advantage of any number of other leading-edge services, there’s one common denominator. You have to move big data. The speed with which you can move your data and get analytical results becomes a clear differentiator. Choosing a provider with high-speed direct connections to your service means that each time you move data, it’s going as fast as possible. This can mean the difference between seeing results in a day or in two weeks.

What’s a direct connection?

A direct connection bypasses the Internet entirely. Instead of having your data routed across the Internet (with the variation and unpredictability of that best-effort routing), a direct connection is a shortcut from one well-connected provider to another. It’s up to 80% faster than sending data over the Internet.

Use Case 2: Collecting Data Across the Internet

Information can flow in different ways, but if your big data use case depends on collecting data from multiple sources in a time-sensitive fashion, having fast connections to those sources makes a big difference.

Let’s say you’re pulling data from your own point-of-sale systems, from Facebook, Foursquare, and Twitter. Whether you’re monitoring your brand or offering targeted promotions to customers via mobile, there’s a great deal of value in being able to pull data from all these sources. To collect that much data and then crunch it in time to matter, you need to be able to access it fast.

As another example, say Coca-Cola decided to advertise to people who have bought Pepsi in the last month. In order to do that, data from hundreds, possibly thousands, of retailers would have to be quickly collected so the ad opportunity isn’t lost.

Choosing a provider that has direct connections to the vast majority of these data sources means that you can collect big data fast. Once you have the data, whether you’re gauging reaction to your new product initiative or a new marketing campaign or making real-time offers to customers, you can analyze it and use it effectively.

For these applications, stale data is next to worthless. Your offers need to keep pace with what’s happening. Direct connections to data sources are a differentiator and give you a competitive edge.

Use Case 3: Bursting Into the Public Cloud When Data Gets Too Big

Some companies have huge batch jobs for big data. When you are looking at billions of transactions, it’s impossible to crunch those kinds of numbers in a traditional infrastructure. Doing the monthly or quarterly reports on numbers like those requires additional infrastructure. This is when it’s time to burst into the Amazon public cloud.

Having a direct connection into the public cloud provides elasticity to handle large jobs as needed.

This use case can occur at the beginning of a company’s exploration of big data, where the public cloud is an attractive option, as well as later in the lifecycle, when a company has used the public cloud intensively enough that it needs a more cost-effective private cloud infrastructure. Periodically you need to go beyond the capabilities of your private infrastructure. When this happens, bursting into the public cloud securely and easily makes good sense (this is referred to as a hybrid cloud).

Selecting a provider that offers a direct connection into the Amazon public cloud enables cost-effective scaling whenever it becomes necessary.

This strategy also works for companies that need to handle usage spikes. Public cloud elasticity provides support to scale under heavy loads.
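The bursting decision described above can be sketched as a simple capacity check. This is a minimal illustration, not a sizing method from the paper: the function name extra_capacity_needed and every number (transaction counts, private capacity, deadline) are hypothetical.

```python
# Hypothetical capacity check for bursting a batch job into the public cloud.
# All figures below are illustrative assumptions, not values from the paper.
import math


def extra_capacity_needed(transactions: int,
                          private_per_hour: int,
                          deadline_hours: float) -> int:
    """Transactions per hour that must be rented from the public cloud
    so the whole job finishes within the deadline (0 means no burst)."""
    required_per_hour = math.ceil(transactions / deadline_hours)
    return max(0, required_per_hour - private_per_hour)


# A large monthly report: 2 billion transactions, due within 24 hours,
# on private infrastructure that handles 50 million transactions per hour.
monthly = extra_capacity_needed(2_000_000_000, 50_000_000, 24)

# A routine daily job that fits comfortably within private capacity.
daily = extra_capacity_needed(400_000_000, 50_000_000, 24)

print(monthly)  # 33333334
print(daily)    # 0
```

In this sketch the daily job never leaves the private infrastructure, while the monthly job rents only the shortfall, which is the elastic behavior the hybrid cloud approach is meant to provide.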

Figure 2: Using a Hybrid Cloud Solution to Increase Big Data Efficiency. Diagram titled "Analyze Big Data with Hybrid Cloud": your company’s private infrastructure and the Amazon public cloud are joined by a hybrid cloud. Callouts: manage your regulatory constraints; use hybrid cloud to connect private and public; retain access during demand spikes.

Use Case 4: Time-Sensitive Big Data Applications

The fourth and most latency-sensitive use case involves an application that depends on bringing in big data, crunching it, and fueling an application’s response with the results.

One of the most difficult cases is advertising exchanges. According to Forrester Research, successful real-time bidding (RTB) for advertising requires companies to receive, analyze, and bid on individual ad impressions within 100 milliseconds. The faster bidding happens, the higher the likelihood of completed transactions, and larger transaction volumes create more revenue for all participants.
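A rough latency budget shows why the network matters inside the 100-millisecond window. In the sketch below only the 100 ms deadline comes from the text; the round-trip times, the fixed overhead, and the function name decision_time_ms are assumptions chosen for illustration.

```python
# Hypothetical latency budget for one real-time bid.
# Only the 100 ms deadline comes from the text; every other number is assumed.

DEADLINE_MS = 100.0


def decision_time_ms(round_trip_ms: float, overhead_ms: float = 10.0) -> float:
    """Time left to analyze and price an impression after the network
    round trip and a fixed processing overhead are subtracted."""
    return max(0.0, DEADLINE_MS - round_trip_ms - overhead_ms)


internet_rtt_ms = 60.0  # assumed round trip over the public Internet
direct_rtt_ms = 12.0    # assumed round trip over a direct connection

over_internet = decision_time_ms(internet_rtt_ms)
over_direct = decision_time_ms(direct_rtt_ms)

print(over_internet)  # 30.0
print(over_direct)    # 78.0
```

With these assumed figures, the shorter round trip more than doubles the time available for the bidding decision itself.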

BrightRoll needed high-speed connectivity to boost the execution speed of its video ads, which would give digital video advertising buyers precious additional milliseconds of decision-making time to improve audience targeting and increase ROI for their ad buys. By choosing a provider that offers direct connections, BrightRoll increased performance by 80% compared with going over the Internet and boosted transaction volume by 10%.

There’s another key aspect of this use case. Applications have users all over the world. Performance depends on having a provider with fast global points of presence. Direct connections bypass the Internet where needed; global data centers provide fast access into the largest regional providers.

For example, content sharing platform Box turned to a well-connected provider to add capacity to Box Accelerator, its global data transfer network. These upgraded connections augmented the speed of the Box service for enterprise users by 60 percent.

Similarly, gamification and behavior management platform Badgeville leveraged direct connections to reduce latency between its database and the public cloud. Partially as a result, overall response time from the Badgeville application APIs improved 15 percent, so customers experience better overall performance. Badgeville has also cut approximately 40 percent of its monthly operational costs for cloud services.

Conclusion

The Internet is a network of networks. From the beginning of the Internet, there have been carrier-neutral interconnection points where all the major providers positioned their data centers. This has never been more important than today, when applications depend on using, moving, and pivoting in response to crunching big data.

What is less well-known is that you can enjoy the same benefits as these major players. By choosing to partner with Equinix, you tap into this network of networks; upwards of 80% of all connections over the Internet traverse Equinix data centers. By locating your big data where all the major players are, Platform Equinix offers:

• Direct connections to the public cloud
• Interconnects to all the major social networks
• High-speed global access
• Connections to market-leading and up-and-coming big data partners

That last point deserves special attention. The face of big data is constantly changing as new products and services come on the market. What you use for big data today and what you’ll use a year from now are unlikely to be the same. By positioning yourself at Equinix, you can take advantage of the latest technology for big data with minimal business impact. The latest technology is just a cross-connect away.

Whether you’re building a big data cluster or seeking help to do so, Equinix is strategically positioned to help you move faster than the competition and make the most of your big data initiative.

This paper was created by CITO Research and sponsored by Equinix.

CITO Research

CITO Research is a source of news, analysis, research, and knowledge for CIOs, CTOs, and other IT and business professionals. CITO Research engages in a dialogue with its audience to capture technology trends that are harvested, analyzed, and communicated in a sophisticated way to help practitioners solve difficult business problems.

Visit us at http://www.citoresearch.com

About Equinix

Equinix, Inc.
One Lagoon Drive, 4th Floor
Redwood City, CA 94065
Main: +1.650.598.6000
Fax: +1.650.598.6900
Email: [email protected]
