
COULD VS. SHOULD: BALANCING BIG DATA AND ANALYTICS TECHNOLOGY WITH PRACTICAL OUTCOMES


Academic year: 2022


The business world is abuzz with the potential of data. In fact, most businesses have so much data that it is difficult for them to process and analyze it using traditional applications. There's a term often used in today's business lexicon to describe this issue: "big data."

When it comes to extracting value from the vast array of available data, there is almost no limit to the technology and real-time analytics capabilities that you could develop. The key is finding the solution you should use: one that provides the right balance between sophisticated technology and cost-effective business results.


WEALTH OF DATA OR WEALTH OF INSIGHT?

As your business becomes more and more “digital,” everything you do leaves a trail of data—high-volume, highly variable data about customers, transactions, operations, finances, competitors, markets, and more.

Even social media such as web blogs and Twitter generate valuable data.

There’s no question that data can hold the key to better performance and competitive advantage. The real question is how best to gain insight by examining every available data source.

In today’s environment of complex operations and abundant data, that type of insight is possible using advanced analytics. Just about every company recognizes the value of analyzing its “big data,” but many still struggle to develop the right analytics capabilities to support their business. In fact, it is very easy to over-engineer (or, conversely, under-develop) an analytics solution.

JUST WHAT DO WE MEAN BY "BIG DATA"?

In its simplest form, the term "big data" generally describes both the data itself (its volume, velocity, and variety) and the process of data-driven decision-making through analytic insights, predictive modeling, and optimization.

Big data solutions require the ability to integrate and manage data, as well as to analyze it in a timely and effective way. Commonly, these solutions combine low-cost storage and open-source tools for refining and integrating the data with high-performance analytics, modeling, and visualization platforms.

BIG DATA TOOLS AND PLATFORMS

Big data solutions generally fall into two major categories: tools and platforms. The tools provide the method of analysis while the platforms execute the analysis. Both of these are generally divided between open-source and proprietary implementations.

Open Source Platforms: Open-source high performance data management and parallel processing tools have shown the greatest growth recently. Apache Hadoop, a distributed computing platform, has popular releases made by Hortonworks and Cloudera. Other platforms and tools such as MongoDB, HBase, and R are rapidly gaining prominence. Hadoop and R in particular are experiencing dramatic uptake with R finding its way into statistical and actuarial circles.

Proprietary Platforms: Proprietary platforms include Teradata and SAP HANA, as well as other players. Microsoft's HDInsight takes a slightly different route by being a 100-percent Apache Hadoop-compatible platform that runs on the Windows operating system. Microsoft's decision to embrace the Hadoop ecosystem and its support of open source development is a continuation of its involvement and active development in Linux.


Tools: Some of the greatest advances have been in tools and methodologies including genetic algorithms, natural language processing, and predictive modeling. Applying these machine learning approaches to big data enhances their efficacy and opens new possibilities. Hive is an example of a tool that brings big data analysis within easy reach of most organizations.

START BY ASKING THE RIGHT QUESTIONS

In general, we have found that asking and answering a few questions in four key areas can help you design the best, most cost-effective “big data” analytics solution for your specific business and operations:

1. Context—what business problem(s) are we trying to solve?

2. Action—which specific actions or decisions do we need an analytics solution to support?

3. Use—how, when, and where will we use analytics?

4. Data—is our data accurate and sufficient to support decision making? Is it available continuously, or do we have to acquire it on a one-off basis (often, with great effort)?

By focusing on these four areas, you will be able to design a big data “ecosystem” that creates timely insights and drives better decision making.

CONTEXT

First, it is important to view a potential analytics solution in terms of the business value it produces, not the technology it utilizes. In other words, your focus should be on “big answers” rather than big data.

According to a December 2012 survey of business executives from fifty Fortune 1000 firms by NewVantage Partners, nearly a quarter of respondents desired a big data solution to improve customer experience. By integrating and analyzing a wider variety of customer data, these organizations want to better understand customers’ desires and intentions and be better equipped to serve or target them for additional products and services.

While better customer experience should drive return on investment (ROI) through improved customer retention and growth, leading companies are looking beyond customer and market-facing opportunities to also improve their operational and business processes. Manufacturers and retailers are using big data analytics to improve their supply chains and time to market, while insurance companies are pushing the limits of data to do everything from managing claims more efficiently to identifying fraud among their customers.

Understanding the potential value of your solution will help guide how aggressively you pursue data that may be costly to gather and difficult to integrate with other sources and yet only provide incremental insights.


ACTION

Analytics drive more effective decision making. If your goal is to improve retention of your most valuable customers, what decisions will you need to make and what actions will you need to take to do so?

An analytics solution could help you score, rank, or segment customers more precisely; develop differentiated customer service/support strategies for specific customer segments; or design timely customer intervention and retention strategies. Analytics also could help you analyze and improve operational processes that impact customer experience, such as redesigning call or order routing, creating more efficient inventory and fulfillment processes, or making other process changes that create financial advantages.

Your desired action or decision, in turn, will affect the timeliness and accuracy required. For example, if your focus is on improving customer experience through better marketing and service, you will need to integrate more data on a timelier basis to ensure an accurate view of your customer relationship. This may involve integrating customer transactional history, likely found in current systems and databases, with other behavioral data such as web click-streams, customer service calls and outcomes, and social activity such as likes, tweets/retweets, etc. You may also want to combine this with other, more static data such as demographic and financial data to have an even clearer understanding of each customer.
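As a rough illustration of the integration described above, the following Python sketch merges three sources into a single record per customer. The field names (`customer_id`, `amount`, `type`, `region`) and the sample records are hypothetical, invented for this example; a real solution would read from your own systems and would need proper identity matching and data cleansing.

```python
from collections import defaultdict

def build_customer_view(transactions, behavior_events, demographics):
    """Merge three hypothetical sources into one record per customer."""
    view = defaultdict(lambda: {
        "total_spend": 0.0,
        "orders": 0,
        "events": defaultdict(int),
        "profile": {},
    })
    # Transactional history: accumulate spend and order counts.
    for t in transactions:
        rec = view[t["customer_id"]]
        rec["total_spend"] += t["amount"]
        rec["orders"] += 1
    # Behavioral data: count events by type (clicks, service calls, social).
    for e in behavior_events:
        view[e["customer_id"]]["events"][e["type"]] += 1
    # Static data: attach a profile only to customers already seen above.
    for cid, profile in demographics.items():
        if cid in view:
            view[cid]["profile"] = profile
    # Convert nested defaultdicts to plain dicts for easier downstream use.
    return {cid: {**rec, "events": dict(rec["events"])} for cid, rec in view.items()}

# Made-up sample data, for illustration only.
transactions = [
    {"customer_id": "c1", "amount": 40.0},
    {"customer_id": "c1", "amount": 60.0},
    {"customer_id": "c2", "amount": 15.5},
]
behavior_events = [
    {"customer_id": "c1", "type": "web_click"},
    {"customer_id": "c1", "type": "service_call"},
    {"customer_id": "c1", "type": "web_click"},
    {"customer_id": "c3", "type": "retweet"},
]
demographics = {"c1": {"region": "Midwest"}, "c2": {"region": "South"}, "c9": {"region": "West"}}

view = build_customer_view(transactions, behavior_events, demographics)
```

Note that customer `c3` appears with behavioral data but no transactions or profile. Gaps like this are exactly where the integration effort tends to concentrate.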

You may be able to tolerate less accuracy in the resulting analytics and insights, however, when the incremental cost of being wrong is low. For example, the incremental cost or risk of an incorrect product recommendation prior to online retail checkout may be relatively low. Conversely, if you’re a bank and your big data solution is focused on making better, more timely risk decisions, or a life sciences company focused on improving patient outcomes, there are likely compliance and regulatory requirements that demand greater accuracy.

Determining what’s “good enough” to make a decision or take action will save your big data team from an endless quest for the “perfect answer.”
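One simple way to frame "good enough" is to compare the expected cost of wrong decisions with the cost of improving accuracy. The Python sketch below does this arithmetic. The function names and every number in it are illustrative assumptions, not figures from any survey or client, and the model ignores regulatory requirements, which may demand accuracy regardless of cost.

```python
def expected_error_cost(error_rate, decisions_per_year, cost_per_error):
    """Expected annual cost of wrong decisions under a simple linear model."""
    return error_rate * decisions_per_year * cost_per_error

def worth_improving(current_error, improved_error, decisions_per_year,
                    cost_per_error, improvement_cost):
    """True if the expected annual saving exceeds the cost of the improvement."""
    saving = (
        expected_error_cost(current_error, decisions_per_year, cost_per_error)
        - expected_error_cost(improved_error, decisions_per_year, cost_per_error)
    )
    return saving > improvement_cost

# Hypothetical online retailer: a wrong recommendation costs very little.
retail = worth_improving(
    current_error=0.30, improved_error=0.25,
    decisions_per_year=1_000_000, cost_per_error=0.05,
    improvement_cost=50_000,
)

# Hypothetical bank: a wrong risk decision is expensive.
bank = worth_improving(
    current_error=0.05, improved_error=0.03,
    decisions_per_year=200_000, cost_per_error=500,
    improvement_cost=500_000,
)
```

With these made-up inputs, the retailer's saving (about 2,500 per year) does not justify a 50,000 investment, while the bank's saving (about 2,000,000 per year) comfortably exceeds 500,000.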


Case study: A retail giant commits to big data

With multiple brands and more than 32 different business units, Sears Holdings, parent company of Sears, Kmart, Lands’ End, Craftsman Tools and others, had a big data problem. Over the years, the company had built up a patchwork of databases, hardware, software, and analytic and reporting solutions that served thousands of different users across the enterprise. This created a number of challenges in running a modern, multi-channel retail business:

• No single version of the truth: the same question produced different answers for different users

• Dozens of proprietary databases and appliances across business units

• Inflated IT capital expenses and turn-around times

• Inflexible data models

• Most importantly, lack of timely customer and business insights for decision-making

Under the leadership of new Chief Technology Officer Dr. Philip Shelley, the company embarked on a three-year journey to improve its data and analytics infrastructure to overcome the challenges above. By embracing big data technology, the firm was able to replace multiple databases and appliances with a large Hadoop cluster at a fraction of the cost it had been spending on IT infrastructure and support.

As Dr. Shelley stated in a recent article for Retail Information Systems News, “It's involved a huge mindset change relevant to keeping data. We used to only keep aggregates of data and throw away the detail, because it had been too big. Now we don't throw anything away, theoretically, ever.”

While there was a learning curve initially, this new mindset has enabled Sears to shave weeks off customer campaigns, which leads to more timely and relevant offers. Likewise, the company now has new, data-driven insights for planning and investment decisions. For example, because it now has access to granular-level data across its customers, stores, and supply chain, Sears can analyze the seasonality of individual items or SKUs using eight years of detailed historical data, an analysis that was impossible in the past because of limited data storage and availability. Similarly, the company’s new big data installation has reduced typical IT timelines to gather and integrate new data sources by up to 70 percent and has nearly eliminated the need for capital investments as a result of replacing proprietary storage with commodity servers.
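To make the idea of item-level seasonality concrete, here is a minimal Python sketch of one common approach: a seasonal index that divides each month's average sales by the item's overall monthly average. This is not a description of how Sears performs its analysis. The record layout `(sku, year, month, units)` and the sample figures are assumptions made for illustration, and the sketch assumes one record per SKU per month per year.

```python
from collections import defaultdict

def seasonal_index(sales):
    """Return {sku: {month: index}}, where an index above 1.0 marks a peak month.

    sales: iterable of (sku, year, month, units) tuples (hypothetical layout).
    """
    by_sku_month = defaultdict(lambda: defaultdict(list))
    for sku, _year, month, units in sales:
        by_sku_month[sku][month].append(units)

    result = {}
    for sku, months in by_sku_month.items():
        # Average units for each calendar month across all years of history.
        month_avg = {m: sum(v) / len(v) for m, v in months.items()}
        # Overall average across the months that have data.
        overall = sum(month_avg.values()) / len(month_avg)
        result[sku] = {
            m: (avg / overall if overall else 0.0) for m, avg in month_avg.items()
        }
    return result

# Made-up two-year history for a single item.
sales = [
    ("SKU-1", 2011, 6, 100),
    ("SKU-1", 2012, 6, 140),
    ("SKU-1", 2011, 12, 300),
    ("SKU-1", 2012, 12, 420),
]
index = seasonal_index(sales)
```

In this toy history, December sells at 1.5 times the item's average and June at 0.5 times. Longer detailed histories, such as the eight years mentioned above, make indices like these more stable.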


USE

Another key to realizing value from analytics tools and processes is carefully defining their use.

Consider where you will use analytics to support decision making—in your call center, for online transactions and interactions, or at the point of sale. Frequency is also important in this equation. For example, if you are using analytics to improve forecasting, batched results may suffice. But if you are using analytics to intervene in situations where a customer relationship is at risk—well, the sooner you can see it, the sooner you can act on it.

Although most executives generally think of big data as being “real time,” the action required should dictate whether you need real-time, near-real-time, or less-frequent batch updates and integration to drive decision making. Timeliness is critical, as the cost of instrumenting big data increases greatly as you move from batch to real-time insights. While the goal should always be to shorten the time from insight to action, make sure your intended use of the insights, not the timeliness or velocity of the underlying data, drives your big data solution.

DATA

Finally, understanding the nature of your data will help define the most appropriate data management and analytics techniques for your needs. One useful way to look at your data is in terms of 3 “Vs”—volume, variety, and velocity.

Popular business media tend to focus on the volume of data being collected. However, the velocity and variety of the data being collected are much more important. Simply put, the cost to store larger and larger amounts of data is incremental compared to the cost of cleansing, integrating, and analyzing that data. In fact, in the same NewVantage Partners survey referenced above, only 10 percent of respondents cited data volume in their definition of big data, while 40 percent said data variety is the critical factor that defines big data.

Likewise, big data solutions often focus on unstructured data, such as social data, video, and data stored in electronic documents and messages. While this type of unstructured data can be helpful, the promise of big data lies in the ability to “mash up” data from many different domains and perspectives (temporal, transactional, operational, spatial) to create unique and action-ready business insights. These types of data are often already structured or semi-structured, so the real challenge lies in the ability to cleanse and integrate these sources with each other.

Another key consideration is viewing raw data as “temporary”: data that can be discarded once the key insights are derived. Analytics software giant SAS calls this a “stream it, score it, store it” approach to big data management. In other words, if you are applying an algorithm to a customer’s online shopping activity to make product recommendations, once you derive the best product to recommend, the underlying browsing activity can be discarded.
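The pattern can be sketched in a few lines of Python. This is an illustration of the general idea only, not SAS software or its API. The session layout, the function names, and the scoring rule (recommend the most-viewed category) are all hypothetical.

```python
from collections import Counter

def recommend_category(session):
    """Hypothetical scoring rule: recommend the most-viewed category."""
    return Counter(session["viewed"]).most_common(1)[0][0]

def stream_score_store(sessions, score_fn, store):
    """Consume sessions one at a time and keep only the derived insight."""
    for session in sessions:                     # stream it
        score = score_fn(session)                # score it
        store[session["customer_id"]] = score    # store it (the insight only)
        # The raw browsing activity is not retained after this point.

def session_feed():
    """Stand-in for a live feed of browsing sessions (made-up data)."""
    yield {"customer_id": "c1", "viewed": ["tools", "apparel", "tools"]}
    yield {"customer_id": "c2", "viewed": ["apparel"]}

recommendations = {}
stream_score_store(session_feed(), recommend_category, recommendations)
```

Because the feed is a generator, each session exists in memory only while it is being scored. What remains afterward is one recommendation per customer, which is far smaller than the browsing history that produced it.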

Finally, keep in mind that while big data solutions utilize new tools and techniques for collecting, integrating, and analyzing data, a successful solution requires an understanding of and commitment to many traditional data management disciplines:

• Data quality and meta-data management

• Data governance

• Data visualization and business intelligence

WHAT COULD YOU DO, OR WHAT SHOULD YOU DO?

The possibilities that come from using big data are virtually endless. So is the possibility that you can invest too much time and expense in creating analytics capabilities and analyses that are nice but not essential to running your business. The key is finding the right balance between the technology-driven possibilities and practical solutions that can help your business meet its goals.

Answering a few key questions up front can help you sort through the myriad possibilities and refocus your efforts from what you could do to what you should do and, in the process, increase the potential for return on your analytics investment. Addressing these questions will also ensure that all of your stakeholders support the solution and its potential benefits. Big data forces us to think differently about data-driven decision making, and alignment is crucial to the success of your project. If IT, marketing, finance, operations, and others are all on the same page with respect to the context, action, use, and data requirements of your big data project, you’ll be well on your way to a successful outcome.

Case study: Big data isn't all open source, but it's not proprietary either

Klout is a service that measures influence across social networks. It is a dynamic and rapidly expanding organization that continues to improve its product offerings and grow its user base. Klout processes feeds from social networks to measure users’ influence levels across social media.

This scoring involves complex calculations but also very large amounts of raw data. Like most startups, Klout went to market with an open source software stack that is the core of its operating environment.

Apache Hadoop is the center of this open source ecosystem. As Klout grew, it needed to be able to provide ad-hoc query capabilities and advanced analytics for business-level reporting.

To bridge the gap between its big data platform and the business-level visibility desired by its end users, Klout turned to SQL Server Analysis Services to provide familiar tools and experiences such as Excel. This allowed Klout to preserve investments and experience in both existing IT technologies and business skill sets.

Analysis Services acts as a conduit from user-facing tools and queries into Hadoop using ODBC, Hive, and linked servers. This provides an integrated analysis capability reaching from the user to the big data platform. In addition to reporting and analytics, Analysis Services also provides alerts and QoS capabilities.
