Handbook
1 EDITOR’S NOTE
2 IN-MEMORY ANALYTICS TOOLS PACK BIG DATA PUNCH
3 IN-MEMORY, BIG DATA COMBO NEEDS SOLID IT FOUNDATION
4 FASTER ANALYTICS SPEED NOT ENOUGH ON IN-MEMORY APPS
In-Memory Analytics and
Big Data: A Potent Mix?
In-memory analytics software can help companies make sense of
their growing vaults of data, without bogging business users down.
But there are issues and challenges to be aware of.
BY BETH STACKPOLE
1
EDITOR’S NOTE
In-Memory Technology
Takes On Big Data Challenge
Gartner says in-memory computing is speeding its way toward mainstream adoption; the consulting company predicts that at least 35% of large and midsize organizations will adopt the technology by 2015, up from less than 10% in 2012. Meanwhile, vendors of self-service business intelligence tools that support in-memory analytics (QlikTech, Tableau, Tibco) have taken up residence in the “leaders” box in Gartner’s Magic Quadrant for BI platforms.
Enter another technology trend that’s attracting, oh, a fair amount of attention: big data. Put in-memory tools and big data together, and what do you have? A combination that might help speed up the process of analyzing all the structured and unstructured data that organizations are collecting and hoping to capitalize on.
This guide offers insight on getting the in-memory and big data mix right. First we explore the possibilities of in-memory big data analytics and the issues that need to be considered. Next we look at the IT requirements to keep in mind. We end with an interview on managing in-memory and big data projects.
Craig Stedman
2
PLANNING
In-Memory Analytics
Tools Pack Big Data Punch
In the ongoing effort by companies to tease tangible business value out of
agglomerations of big data, in-memory analytics tools offer a possible path to unlocking insights that can spark operational improvements and point the way to new revenue opportunities.
Unlike conventional business intelligence (BI) software that runs queries against data stored on server hard drives, in-memory technology queries information loaded into RAM, which can significantly accelerate analytical performance by reducing or even eliminating disk I/O bottlenecks. Consultants and experienced users say the resulting speed boost is particularly compelling for big data analytics applications involving complex what-if scenarios and large amounts of information from a variety of data sources.
“The biggest benefit to in-memory analytics is speed of analysis and exploration,” said Cindi Howson, founder of BI Scorecard, a research and consulting company in Sparta, N.J., that publishes technical evaluations of BI and analytics tools. The data latency that often bogs down traditional BI querying “interrupts the whole thought process” for business users, Howson said. She cited analytical flexibility as another big benefit of the in-memory approach: “With in-memory tools, users can ask business questions they could never ask before because the technology was too slow.”
That’s the case at Cheezburger Inc. The Seattle-based operator of humor websites that attract a total of 500 million page views a month is getting good results from an in-memory big data analytics initiative, according to Loren Bast, who was Cheezburger’s director of BI until he left the company in April 2013.
DEEP DIVE INTO TOO MUCH DATA
Initially, Cheezburger stumbled in trying to track and analyze data about its online traffic in an effort to discern user behavior patterns. “We jumped into the deep end of the pool with big data, and we were certainly doing it big, just not doing it right,” Bast said, explaining that only 10% of the data being captured by the company ended up being relevant to the analytics program and clean enough to be trustworthy.
The BI team regrouped, turning to the QlikView in-memory analytics software for use against specific data sets stored in Hadoop and other repositories. Bast said the in-memory tools have given Cheezburger’s business users far more flexibility for creating queries on the fly and joining together information from disparate data sources to get answers to their business questions.
“Without in-memory, it was really tedious to build reports, especially dynamic, very customized reports,” he said. “Now we can solve reporting needs much faster than we used to.” That enables users “to get away from the drudge work” and spend more time acting on the traffic data than analyzing it, Bast added.
In-memory analytics tools might make it easier for organizations to capitalize on volumes of big data, but that doesn’t mean the combination comes without challenges. The relatively high cost of RAM compared with disk storage has been a barrier to adoption, as have scalability issues related to the memory constraints of servers, Howson said. Those concerns have been somewhat alleviated by falling memory prices and the growing availability of 64-bit systems supporting significantly expanded memory capacities, but they linger.
GOOD GOVERNANCE NEEDED
In addition, data governance is an issue that organizations will need to address as more and more business users get access to in-memory applications, said Tapan Patel, global product marketing manager for predictive analytics and data mining at software vendor SAS Institute Inc. “You have to avoid a scenario where multiple data silos appear,” he said. “Closer integration of in-memory analytics tools with the traditional data layer is going to be critical to avoiding data replication.”
Seamless connectivity with Hadoop, the open source technology that has become nearly synonymous with big data because of its ability to cost-effectively store massive amounts of structured and unstructured data, is one of the critical integration points for enabling analysis of big data in memory. “In-memory analytics and Hadoop are very complementary technologies, and in most cases they will both have a place [in big data environments],” said John Appleby, head of consulting on deployments of SAP AG’s HANA in-memory computing appliance at Bluefin Solutions, a London-based consultancy and systems integrator.
But the linkages between Hadoop systems and in-memory tools are still relatively immature, according to Appleby. He said that Hadoop’s flexibility for handling unstructured data in a schema-less fashion lies in direct contrast to in-memory software’s need to have some level of structure for analyzing data.
“The types of [data] models created in the two worlds don’t look the same,” Appleby said. “You have two different foundations in which you need a single view, and no one really has the answer yet. This is a problem organizations are only just starting to deal with.”
That isn’t stopping Cheezburger. Bast said the company is using QlikView in conjunction with Hadoop to determine what data to look at and then to run analytics against the information to help improve content planning and to detect anomalies that might point to technical or promotional problems. One example is a piece of content that has a large number of comments but doesn’t get a lot of traffic. The result, he added, is less waiting around for queries to run their course: “It’s made our decisions much faster.”
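The kind of anomaly Bast describes, content with many comments but little traffic, can be illustrated with a small in-memory check. The following is only a sketch and not Cheezburger’s method: the record fields, the thresholds and the sample figures are all made up for illustration, and a real system would tune them against its own data.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class ContentStats:
    """Hypothetical per-item engagement record (field names are assumptions)."""
    title: str
    page_views: int
    comments: int


def flag_low_traffic_high_comment(items, ratio_factor=5.0, min_comments=50):
    """Return items whose comments-per-view rate is far above the median.

    An item is flagged when it has at least `min_comments` comments and its
    comments-per-view rate exceeds `ratio_factor` times the median rate.
    Both thresholds are illustrative and would need tuning on real data.
    """
    rates = [i.comments / i.page_views for i in items if i.page_views > 0]
    if not rates:
        return []
    baseline = median(rates)
    flagged = []
    for item in items:
        # Skip items with no traffic data or too few comments to matter.
        if item.page_views <= 0 or item.comments < min_comments:
            continue
        if item.comments / item.page_views > ratio_factor * baseline:
            flagged.append(item)
    return flagged


# Made-up sample data, for illustration only.
sample = [
    ContentStats("post-a", page_views=100_000, comments=200),
    ContentStats("post-b", page_views=80_000, comments=150),
    ContentStats("post-c", page_views=2_000, comments=180),
    ContentStats("post-d", page_views=120_000, comments=260),
]

flagged_titles = [item.title for item in flag_low_traffic_high_comment(sample)]
print(flagged_titles)
```

Comparing each item with the median rate, rather than a fixed cutoff, keeps the check relative to whatever is normal for the site; the whole data set is held in memory, so the thresholds can be changed and the check rerun on the fly.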
3
ARCHITECTURE
In-Memory, Big Data Combo
Needs Solid IT Foundation
Doing in-memory analytics on pools of big data isn’t just a load-and-go process. The pairing of in-memory tools and big data raises a raft of IT architecture issues for organizations to sort through before getting started, including system design, scalability and still-evolving data integration requirements.
Mapping out the proper hardware infrastructure is one of the first considerations. To support in-memory analytics tools, companies must invest in robust, memory-intensive servers while also deciding on the best approach for scaling the systems as analytics needs and big data volumes expand, said John Myers, a business intelligence (BI) and data warehousing analyst at research and consulting company Enterprise Management Associates Inc. in Boulder, Colo.
“One of the biggest architectural decisions when it comes to hardware is whether to scale up into a single piece of big data iron or scale out across multiple machines,” he said. Deploying one large server “means less care and feeding” on the part of systems administrators, according to Myers. But his usual recommendation is a scale-out approach, such as a cluster of commodity servers. “If you scale up into one big box and that box fails, you’re done,” he said. “By scaling out, your points of failure are distributed across multiple nodes.”
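Myers’ point about distributed points of failure can be made concrete with some simple arithmetic. The sketch below is a simplified model, not a sizing method: it assumes that servers fail independently, that the surviving nodes in a cluster can keep serving (which depends on how the software replicates data), and it uses an invented 1% chance that any given server is down. The function names are hypothetical.

```python
def total_outage_probability(node_failure_prob: float, nodes: int) -> float:
    """Probability that every node is down at once, assuming independent failures."""
    if not 0.0 <= node_failure_prob <= 1.0:
        raise ValueError("node_failure_prob must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return node_failure_prob ** nodes


def expected_capacity_fraction(node_failure_prob: float) -> float:
    """Expected share of total capacity still available (same for any node count)."""
    return 1.0 - node_failure_prob


# Illustrative figure only: a 1% chance that any given server is down.
p = 0.01
scale_up = total_outage_probability(p, nodes=1)   # one big box
scale_out = total_outage_probability(p, nodes=4)  # four commodity servers
print(f"scale-up total outage:  {scale_up:.2e}")
print(f"scale-out total outage: {scale_out:.2e}")
print(f"expected capacity either way: {expected_capacity_fraction(p):.2%}")
```

Under these assumptions the expected available capacity is the same in both designs; what changes is the chance of losing everything at once, which is the risk Myers describes for the single large server.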
Locating high-capacity servers stocked with the maximum amount of memory as close as possible to where business users are working can also aid in reducing latency on in-memory analytics applications, said Jeff Boehm, vice president of global marketing at BI and analytics software vendor Qlik Technologies Inc. in Radnor, Pa.
PERSISTENCE PAYS OFF?
Another factor to consider, Boehm said, is whether there’s a need for the in-memory technology to support data persistence in order to prevent information from being lost if a system crashes or an analytics process is interrupted. He added that organizations should also be aware of the data-size limits of different in-memory analytics tools when evaluating the available software options.
In-memory system scalability is a long-term concern for online humor network Cheezburger Inc. The company has paired the QlikView software with a Hadoop-based big data environment to glean real-time insights into the online activities of website visitors, which enables it to more effectively tailor content for individual users, said Loren Bast, who was director of BI at Cheezburger until April 2013. But as the BI and analytics team looks to expand the in-memory big data analytics system to a growing number of Cheezburger workers, Bast acknowledged that there are worries about the cost of in-memory processing.
“While [memory] is not expensive anymore, it’s still not free,” he said. “We have a few high-memory servers running this, but they’re hitting their limitations in terms of how much data we can load in. Also, now we have dozens of users using the system and have no issue keeping the hardware and software costs in line with value. But what happens when we extend out the reporting to more users?” Bast added that both system infrastructure and software licensing costs can “become quite high real quickly when supporting thousands of [business users].”
STAY FLEXIBLE FOR THE FUTURE
Flexibility and data integration are other issues that should be on the radar screen of BI and analytics managers looking to support analysis of big data in memory. In-memory analytics and big data technologies alike are still relatively new and continue to evolve. For example, a system architecture might eventually need to encompass more than Hadoop data stores, even though that’s the technology most closely associated with big data.
“It’s important that people invest in tools and architectures that are very flexible,” Boehm said. “Requirements will change; user needs will change. And if you lock yourself into one architecture that requires only one mode of working, you’re locking out options that may be required as other opportunities come up down the road.”
ContactLab, an email marketing services provider in Milan, Italy, has already recognized that fact. The company, which has offices in five European countries, is loading email campaign and related Web activity data from a Hadoop system into SAS Institute Inc.’s Visual Analytics software for in-memory analysis. But long term, it expects to integrate forms of relational data, such as transactional data about the use of customer loyalty cards, into the in-memory analytics mix, said Massimo Fubini, ContactLab’s founder and managing director.
“I’m not thinking that Hadoop is going to be the only solution in the future,” Fubini said. “Our relational data is still really important, and I want the ability to mix together the two environments. The future is all about doing data analytics; it’s not about the software. So the real challenge is to [create an environment] where you can change your mind.”
4
PROJECT MANAGEMENT
Faster Analytics Speed
Not Enough on In-Memory Apps
Michael Minelli is vice president of sales and global alliances for the information services division of MasterCard Advisors, the professional services and data analytics arm of MasterCard International. He’s also one of the three co-authors of Big Data, Big Analytics: Emerging Business Intelligence and Analytic Trends for Today’s Businesses, a book that aims to explain the big data phenomenon to both IT and business readers.
Minelli, who worked at software vendors Revolution Analytics and SAS Institute Inc. before joining MasterCard, has been involved in big data projects at organizations such as Time Inc., Cablevision, Foxwoods, Major League Baseball, Standard & Poor’s and Sony. In an interview with SearchBusinessAnalytics.com, Minelli discussed big data analytics applications and the role that in-memory analytics technology can play in them. One of his pieces of advice: The faster performance supported by in-memory processing won’t provide the hoped-for business benefits unless the analytical results are fed into real decision-making processes. Excerpts from the interview follow:
What’s the key message in your book?
Michael Minelli: The book’s main theme is that big data analytics are a game changer for the industry, whether you’re in IT or the business. Big data is going to make a big impact, and this is going to continue over time. The message is to think about how to do things differently: “If we can do things faster, then what does that mean for the business?” Big data allows us to innovate and make decisions quickly while transforming the way we do business intelligence.
People have been talking about building one version of the truth for a while; that’s the whole genesis of the data warehouse movement. The name of the game was who could build the most valuable repository to make better decisions. What’s changed is that it’s not all about what happens in your world, but what happens in other companies and even in other industries. It’s about going from having an insular mind-set around data to having an abundance mentality for leveraging data.
Can you give a couple of real-world examples of the opportunities for taking advantage of big data analytics?
Online up-selling and cross-selling on the fly before a customer’s attention fades. Mining blogs and customer service notes to perform customer sentiment analysis, good and bad. Providing secure e-commerce transactions with built-in anti-fraud controls. Deploying marketing automation that delivers real ROI by enabling actionable insights into customer buying habits. That’s how competitive organizations are using big data analytics today.
What specific role can in-memory analytics technology play in turning big data into a competitive advantage?
It’s all about pure speed: taking advantage of the hardware and RAM capabilities that have become cheaper to create queries on the fly. Doing so removes some of the barriers between IT and the business so that there’s more agility for people to do on-the-fly business intelligence and predictive analytics and to move beyond traditional sampling techniques if they don’t have to wait 24 to 48 hours for results.
When is in-memory processing for big data analytics not the right fit or more trouble than it’s worth?
It’s the notion of fit to purpose. The main thrust is, do you really need the additional speed and do you have the types of users and processes in place so that if empowered with this information, they could really do something with it to impact the business? Everyone talks about faster access to data leading to better insights, but the other critical part is connecting that insight to an actual operation. So it’s not just viewing a report, but actually using that report to make a decision on the fly to trigger an event like addressing a fraudulent transaction or initiating a cross-sell opportunity. Having faster analytics is great, but [the results] have to be able to make their way into the decision process.
So how can organizations get to the level where they truly can take advantage of in-memory analytics in combination with big data?
For starters, IT should work with the analysts to assess the low-hanging fruit where there are some productivity gains [to be had]. A good example is reducing a data mining process from 24 hours to a matter of minutes so that a data scientist can be more responsive to the rapid changes in today’s businesses. The next step would be for the business to develop some use cases where speed can make a difference and then give it a try. The technology part isn’t a no-brainer, but it’s not quantum physics, either. From my perspective, the major challenge is finding and hiring the right talent and then managing that talent to achieve specific business objectives in a reasonable time frame.
ABOUT THE AUTHOR
BETH STACKPOLE is a freelance writer who has been covering the intersection of technology and business for more than 25 years for a variety of publications and websites, including SearchBusinessAnalytics.com, SearchDataManagement.com and other TechTarget sites. Email her at bstack@stackpolepartners.com.
In-Memory Analytics and Big Data: A Potent Mix? is a SearchBusinessAnalytics.com e-publication.
Scot Petersen | Editorial Director
Jason Sparapani | Managing Editor, E-Publications
Craig Stedman | Executive Editor
Melanie Luna | Managing Editor
Linda Koury | Director of Online Design
Neva Maniscalco | Graphic Designer
Doug Olender | Publisher
dolender@techtarget.com
Ed Laplante | Director of Sales
elaplante@techtarget.com
TechTarget
275 Grove Street, Newton, MA 02466
www.techtarget.com
© 2013 TechTarget Inc. No part of this publication may be transmitted or reproduced in any form or by any means without written permission from the publisher. TechTarget reprints are available through The YGS Group.
About TechTarget: TechTarget publishes media for information technology professionals. More than 100 focused websites enable quick access to a deep store of news, advice and analysis about the technologies, products and processes crucial to your job. Our live and virtual events give you direct access to independent expert commentary and advice. At IT Knowledge Exchange, our social community, you can get advice and share solutions with peers and experts.