7 Things to Consider Before Implementing
a "Big Data" Solution
Power Your Business with Big Data Solutions
Comrise │ Make IT Happen
1301 State Route 36, Suite 9-10, Hazlet, NJ 07730 • 1-(732)-739-2330 • www.comrise.com
Introduction
In the coming years, big data will impact almost every area of life. It will be the lifeblood of private businesses, non-profit organizations and public services. It will help hospitals provide better service, financial companies make wiser decisions and businesses gain a competitive advantage. Most companies will need to implement a big data solution in order to take full advantage of the wealth of information that is available.
Outside the IT sector, however, few businesses are positioned to select a big data solution. The term “big data” is still new to many who do not work in this industry, and choosing a specific system is daunting.
The following seven questions are designed to help businesses, governments and other organizations wade through the big data solutions that are available today. The questions are loosely based on Gartner Inc.’s definition of big data:
“Big data” is high-volume, -velocity and -variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making.
Volume, Velocity and Variety
Is the solution capable of handling a high volume?
Any big data solution must be just that: big. Businesses and organizations have exponentially more information today than they did just five, let alone 10 or 20, years ago. One study by McKinsey found that in every sector, there are companies that have at least 100 terabytes of data, and many have more than 1 petabyte. (1 terabyte is equivalent to 1,024 gigabytes, and 1 petabyte is equivalent to 1,024 terabytes.) By 2016, the healthcare industry alone is expected to have 15 zettabytes of data – that is about 15 million iPads’ worth of information. A big data solution must be capable of managing all this information.
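The unit relationships above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python, assuming the 1,024-based convention the text uses; the `UNITS` list and the `to_bytes` and `convert` helpers are illustrative names, not part of any product.

```python
# Storage units in ascending order, each 1,024 times the previous one
# (the binary convention used in the text above).
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB"]


def to_bytes(value, unit):
    """Convert a quantity in the given unit to bytes."""
    return value * 1024 ** UNITS.index(unit)


def convert(value, from_unit, to_unit):
    """Convert a quantity from one storage unit to another."""
    return to_bytes(value, from_unit) / to_bytes(1, to_unit)


print(convert(1, "TB", "GB"))    # 1 terabyte expressed in gigabytes
print(convert(1, "PB", "TB"))    # 1 petabyte expressed in terabytes
print(convert(100, "TB", "PB"))  # 100 terabytes as a fraction of a petabyte
print(convert(15, "ZB", "PB"))   # 15 zettabytes expressed in petabytes
```

Under this convention, 15 zettabytes works out to roughly 15.7 million petabytes, which gives a sense of the gap between a single large company's data and an entire industry's.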
Is the solution capable of handling a high velocity?
Big data is dynamic, as opposed to static. It does not sit in archives for years until someone pulls out a dusty file. Rather, it is always being accessed, manipulated and compiled. In tech lingo, the data has a high rate of input and a large quantity of events. A big data solution should not need hours – or even minutes – to compile the data before delivering information. It should deliver information in near real time. In some businesses, the difference between live information and a data stream that is delayed by two minutes is significant:
News sites need to stream videos instantly
Financial businesses need live market analysis
Webmasters want live analytics
Municipalities are already using live traffic data to monitor roadways
Is the solution capable of handling a high variety?
Data is more than just lists of numbers. Big data includes structured, semi-structured and unstructured data, and a solution must be able to analyze all these types of
data. Customer relationship management (CRM) files and XML spreadsheets must be
merged with emails, phone records and social media messages. Even though some of these types of data, namely the unstructured data, are difficult for a computer program to analyze, the solution should at least let users organize the unstructured data in ways that make it easier to use.
In order to organize all these types of data in meaningful ways, a big-data software
application must:
Handle a broad range of file formats
Include real-time, analytic and search data
Adapt to new types of files and ways of organizing data
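As an illustration of the requirements above, the sketch below merges structured (CSV), semi-structured (JSON and XML) and unstructured (email text) data into one searchable list of records. It uses only the Python standard library. The field names (`customer`, `note`, `message`), the `entry` element and the sample data are all hypothetical, chosen for the example rather than taken from any real CRM or social media export.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET


def from_csv(text):
    """Structured data: every row has the same named columns."""
    return [{"source": "csv", "customer": row["customer"], "text": row["note"]}
            for row in csv.DictReader(io.StringIO(text))]


def from_json(text):
    """Semi-structured data: fields may be missing, so fall back to defaults."""
    return [{"source": "json",
             "customer": item.get("customer", "unknown"),
             "text": item.get("message", "")}
            for item in json.loads(text)]


def from_xml(text):
    """Semi-structured data: read each <entry> element and its attribute."""
    root = ET.fromstring(text)
    return [{"source": "xml",
             "customer": el.get("customer", "unknown"),
             "text": (el.text or "").strip()}
            for el in root.iter("entry")]


def from_email(text, customer="unknown"):
    """Unstructured data: keep the free text whole, tagged with what is known."""
    return [{"source": "email", "customer": customer, "text": text.strip()}]


def search(records, keyword):
    """Case-insensitive keyword search across every record, whatever its origin."""
    return [r for r in records if keyword.lower() in r["text"].lower()]


# Hypothetical sample inputs, one per format.
crm_csv = "customer,note\nAcme,Renewed contract\nGlobex,Requested demo\n"
social_json = '[{"customer": "Acme", "message": "Great support!"}, {"message": "Is there a trial?"}]'
xml_export = '<entries><entry customer="Globex">Invoice sent</entry></entries>'
email_body = "Hi team,\nCan we schedule a call next week?\n"

records = (from_csv(crm_csv) + from_json(social_json)
           + from_xml(xml_export) + from_email(email_body, customer="Acme"))

for record in search(records, "demo"):
    print(record)
```

Supporting a new file type in this sketch means adding one more small reader function that returns the same record shape, which is one simple way to stay adaptable as formats change.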
Complexity
Will the solution be able to handle complex environments?
Gartner’s definition of big data does not explicitly mention the level of complexity involved with managing big data, but this is a major consideration for businesses that have growing amounts of data. Because it includes so much information, big data is too large for a single server. It often is too large for a single data center. The
information must be spread over multiple data centers, which may be physically located in different areas.
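One common way to spread records across several locations is to hash each record's key and use the result to pick a destination. The sketch below is a simplified illustration of that idea, not a description of how any particular product distributes data; the data center names and the `assign_data_center` function are hypothetical.

```python
import hashlib

# Hypothetical site names, used only for this example.
DATA_CENTERS = ["dc-east", "dc-west", "dc-central"]


def assign_data_center(record_key, centers=DATA_CENTERS):
    """Map a record key to one data center using a stable hash.

    The same key always maps to the same center, so a record can be
    found again without a central lookup table.
    """
    digest = hashlib.sha256(record_key.encode("utf-8")).hexdigest()
    return centers[int(digest, 16) % len(centers)]


# Distribute 1,000 hypothetical customer keys and count them per site.
placement = {}
for key in (f"customer-{i}" for i in range(1000)):
    placement.setdefault(assign_data_center(key), []).append(key)

for center, keys in sorted(placement.items()):
    print(center, len(keys))
```

A limitation of this simple modulo approach is that adding or removing a data center changes where most keys land. Techniques such as consistent hashing are often used to reduce that reshuffling.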
There are three ways big data can be distributed among multiple data centers and transferred among these centers: an on-premises, cloud or hybrid solution can be developed. Each has advantages and disadvantages over the others. While a full description of each solution is too long to include here, a few of the main features can be mentioned. On-premises solutions, in some cases, afford greater security, and they may be the only option for organizations that have highly confidential data. (For instance, the Pentagon has a network not connected to the internet.) Cloud solutions, on the other hand, tend to be less expensive to install and maintain. Hybrid solutions attempt to take advantage of both on-site and cloud features.
When considering the complexity of the environments data will be stored in, hardware is often the primary concern. The software used, though, should also be considered. There are good software applications for on-site, cloud and hybrid big data solutions, but the best software program for one solution is not necessarily the best for the others. Businesses and organizations should look for an application that was designed for the type of solution they are implementing.
Return on Investment (ROI)
Is the solution affordable?
As Gartner’s definition mentions, any big data solution must be “cost-effective.” In part, this means the solution must be affordable. (This is the first half of determining the ROI.)
When comparing the total price of different solutions, businesses and organizations must look beyond the initial price tag. In addition to the price of the solution, the following factors will influence its total cost:
Add-on features that cost more
Ongoing subscription fees
Promotional prices that will expire
The hardware necessary to support the solution
The IT staff required to maintain the solution
The time it will take to implement the solution
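The factors listed above can be combined into a rough total-cost estimate. The sketch below is one possible way to do so, under simplifying assumptions: costs do not change from year to year except when a promotional price expires, and the cost of implementation time is modeled as a flat monthly figure. The `SolutionCosts` fields, the `total_cost` function and every number in the example are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SolutionCosts:
    """Cost inputs for one candidate solution (all figures hypothetical)."""
    initial_price: float         # up-front price tag
    add_ons: float               # one-time cost of add-on features
    promo_subscription: float    # yearly fee while the promotional price lasts
    regular_subscription: float  # yearly fee after the promotion expires
    promo_years: int             # how many years the promotional price lasts
    hardware: float              # one-time cost of supporting hardware
    it_staff_per_year: float     # yearly cost of IT staff to maintain it
    implementation_months: int   # time needed to implement the solution
    monthly_delay_cost: float    # estimated cost of each implementation month


def total_cost(costs, years):
    """Estimate the total cost of a solution over a number of years."""
    promo = min(years, costs.promo_years)
    subscription = (promo * costs.promo_subscription
                    + (years - promo) * costs.regular_subscription)
    one_time = costs.initial_price + costs.add_ons + costs.hardware
    staffing = years * costs.it_staff_per_year
    implementation = costs.implementation_months * costs.monthly_delay_cost
    return one_time + subscription + staffing + implementation


# Two made-up candidates; the figures are for illustration only.
cloud = SolutionCosts(initial_price=0, add_ons=5_000,
                      promo_subscription=20_000, regular_subscription=30_000,
                      promo_years=1, hardware=0, it_staff_per_year=40_000,
                      implementation_months=2, monthly_delay_cost=10_000)
on_site = SolutionCosts(initial_price=100_000, add_ons=10_000,
                        promo_subscription=0, regular_subscription=0,
                        promo_years=0, hardware=80_000, it_staff_per_year=60_000,
                        implementation_months=6, monthly_delay_cost=10_000)

for name, candidate in (("cloud", cloud), ("on-site", on_site)):
    print(name, total_cost(candidate, years=1), total_cost(candidate, years=5))
```

Comparing the one-year and five-year totals shows why the initial price tag alone can be misleading: recurring fees and staffing can outweigh the up-front cost over time. The made-up figures here say nothing about which type of solution is cheaper in practice.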
What is involved with implementing the solution?
In addition to the financial costs listed above, there are also indirect costs involved with implementing any big data solution. The optimal solution will be one that reduces the overall time to implement as well as ongoing costs. A well-designed strategy for handling big data takes time, often months, to implement fully. Below are some of the steps involved with implementation. A delay in any of these steps will delay the entire project.
New hardware may need to be installed to support the solution, unless the solution runs in the cloud.
For some traditional solutions, data must be cleaned up first, since any solution will only be as good as its data. Under current standard solutions, incomplete, inaccurate and duplicate data must be fixed before any new software is installed to manage the data.
Customized reports must be designed unless a flexible, user-centric BI tool is employed. While many providers offer turn-key installations that include basic reports, businesses must create their own customized reports to realize all the benefits of the solution.
Internal expertise must be developed for an in-house solution. For a period of time, businesses can rely on a provider for help with the big data solution. Eventually, though, a certain level of expertise on the solution should be developed
internally if the solution is to remain in-house.
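To make the data cleanup step more concrete, the sketch below flags incomplete, inaccurate and duplicate entries in a small list of contact records. It is a minimal illustration under stated assumptions: the `name` and `email` fields, the matching rule (same email address means same contact) and the very rough email check are all hypothetical, and a real cleanup effort would need rules tailored to the data.

```python
def clean_records(records, required=("name", "email")):
    """Split records into cleaned entries and rejected entries with a reason."""
    seen_emails = set()
    cleaned, rejected = [], []
    for record in records:
        # Trim whitespace and lowercase the email so near-identical entries match.
        normalized = {key: (value or "").strip() for key, value in record.items()}
        normalized["email"] = normalized.get("email", "").lower()

        # Incomplete: a required field is missing or empty.
        if any(not normalized.get(field) for field in required):
            rejected.append((record, "incomplete"))
            continue
        # Inaccurate (rough check only): an email should contain exactly one "@".
        if normalized["email"].count("@") != 1:
            rejected.append((record, "invalid email"))
            continue
        # Duplicate: this email address has already been accepted.
        if normalized["email"] in seen_emails:
            rejected.append((record, "duplicate"))
            continue

        seen_emails.add(normalized["email"])
        cleaned.append(normalized)
    return cleaned, rejected


# Hypothetical sample data containing one of each kind of problem.
raw = [
    {"name": "Ann Lee", "email": "ann@example.com"},
    {"name": "ann lee ", "email": " ANN@example.com"},
    {"name": "Bob", "email": ""},
    {"name": "Cara", "email": "cara.example.com"},
    {"name": "Dev", "email": "dev@example.com"},
]

cleaned, rejected = clean_records(raw)
print("kept:", [r["name"] for r in cleaned])
for record, reason in rejected:
    print("rejected:", record, "-", reason)
```

Keeping the rejected entries together with a reason, rather than silently dropping them, lets staff review and repair the records before the new solution goes live.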
Is the solution scalable?
Because implementing a big data solution could be a major investment of resources, businesses cannot afford to regularly install new solutions. The solution that is set up today must work tomorrow, next week, a month from now and next year. Any big data strategy must be a long-term solution to managing large amounts of information. It should be both compatible with future technology and scalable.
Most providers cannot promise that their big data solution is going to be compatible with the software that is developed five years from now, although new software is typically made to be backwards-compatible. Businesses that are comparing different software solutions for big data, though, should look for a solution that is continually being updated and enhanced. This may be an open-source application, or it might be from a specific developer. Either way, the software that is going to be used should have a history of being updated as needed. This is an indication that it is being actively improved and is likely to work for years to come.
A solution also needs to be able to grow with a business. It should be easily
scalable. If data doubles or quadruples, as it is forecast to do in the coming years, the software and hardware should either be able to meet the challenge or be easily scaled up. Integrating a couple of new servers or clusters is much easier than implementing an entirely new solution.
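A quick projection can show how much hardware a growing data set might require. The sketch below assumes that data grows by a fixed factor each year and that every server (node) offers the same usable capacity. The function names, the replication factor and all figures are hypothetical.

```python
import math


def nodes_needed(data_tb, usable_tb_per_node, replication=1):
    """Number of nodes required to hold the data, including replicated copies."""
    return math.ceil(data_tb * replication / usable_tb_per_node)


def growth_plan(start_tb, growth_factor, years, usable_tb_per_node, replication=1):
    """Project data volume and node count for each year, starting at year 0."""
    plan = []
    data_tb = start_tb
    for year in range(years + 1):
        plan.append((year, data_tb,
                     nodes_needed(data_tb, usable_tb_per_node, replication)))
        data_tb *= growth_factor
    return plan


# Hypothetical scenario: 100 TB today, doubling every year, 40 TB usable
# per node, and three copies of every record kept for safety.
for year, data_tb, nodes in growth_plan(100, 2, 3, 40, replication=3):
    print(f"year {year}: {data_tb} TB -> {nodes} nodes")
```

In this made-up scenario the node count grows from 8 to 60 in three years, which illustrates why a solution that accepts additional servers without a redesign is valuable.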
Finding a solution
There are many factors to consider when looking for a big data solution. These seven questions are simply meant to guide businesses through the process of selecting a solution that is appropriate for them. They serve as a reminder that any big data solution should be:
Able to handle the volume, variety and velocity of big data
Capable of meeting the complex needs of big data
Able to offer a positive, long-term ROI after the financial costs, implementation and scalability are considered
No single big data solution will meet every business’s needs. Instead, each business should look for a customized solution that is appropriate for its situation. For more information, businesses should consult a professional provider that is experienced in designing, implementing and maintaining big data solutions. In addition to being familiar with the field, the provider should be willing to take the time to listen to each business’s needs, so a customized solution can be developed.
For More Information
To learn more about Comrise Big Data solutions and expertise in your industry, please visit us at
https://www.comrise.com/big-data
About Comrise
Established in 1984, Comrise is a global consulting firm with headquarters in the U.S.
and China. Our teams specialize in Managed IT, Big Data, and Workforce Solutions – Staff Augmentation, Recruiting, RPO, and Payrolling. With nearly 30 years of
experience, Comrise provides local talent and resources on a global scale.
© Copyright Comrise Inc. 2013
Comrise
Concord Center Building 2 1301 State Route 36 – Suite 9 Hazlet, NJ 07730
Produced in the United States of America June 2013