Big-Data
Analytics
Using big data analytics from the cloud is more effective, cheaper, and easier than many companies think. In many cases it’s even safer.
Many organisations are unknowingly in possession of valuable assets, namely data: data on customers, processes, or changes in the external environment.
This data can derive from a variety of systems: ERP, CRM, register data, data on website clicking behaviour, conversations on social media, etc. Merging and automatically analysing this data can yield highly useful information: information that provides a better basis not only for important strategic decisions, but also for real-time decisions about opportunities and risks, and that serves as intelligent support for many day-to-day routine decisions. In some cases, it can even facilitate full automation of decision-making processes.
Solvinity is the new name for ASP4all and Bitbrains. A new name calls for a new identity, which is why existing customer case studies of both companies have been formatted to the Solvinity brand.
Some inspiring
examples
Here you will find some inspiring examples of big data analytics. The applications in your organisation probably focus on slightly different areas, but translating these examples to your situation may give you ideas which you can use in your environment.
of big data
Big data has become a hype and, as a result, the term is often used inappropriately. Although the term suggests that big data is mainly about volume, the main improvement on traditional BI is that it supports analysis of structured as well as unstructured data. This is why we like to focus on the five Vs of big data:
Volume: the volume of the datasets.
Velocity: the speed with which data is generated, distributed and used.
Variety: various types of structured and unstructured data, e.g. call centre data, conversations via social media, and photos and videos.
Veracity: uncertainty as to whether data is accurate, due to typos or the message content.
Value: although ‘value’ is often disregarded, we consider it highly important. After all, big data is not about collecting as much data as possible, but about generating value from your data.
Reference case 1
Crowd control on Coronation Day
The City of Amsterdam feared that the city centre would be congested on Coronation Day. For this reason, it commissioned a website and app (Waarisdekoning.nl) with a heat map showing the most crowded areas along the route the Royal couple would be taking. The heat map was continuously fed with user data from Twitter streams. Visitors could use the heat map info to pick a spot along the route to cheer on the new king and queen. In addition to the heat map, the website and app showed a Twitter stream on the coronation topic.
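As a rough illustration of how such a heat map could be fed, the sketch below bins geotagged points into a coarse grid and counts the points per cell. The coordinates, the cell size and the function name are assumptions made for this example; the source does not describe how Waarisdekoning.nl actually computed its map.

```python
from collections import Counter

def heat_map(points, cell_deg=0.001):
    """Count points per grid cell; each cell is cell_deg degrees wide and high."""
    counts = Counter()
    for lat, lon in points:
        # Integer cell indices avoid using floating-point values as keys.
        cell = (int(lat // cell_deg), int(lon // cell_deg))
        counts[cell] += 1
    return counts

# Hypothetical geotagged tweets near the centre of Amsterdam (made-up data).
tweets = [
    (52.3731, 4.8926),
    (52.3732, 4.8927),
    (52.3733, 4.8925),
    (52.3702, 4.8952),
]

cells = heat_map(tweets)
busiest_cell, busiest_count = cells.most_common(1)[0]
print(busiest_cell, busiest_count)
```

A map front end would then colour each cell by its count; a smaller `cell_deg` gives a finer but sparser map.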
Coronation Day was a huge success. No fewer than 7.4M hits were processed on the Waarisdekoning.nl website and app. On average, the website and app continuously supported more than five hundred simultaneous users, with peaks of well over a thousand users viewing the online map at the same moment. This resulted in a continuous data flow averaging 100 Mbps.
The underlying infrastructure consisted of a flexible cloud computing platform of 39 servers across multiple data centres, ensuring availability and fault tolerance. Together, the systems had 152 cores and 588GB of RAM. Additionally, two Storm clusters with MongoDB provided real-time analysis of the Twitter streams, while a Hadoop cluster took care of big data analysis.
Solvinity provided this solution and can easily reuse the configuration for other large-scale events, such as festivals, national celebrations, sports matches, or even emergencies. It can be deployed within a few hours. A valuable additional feature Solvinity can realise is real-time analysis of tweets for the purpose of monitoring what is actually going on in the city at any given time. After all, tweets on events such as a street brawl are often online before the police are alerted. Analysing Twitter messages is therefore a good way of improving security during large-scale events.
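To give an idea of what such real-time monitoring could look like, here is a minimal sketch of a sliding-window keyword counter. The class name, the keywords, the window length and the sample messages are all assumptions for illustration; this is not the Storm topology used in the case above, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import deque

class KeywordWindow:
    """Count keyword matches seen in the last `window_s` seconds."""

    def __init__(self, keywords, window_s=300):
        self.keywords = {k.lower() for k in keywords}
        self.window_s = window_s
        self.hits = deque()  # timestamps of matching messages

    def add(self, timestamp, text):
        words = set(text.lower().split())
        if words & self.keywords:
            self.hits.append(timestamp)
        # Drop matches that have fallen out of the window.
        while self.hits and self.hits[0] <= timestamp - self.window_s:
            self.hits.popleft()
        return len(self.hits)

window = KeywordWindow({"fight", "brawl"}, window_s=300)
messages = [  # (seconds since start, text), made-up examples
    (0, "Lovely atmosphere along the route"),
    (60, "Street brawl near the square"),
    (120, "Big fight breaking out here"),
    (500, "All calm again"),
]
counts = [window.add(ts, text) for ts, text in messages]
print(counts)
```

An alert could be raised whenever the returned count exceeds a threshold chosen for the event.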
provides insight
For some analyses, speed is crucial. If it takes thirty minutes before the Amsterdam police receive the Twitter analysis results, a street brawl between two people could already have escalated into a large-scale fight.
Consequently, Solvinity frequently runs performance tests in the Hadoop environment that it uses for big data analytics. The Terasort test measures how much time is required to sort a terabyte of data. It is comparable to the 0-to-100 acceleration test for cars. The score achieved by the Solvinity Hadoop platform is 6.01 minutes using a 6-CPU system with 24GB RAM and 6TB disk space. For reference, Cisco can do this 1TB test in 5 minutes, but needs 32 CPUs, 256GB RAM and 24TB of disks to do so. That is more than five times the CPUs, over ten times the RAM and four times the disk space (and cost) for a run that is only about 17% shorter.
Terasort result: 6 min 1 sec on a 6-CPU system with 24GB RAM.
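The comparison above can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in the text and assumes that 6.01 is meant as decimal minutes; the dictionary layout and names are ours.

```python
# Figures quoted in the text; 6.01 is treated as decimal minutes (an assumption).
solvinity = {"minutes": 6.01, "cpus": 6, "ram_gb": 24, "disk_tb": 6}
cisco = {"minutes": 5.0, "cpus": 32, "ram_gb": 256, "disk_tb": 24}

def ratio(key):
    """How many times more of a resource the larger system uses."""
    return cisco[key] / solvinity[key]

hardware_ratios = {key: round(ratio(key), 1) for key in ("cpus", "ram_gb", "disk_tb")}

# Share of the run time saved by the larger system, in percent.
time_saved_pct = round((1 - cisco["minutes"] / solvinity["minutes"]) * 100, 1)

print(hardware_ratios)
print(time_saved_pct)
```

The hardware ratios range from four to more than ten, while the run time drops by about 17%.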
Reference case 2
Improved probability of
cure with clinical analytics
Hospitals increasingly use analytics to improve their operational and logistical processes by, for instance, improving their surgery planning based on patient volumes and the availability of surgeons, anaesthetists, support staff and hospital beds. Hospitals can achieve considerable cost savings through improved planning of limited capacity.
Use of clinical analytics is also on the increase. When doctors can choose between different treatment options, they can make better decisions if they know the outcome of the different treatments for patients in similar circumstances. This often involves huge data volumes from many different sources: structured data from electronic patient files, images from PET/MRI/CT scans, ultrasound scans, ECGs, and even social media messages from patients on their experiences during their treatment. Some hospitals link external data sources to their data platform, such as information from the weather services (impact on many respiratory diseases) and statistical information about the social status of the part of a town or city where a person lives (when admitted to Intensive Care, individuals from lower social classes have a higher mortality rate than individuals from higher social classes).
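The idea of comparing treatment outcomes for patients in similar circumstances can be sketched as a simple grouped calculation. The records, field names and the use of age group as the only measure of similarity are assumptions made for this example; real clinical analytics needs far larger samples and has to account for many more patient characteristics.

```python
from collections import defaultdict

# Made-up, anonymised records; the field names are hypothetical.
records = [
    {"age_group": "60-69", "treatment": "A", "recovered": True},
    {"age_group": "60-69", "treatment": "A", "recovered": True},
    {"age_group": "60-69", "treatment": "A", "recovered": False},
    {"age_group": "60-69", "treatment": "B", "recovered": True},
    {"age_group": "60-69", "treatment": "B", "recovered": False},
    {"age_group": "30-39", "treatment": "A", "recovered": True},
]

def recovery_rates(records, age_group):
    """Recovery rate per treatment among patients in the same age group."""
    totals = defaultdict(int)
    recovered = defaultdict(int)
    for r in records:
        if r["age_group"] != age_group:
            continue
        totals[r["treatment"]] += 1
        recovered[r["treatment"]] += r["recovered"]  # True counts as 1
    return {t: recovered[t] / totals[t] for t in totals}

rates = recovery_rates(records, "60-69")
print(rates)
```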
Many hospitals use SAS solutions to analyse big data. These systems often run in the hospital’s own data centre, because they contain highly sensitive information that hospitals prefer not to store elsewhere. However, hospitals are increasingly aware that they are unable to meet the ever-increasing compliance requirements, such as SOC2 certification for data security, availability and confidentiality. In fact, their data is safer at the Solvinity data centre than in their own environment. An additional benefit of outsourcing to Solvinity, aside from access to the complex analytical solutions of SAS, is Solvinity’s extensive experience in analysing this type of data, including with SAS analytics. Moreover, the Solvinity data centre offers security, availability and confidentiality at levels higher than most hospitals can offer themselves.
Solvinity has an extensive track record in the medical world, particularly in the area of high-performance computing (HPC). One of our customers is the Centre for Molecular and Biomolecular Informatics (CMBI) at the St. Radboud University Medical Centre, which purchases computing capacity in the cloud for research into protein structures. Solvinity offers CMBI the flexibility to quickly start and scale up the calculations on a cost-effective pay-per-use basis. Moreover, CMBI recognises that its data is more secure at Solvinity than on CMBI’s own servers. Our Service Organisation Control (SOC2) report guarantees that our services meet all of the (inter)national requirements for security, availability and confidentiality.
Olympic gold thanks
to video analysis
The Dutch women’s rugby team has set its mind on a place in the Olympics in Rio de Janeiro. In the 2016 Summer Olympics, rugby (the sevens version) will be on the programme for the first time. In rugby sevens, each team consists of seven players and a match consists of two seven-minute halves (in regular rugby competitions, teams consist of fifteen players and matches consist of two 40-minute halves).
This fast version of the game calls for a different strategy and different physical skills. For quick analysis of match tactics and identification of physical deficiencies of players, Dutch sporting authority NOC-NSF has proposed the use of image analysis to the Dutch Rugby Union. Gareth Gilbert, head coach of the national women’s rugby sevens team, says: “In top-level sports, small details often make a huge difference. Using data analysis techniques, we can analyse teams and individual players at a physical, technical and tactical level to increase our knowledge of small details that can improve our understanding of the game. We are hoping that this will help us to win a medal in Rio.”
Internationally, the Netherlands is a leader in the use of real-time analytics in sports. Maurits Hendriks, leader of the 2012 Dutch Olympic Team, experimented extensively with analytics during his years as coach of the Dutch men’s hockey team. He speaks from experience when he says: “Coaches cannot see everything that happens on the field. Filming a hockey or rugby match with three widescreen cameras will provide a complete picture. Analysing the images on a tablet in real-time during the match will provide a wealth of new information. For example, the left winger of the opposing team could be moving slightly slower than normal, perhaps due to a minor injury. This information is useful to your defenders.”
Video analysis has already helped the Netherlands achieve success at the Olympics in sports such as hockey, swimming, gymnastics, and BMX. NOC-NSF has high expectations of the women’s rugby team. Data analysis is outsourced to Capgemini. Capgemini uses the SAS Visual Analytics (VA) platform provided by Solvinity as a standard service. Capgemini selected Solvinity because of the speed and reliability of the underlying platform.
Putting
the business
challenge first
Each of these examples shows that with big data analysis, the business outcome (whether social, commercial, medical, etc.) must come first. Unfortunately, many organisations still put technology first. A frequently asked question at conferences and other events is “Do you use Hadoop?” In our opinion, this is not the right question. After all, technology is a tool, not a goal. Technology as such does not provide an answer to the question of what kind of value should be generated from data. That is the question the business owner needs to answer, with the assistance of the CIO or IT manager, who needs to inform the business of the possibilities, supported with examples. Once an organisation recognises the added value of big data analytics, the work is best left to specialists, because analysing big data is specialist work.
Big data
analytics
from the cloud
Because of the complexity of the technology and the high investments required for building and maintaining an in-house big data platform, in most cases it is advisable to use a cloud-based service. Many cloud providers offer Hadoop as a platform for big data analytics in addition to their IaaS environment, putting the technology within reach of a large number of organisations.
However, technology is only part of the solution. Getting big data analytics to work from the cloud involves many technical challenges. Just as a racing car does not get proportionally faster as its horsepower increases, the scalability of a big data analytics environment does not increase in proportion to the CPU capacity. Nor does adding data sources automatically result in more usable answers. It is all a matter of proper tuning.
For instance, the application may be scalable, but the middleware may not. Big data analytics in the cloud is more than just providing the technology components. The actual performance is determined by the team that tunes it for optimum performance. That team needs to be experienced, and all of the required components must be readily available to allow the team to respond quickly and flexibly.
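One common way to make the point about non-proportional scaling concrete is Amdahl's law, which we use here purely as an illustration; the text itself does not name it. The sketch assumes that 90% of a job can run in parallel while the rest (for example a middleware layer that does not scale) cannot. That fraction and the worker counts are assumptions.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Theoretical speed-up when only part of a job can run in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)

# Assumed: 90% of the job parallelises; the remaining 10% does not.
for workers in (1, 6, 32):
    print(workers, round(amdahl_speedup(0.9, workers), 2))
```

Under these assumptions, six workers give a speed-up of about four, and 32 workers still give less than eight: the part that does not scale dominates, which is why tuning matters more than raw capacity.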
No matter what product you select – Apache Hadoop, Storm or the analytical solutions of SAS – some preconditions must always be met:
Access to the data
Data is stored in a variety of systems: ERP, CRM, Outlook, the social media platform, the audio file system of the contact centre, etc. Data can also be stored in different file formats and in different ways: on disk, in flash memory, or even on tape. Data can also flow in via real-time feeds. Handling these different data sources and arriving at reliable results is a highly specialised task.
Security and availability
Cloud providers must have all of the required security measures in place. Be sure to check on the measures for preventing system outage, which would make your data inaccessible. The most effective way to do this is to request a SOC2 audit. This guarantees that the services meet all of the domestic and international requirements for security, availability and confidentiality.
Performance
Hadoop and Storm were designed for big data, but these platforms need to be managed for optimum performance. Achieving the SLAs demanded by your internal customers calls for in-depth knowledge and expertise on the platform and the applications.
Integration
Hadoop and Storm need to connect and relate to the other components in your infrastructure. Smart businesses combine big data ideas and insights with their traditional data sources (BI). BI and big data should be combined, and not treated as two separate environments.
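As a minimal sketch of what combining BI and big data can mean in practice, the example below joins a traditional customer extract with a stream of social media mentions. The customer IDs, field names and sample data are hypothetical and not taken from any real system.

```python
from collections import Counter

# Hypothetical BI extract (for example from a CRM system).
crm = {
    "c1": {"name": "Customer 1", "segment": "retail"},
    "c2": {"name": "Customer 2", "segment": "wholesale"},
}

# Hypothetical big data feed: one event per social media mention.
mentions = [
    {"customer_id": "c1", "sentiment": "negative"},
    {"customer_id": "c1", "sentiment": "negative"},
    {"customer_id": "c2", "sentiment": "positive"},
    {"customer_id": "c9", "sentiment": "negative"},  # not known in the CRM
]

def negative_mentions_by_segment(crm, mentions):
    """Join mentions to CRM customers and count negative ones per segment."""
    counts = Counter()
    for m in mentions:
        customer = crm.get(m["customer_id"])
        if customer is None or m["sentiment"] != "negative":
            continue
        counts[customer["segment"]] += 1
    return dict(counts)

result = negative_mentions_by_segment(crm, mentions)
print(result)
```

Neither source answers the question on its own: the feed knows what is being said, and the CRM knows who the customer is.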
About Solvinity
Solvinity develops innovative customer solutions and provides companies with secure access to private, public and hybrid clouds. Solvinity specialises in cloud services for managed hosting, analytics, workplace and security. The company is an expert in hosting critical infrastructures. Being ‘Secure and compliant by design’ is a leading principle, which is supported by certifications according to international and national standards such as ISO27001, ISO14001, ISAE3402 type II, SOC2 and NEN7510. Its annual turnover amounted to 35 million euros in 2014. The company has 180 employees working in the Netherlands. For more information please visit www.solvinity.com, or follow Solvinity on Twitter and LinkedIn.
Postal address: Solvinity B.V., Postbus 58, 1270 AB Huizen
Visiting address: Solvinity B.V., Energieweg 8, 1271 ED Huizen
T +31 (0)35 523 26 26
F +31 (0)35 523 26 27
www.solvinity.com
[email protected]