The 4 Pillars of Technosoft’s Big Data Practice
[Figure: Big Data use across data management systems, analytical tools, visualisation tools and end-user applications]
beyond possible
New and ubiquitous sources of data, such as documents, customer service records, pictures, videos and sensor data, are unstructured or semi-structured and are being generated in ever-growing volumes and at ever-growing velocity. The challenge of managing this data and making sense of it has never been greater.
BIGDATA Rx is Technosoft’s Big Data practice that is committed to working with our customers in this new and exciting field. It’s your prescription for your Big Data headaches.
Businesses have long managed and analyzed data, and have benefited from doing so. Data, however, brings a lot of complexity. Traditional data warehousing and business intelligence deal with structured data stored in a traditional relational database. Other data, including documents, customer service records, pictures, videos and sensor data from machines, together with new information sources such as social media and click-stream data, forms the new and ubiquitous source of data that is semi-structured or unstructured. The challenge is how to make sense of the intersection of these different types of data.
We have always had a lot of data. The difference today is that significantly more of it exists, it keeps growing, and it varies in type and timeliness more than ever before.
Hence data needs to be managed differently.
Therein lies the opportunity and challenge of Big Data.
The emphasis is on highly advanced analytical tools (predictive modeling) and visualization tools (geospatial mapping, heatmaps), which greatly increase the value derived from the data.
Our Big Data practice offers consulting, project management and systems integration services across the Big Data stack just described.
Overview
End-user applications
BI/Traditional Analytics: Traditional BI suites and online analytical processing
Big Data Analytics: e-Commerce, search and social gaming; Telematics, Predictive Asset Maintenance
Visualization tools
BI/Traditional Analytics: Basic visualization applications
Big Data Analytics: Advanced visualization applications - Geospatial Mapping, Dendrograms, Graphs, Heatmaps, Parallel Coordinates etc.
Analytical tools
BI/Traditional Analytics: Traditional analytics (BI, data mining)
Big Data Analytics: MapReduce programs, Advanced analytics (Predictive modeling, modeling & optimization), In-memory computing, Stream mining, Complex event processing
Data management systems
BI/Traditional Analytics: Traditional relational database, Conventional file system
Big Data Analytics: NoSQL database, Hadoop, Parallel relational database, Hadoop Distributed File System (HDFS), Streaming data management (Storm / Apache S4)
IT services across the stack: Consulting, Project management and System Integration; Data modeling, Data integration and processing, Data storage and management
Difference between traditional and Big Data technology stack
Traditional applications were primarily limited to data visualization, while insights from Big Data analytics are inputs for data-driven end-user applications.
Big Data analytics requires advanced analytical tools built on statistics, algorithms, etc.
The visualization tools required go beyond charts and graphs to include clustergrams, dashboards, etc.
Big Data storage and database management systems need to be highly scalable and connected to low-latency access mechanisms.
The environment must be suitable for handling unstructured data from multiple sources.
Big Data requires Massively Parallel Processing (MPP) and NoSQL querying tools, as opposed to SQL for a traditional database.
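As a rough illustration of the parallel processing style mentioned in the list above, the following minimal Python sketch counts event types with a map step per partition and a reduce step that merges the partial results. It runs sequentially on one machine and only mimics the shape of a MapReduce job. The record format, the sample data and the function names (map_partition, merge_counts, count_events) are assumptions made for this example and are not taken from any Technosoft system or from the Hadoop API.

```python
from collections import Counter
from functools import reduce


def map_partition(lines):
    """Map step: count event types within one partition of records."""
    # Each record is assumed to look like "event_type,page".
    return Counter(line.split(",")[0] for line in lines if line)


def merge_counts(left, right):
    """Reduce step: merge two partial counts into one."""
    return left + right


def count_events(partitions):
    """Run the map step on every partition, then reduce the partial results."""
    partials = [map_partition(partition) for partition in partitions]
    return reduce(merge_counts, partials, Counter())


# Hypothetical click-stream records, already split into two partitions.
partitions = [
    ["click,home", "view,cart", "click,cart"],
    ["view,home", "click,checkout"],
]
totals = count_events(partitions)
```

In a real cluster each partition would be processed on a different node. Here the partitions are processed one after another so that the example stays self-contained.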
Our View of the Big Data Stack
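In setting up BIGDATA Rx, the Big Data practice at Technosoft, we were cognizant of the 4 main questions that repeatedly cropped up in our discussions with customers and executives alike. They concern the storage of unstructured data, the speed of data access and analysis, real-time analysis of high volumes of data, and instant response to issues through actionable insights.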
Our 200-strong team consists of scientists with deep backgrounds in machine learning applications across various sectors, programmers skilled in languages such as R, Python and Java, and business analysts with deep domain knowledge across multiple verticals.
Our diverse capability is spread across Data Mining (40%), Data Integration (30%), Data Visualization (20%) and Data Architecture (10%).
This ensures that the team is best equipped to help you overcome these and other challenges through the application of Big Data concepts and technologies for a variety of industry applications.
Our Practice
Big Data Services and Capabilities
Need: Store large quantities of unstructured data
Service: Data Architecture
Description: Big Data architecture, RDBMS
Application areas*: Website click streams, Tweets and Facebook likes, Sensor data, Emails
Need: Faster data access, storage and analysis
Service: Data Integration
Description: Big Data integration services, Data migration and Data management
Application areas*: Real-time embedded systems, Algorithmic trading, E-commerce, Social networking
Need: Real-time analysis of high volumes of data
Service: Data Mining
Description: Text mining, Taxonomy, Classification, Dynamic Neural Networks, Time Series Modeling, Event Sequence Analysis etc.
Application areas*: Risk management, Customer intelligence, Revenue optimization, Assortment, Merchandise planning
Need: Gain actionable insights from analytics and respond to issues instantly
Service: Data Visualization
Description: QlikView, Tableau, SpotFire, Shiny R, D3 JS etc.
Application areas*: Energy management, SEO optimization, Real-time traffic congestion detection using GPS data
*Relevant industry use cases
1. How to store large quantities of unstructured data?
2. How to achieve faster data access, storage and analysis?
3. How to accomplish real-time analysis of high volumes of data?
4. How to respond to issues instantly through the power of actionable insights from analytics?
Our Big Data Solution Architecture
The following visual represents how Technosoft believes Big Data systems can and should be architected for a complex network, in which multiple connected devices and sub-systems emit a wide variety of unstructured data in high volumes and at very high velocities. These devices and sub-systems may be mechanical, electronic or computer-based; power systems equipment, vehicle information systems and geographical information systems are a few examples.
The characteristics of complex networks that lend themselves well to Big Data Analytics are:
1. Multiple sub-systems, for example routers, switches, cables and software
2. Each sub-system has characteristic nonlinear dynamical behavior
3. Sub-systems interact nonlinearly, leading to intractable emergent phenomena
4. Dynamic Neural Networks can be used to predict the behavior of such networks
Complex networks show similar properties and common behavior independent of technology. Examples include Smart Grids, the Internet, computer networks, telecom networks, transportation systems, oil and gas pipelines and Storage Area Networks (SAN).
The data generated is typically network data, device data, usage data and service data. Data is also captured from secondary sources such as social media (Twitter, for instance). This streaming and non-streaming data is then directed to three different types of storage mechanism:
Data that requires batch processing is moved into the Hadoop cluster
Data that is of high velocity and generated rapidly is moved into the stream processing engine, where, for example, Storm is used to take action
The traditional data warehouse is used to store data that is neither high volume nor high velocity
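The three-way split described above can be sketched as a small routing function. This is an illustration only: the thresholds, the Feed record and the destination names are hypothetical, and the sketch assumes that "data that requires batch processing" can be recognised by its volume and that velocity takes precedence when a feed is both fast and large.

```python
from dataclasses import dataclass


@dataclass
class Feed:
    """Summary of one incoming data feed (hypothetical structure)."""
    source: str
    size_gb: float         # volume of data delivered by the feed
    events_per_sec: float  # arrival rate of the feed


# Assumed cut-offs, chosen only for the example.
HIGH_VELOCITY_EPS = 1000.0
HIGH_VOLUME_GB = 100.0


def route(feed):
    """Pick a destination for a feed based on its velocity and volume."""
    if feed.events_per_sec >= HIGH_VELOCITY_EPS:
        return "stream_processing_engine"  # e.g. a Storm topology
    if feed.size_gb >= HIGH_VOLUME_GB:
        return "hadoop_cluster"            # batch processing
    return "data_warehouse"                # neither high volume nor high velocity


feeds = [
    Feed("sensor-stream", size_gb=2.0, events_per_sec=5000.0),
    Feed("nightly-logs", size_gb=750.0, events_per_sec=10.0),
    Feed("billing-records", size_gb=5.0, events_per_sec=1.0),
]
destinations = {feed.source: route(feed) for feed in feeds}
```

In practice the routing decision would depend on more than two numbers, for example on data format and on how quickly an answer is needed.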
Subsequently, analytics is performed on the stored data. Rule-based analytics is performed on high-velocity data: the data streams are compared with the rule base and action is taken based on the program logic.
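A minimal sketch of the rule-based step might look as follows. The rules, field names, device names and actions are invented for illustration. A production system would express the same idea inside a stream processing engine rather than in a plain Python loop.

```python
def make_rules():
    """Return a hypothetical rule base as (name, condition, action) tuples."""
    return [
        ("overheat", lambda e: e.get("temperature_c", 0) > 90, "shut_down"),
        ("low_voltage", lambda e: e.get("voltage", 999) < 200, "raise_alert"),
    ]


def evaluate(stream, rules):
    """Compare each event with every rule and yield the actions to take."""
    for event in stream:
        for name, condition, action in rules:
            if condition(event):
                yield (event["device"], name, action)


# Hypothetical device events arriving on a stream.
events = [
    {"device": "pump-1", "temperature_c": 95, "voltage": 230},
    {"device": "pump-2", "temperature_c": 60, "voltage": 180},
    {"device": "pump-3", "temperature_c": 70, "voltage": 230},
]
actions = list(evaluate(events, make_rules()))
```

Because evaluate is a generator, it handles events one at a time and never needs the whole stream in memory, which is the property that matters for data in motion.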
On the other hand, predictive analytics is performed on high-volume data, wherein algorithms are applied to discover patterns and take the resulting action.
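To make the idea of pattern discovery concrete, here is a deliberately simple sketch: it fits a straight-line trend to equipment readings and estimates how long until a limit is reached. A linear trend stands in for the far richer models a real predictive analytics system would use, and it is not presented as Technosoft's method. The readings, the limit and the function names are assumptions made for this example.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def hours_until(limit, slope, intercept, now):
    """Estimate the hours left before the trend line reaches the limit."""
    if slope <= 0:
        return None  # readings are flat or falling, so no crossing is predicted
    return max(0.0, (limit - intercept) / slope - now)


# Hypothetical hourly vibration readings from one machine.
hours = [0, 1, 2, 3, 4]
vibration = [1.0, 1.5, 2.0, 2.5, 3.0]
slope, intercept = fit_line(hours, vibration)
remaining = hours_until(5.0, slope, intercept, now=4)
```

The resulting action could be as simple as scheduling maintenance when the estimated time remaining falls below a chosen margin.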
Lastly, the data is visualized in the form of bars, charts or dashboards for actionable insights.
[Figure: Analyzing Data in Motion. Data in motion from a complex network of connected devices (network data, service data, device data, usage data) and from social media is converted into a universal data format (XML, CSV, EDI, LOG, Objects, SQL, Text, JSON, Binary). It then flows to the batch processing engine (MapReduce environment on the Hadoop cluster), the stream processing engine (event processing in a Storm topology against a rule base) and the data warehouse (databases, file systems). Predictive analytics and business intelligence feed the data visualization layer. Offline, batch and stream processing must work together on diverse data in motion.]
We can help
A Predictive Asset Maintenance framework that minimizes equipment downtime, improves productivity, reduces service costs and enhances customer satisfaction
A Big Data solution architecture for the era of complex networks of connected devices that helps analyze data-in-motion and delivers actionable insights
Predictive modeling on streaming, unstructured data
Scalable stream processing system
Actionable visualizations and dashboards
Why Technosoft
Investments in service products like Predictive Asset Maintenance framework
A unique Big Data solution architecture
Cross-functional teams that comprise Data scientists, Visualization experts, Technology & tool experts and domain specialists
Partner ecosystem – Spotfire, Microstrategy, Mirror 42, Tableau and the Open Source World
In summary, our Big Data Solution Architecture helps with:
Data from Complex Networks of Connected Devices
Analysis of Unstructured Data in Event Streams
Near-Real-Time Analysis of Data-in-Motion
Visualization of Analytical Results in Dashboards
© Copyright 2013, Technosoft. All rights reserved. No part of this document may be reproduced, stored in a retrieval system, transmitted in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the express written permission from Technosoft. The information contained herein is subject to change without notice. All other trademarks mentioned herein are the property of their respective owners.
Technosoft Corporation is an IT and BPO services provider with headquarters in Southfield, MI, USA and delivery centers in India. We provide information technology, business process outsourcing and consulting services to companies in North America, Australia and New Zealand and Asia-Pacific Regions. As a privately owned company we answer to only two constituencies - our customers and our employees. Our customers rely on us to provide services and solutions that leverage our industry and domain expertise combined with our technology prowess, delivery focus and quality. Our collaborative culture and work environment helps attract and retain exceptional talent which is a key ingredient of our sustained growth. See how Technosoft can go ‘Beyond Possible’ for your organizational needs.
[email protected] www.technosoftcorp.com
BIGDATA Rx is Technosoft’s Big Data practice, consisting of scientists and programmers skilled in languages such as R, Python and Java, and offering 4-pillar services: data mining, integration, visualization and architecture.
They are specialists in helping companies manage both the challenges and opportunities associated with unstructured data - volumes, velocities and variety.
BIGDATA Rx is Technosoft’s prescription for your Big Data headaches.
Your Prescription for Big Data Headaches