The Technology Evaluator’s
Cheat Sheets
Business Intelligence & Analytics
WWW.SISENSE.COM
Summary
• Software Stacks
– Full Stacks (DB + ETL Tools + Front-End Software)
– Back-End Stacks (DB and/or ETL Tools Only)
– Front-End Stacks (Front-End Software Only)
• Technologies
Software Stacks
[Diagram: software stack layers]
BACK-END STACK (DW, ETL): ETL Features, Data Warehouse Features, Data Mart Features
FRONT-END STACK: Data Visualization Features, Data Analysis & Discovery Features
Other labels: ETL, Query/Import
The Full Stack. When?
• Centralized data management and storage
– To deliver a single version of critical data
– To make data easier for non-techies to access, query and share
– To simplify ongoing or ad-hoc data management tasks
• ETL Functionality Is Needed
– Multiple data sources, or multiple tables where views are too complex/slow
– The volume of data is expected to cause slow performance
– Data needs to be restructured before being delivered to users
– Data is dirty (entry errors, value mismatches)
– Required metrics are in different tables or sources
• To protect the operational systems from rogue queries
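The ETL conditions listed above (multiple sources, dirty values, metrics split across tables) can be illustrated with a minimal Python sketch. The source names, fields and cleaning rules are hypothetical and chosen only for illustration; a real ETL tool would do far more.

```python
# Hypothetical source data: two separate systems holding related metrics.
crm_rows = [
    {"customer": "Acme ", "region": "us"},
    {"customer": "Globex", "region": "EU"},
]
billing_rows = [
    {"customer": "ACME", "revenue": "1200"},
    {"customer": "globex", "revenue": "800"},
    {"customer": "Globex", "revenue": "n/a"},  # entry error
]


def clean_key(name):
    """Normalize customer names so the two sources can be matched."""
    return name.strip().lower()


def etl(crm, billing):
    # Transform: normalize keys and regions from the first source.
    regions = {clean_key(r["customer"]): r["region"].upper() for r in crm}
    # Transform: total revenue per customer, skipping unparseable values.
    revenue = {}
    for row in billing:
        try:
            amount = float(row["revenue"])
        except ValueError:
            continue
        key = clean_key(row["customer"])
        revenue[key] = revenue.get(key, 0.0) + amount
    # Load: one consolidated table with metrics from both sources.
    return [
        {"customer": key, "region": regions[key], "revenue": revenue.get(key, 0.0)}
        for key in sorted(regions)
    ]


mart = etl(crm_rows, billing_rows)
```

Here the value mismatches ("Acme " vs. "ACME") and the entry error ("n/a") are resolved before the data reaches users, which is the kind of work that is hard to push into a front-end tool alone.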
[Diagram: two architectures side by side]
Data Warehouse + Data Marts: Data Sources, ETL / Mash-up, Data Warehouse (DW), OLAP Cubes or In-Memory Marts, Front-end Tools; IT Department and End-Users (Business)
Data Extracts (No DW): Data Sources, ETL / Mash-up, In-Memory Marts, Excel/CSV, Front-end Tools; IT Department and End-Users (Business)
Data Warehouse: Pros & Cons
                              DW + Data Marts      Data Extracts (No DW)
Approach                      Solution-oriented    Project-specific
Data Quality & Accuracy       Higher               Lower
Scalability                   Higher               Lower
Single Version of the Truth   Yes                  No
Initial Investment            Higher               Lower
Level of Detail               Summarized           Granular
Owner                         IT                   IT or Business (optional)
Implementation Time           Longer               Shorter
Technical Complexity          Higher               Lower
Technologies
Backend Technologies
• Data Mart-Class, we call it “Small Scale”
– Online Analytical Processing (OLAP)
– In-Memory Databases (IMDB)
• Data Warehouse-Class, we call it “Big Scale”
Small Scale. When?
• When there is only a single data source, which means the data doesn’t need to be consolidated (ETL) prior to being delivered for business analytics
• When there aren’t many different attributes and metrics to cross-reference (the Data Mart doesn’t need to have many fields)
• For a one-time project (e.g. one dashboard), with no added requirements, new data sources or other
Big Scale. When?
                                   Big Scale                Small Scale
Max. Data Mart Size                Terabytes - Petabytes    Gigabytes
Max. Number of Fields (1 mart)     Practically Unlimited    Limited
Max. Number of Records (1 table)   Billions                 Millions
• For a single centralized data store to serve multiple users and multiple business scenarios (single version of the truth)
• When data volumes are large, are rapidly
Data Mart-Class
Technologies
In-Memory Databases (IMDB)
• Achieves fast performance by loading the entire data mart into RAM, thus avoiding slow disk reads (“I/O Bottlenecks”)
• Categorized as “Small Scale” because the size of the data mart is effectively limited by the size of RAM, placing it in the Gigabyte scale category
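The in-memory idea can be sketched with Python's standard `sqlite3` module, which supports a RAM-only database through the special `:memory:` path. SQLite is only a stand-in here, not one of the IMDB products the slides refer to, and the table and values are hypothetical.

```python
import sqlite3

# ":memory:" asks SQLite to keep the whole database in RAM.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("US", 100.0), ("EU", 250.0), ("US", 50.0)],
)

# Aggregations are computed at query time against data held in memory.
totals = dict(
    conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
)
conn.close()
```

Because everything lives in RAM, the data mart in this sketch can never grow beyond the memory available to the process, which mirrors the size limit described above.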
Online Analytical Processing (OLAP)
• Achieves fast performance by pre-calculating metrics (field aggregations) for all sets and subsets of unique values in all dimensions (fields) ‘overnight’. This avoids performing these slow operations in real time during the workday.
• Categorized as “Small Scale” because storing the results of these pre-calculations (“The Cube”) takes exponentially more storage resources than the actual raw data does, limiting the actual size of raw data that can make up a cube to GB scale.
• The query engines behind most OLAP technologies are based
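The pre-calculation described in the first two bullets can be sketched in a few lines of Python. This is a toy illustration under assumed data (the rows, dimension names and the `build_cube` helper are all hypothetical), not how any particular OLAP engine stores its cube.

```python
from itertools import combinations

# Hypothetical fact rows: three dimensions and one metric.
rows = [
    {"region": "US", "product": "A", "year": 2012, "sales": 100},
    {"region": "US", "product": "B", "year": 2012, "sales": 50},
    {"region": "EU", "product": "A", "year": 2013, "sales": 70},
]
dimensions = ("region", "product", "year")


def build_cube(rows, dimensions, metric):
    """Pre-calculate the metric total for every subset of dimensions."""
    cube = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            for row in rows:
                key = (dims, tuple(row[d] for d in dims))
                cube[key] = cube.get(key, 0) + row[metric]
    return cube


cube = build_cube(rows, dimensions, "sales")

# A query is now a dictionary lookup instead of a scan over the raw rows.
us_total = cube[(("region",), ("US",))]
grand_total = cube[((), ())]
```

With three dimensions there are already 2^3 = 8 dimension subsets to aggregate over, and each added dimension doubles that number, which is why the cube grows so much faster than the raw data.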
Data Warehouse-Class
Technologies
Software Appliances
A software appliance is a software application
that might be combined with just enough
operating system (JeOS) for it to run optimally
on industry standard hardware (typically a
Computer Appliances
A computer appliance is generally a separate
and discrete hardware device with integrated
software (firmware), specifically designed to
provide a specific computing resource.
Distributed Databases
A distributed database may be stored in
multiple computers, located in the same
physical location; or may be dispersed over a
network of interconnected computers.
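One common way to spread a database over several computers is to assign each record to a node by hashing its key. The slides do not say which technique any given product uses, so the Python sketch below is only an illustration of the general idea; the node names and record keys are hypothetical.

```python
import zlib

NODES = ["node-a", "node-b", "node-c"]  # hypothetical node names


def node_for(key):
    """Pick a node from a stable hash of the record key."""
    return NODES[zlib.crc32(key.encode("utf-8")) % len(NODES)]


def distribute(records):
    """Group records by the node that would store them."""
    placement = {node: [] for node in NODES}
    for key, value in records:
        placement[node_for(key)].append((key, value))
    return placement


records = [("order-1", 100), ("order-2", 250), ("order-3", 75)]
placement = distribute(records)
```

Because the hash is stable, any machine can work out where a record lives without consulting a central index, whether the nodes share one location or sit across a network.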
Big Scale Technologies, Compared
Software Appliance    Computer Appliance    Distributed Databases
Full-Stack Vendors
ETL Appliance Software Hardware Appliance OLAP IMDB In-Chip