How to Build MicroStrategy Projects on Top of Big Data Sources in the Cloud

Academic year: 2021

(2)

#mstrworld

Use Cases for Big Data in the Cloud

Traditional sources moving online
• Source: Company, Government, Financial sector, Business and consumer studies, Surveys, Polls
• Value: All business performance drivers (Operational efficiency, Revenue management, Strategic planning)

Digital exhaust from interactions
• Source: Online click-stream, Application logs, Call/service records, ID scans, Security cameras
• Value: New revenue sources, Consumer promotions, Risk management, Fraud detection

Web 2.0 phenomenon
• Source: Content generated from social media posts, tweets, blogs, pictures, videos, ratings
• Value: Customer engagement, Customer service, Brand management, Viral marketing

Internet of things
• Source: Machine-generated sensor data and machine-to-machine communication
• Value: Operational efficiency, Cost control, Risk avoidance

(3)

Traditional sources moving online

How to take advantage of new technologies

Traditional relational data sources in the cloud

• RDBMS installed in the cloud (e.g. HP Vertica on Amazon EC2)

• Managed RDBMS in the cloud (e.g. Amazon RDS)

Relational database technology built for the cloud, e.g.

• Amazon AWS (EMR, Redshift, Aurora)

• Google BigQuery

• RDBMS vendor cloud services (e.g. Microsoft, Oracle, Teradata, HP, IBM, SAP, …)

Cloud services simplify and automate many aspects of data management,

(4)

Some Database Features Require Conscious Design Choices

Query time often dominated by data access with significant performance impact

Data organization
• Columnar vs. row based

Minimize data access
• Partitioning key selection
• Data sorting
• (Index selection/strategy)
• Compression (on/off; algorithm)
• Approximate calculation (e.g. HyperLogLog)

Access and process data in parallel
• Data distribution in MPP databases to minimize data movement
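To make the "approximate calculation" item concrete, here is a minimal Python sketch of the HyperLogLog idea: estimating a distinct count from a small fixed array of registers instead of storing every value. It is a textbook-style illustration, not the implementation used by any particular database; the class name, the register count (2^12), and the use of SHA-1 as the hash are all assumptions made for this example.

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative HyperLogLog sketch (hypothetical class, not a database API)."""

    def __init__(self, p=12):
        self.p = p                      # number of bits used to pick a register
        self.m = 1 << p                 # number of registers (4096 for p=12)
        self.registers = [0] * self.m
        # Bias-correction constant; this closed form is valid for m >= 128.
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, item):
        # Derive a 64-bit hash from the item (SHA-1 is an arbitrary choice here).
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)                      # first p bits pick the register
        rest = h & ((1 << (64 - self.p)) - 1)         # remaining 64-p bits
        # Rank = position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def count(self):
        # Harmonic-mean based raw estimate.
        z = sum(2.0 ** -r for r in self.registers)
        estimate = self.alpha * self.m * self.m / z
        # Small-range correction: fall back to linear counting.
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:
            estimate = self.m * math.log(self.m / zeros)
        return estimate


if __name__ == "__main__":
    hll = HyperLogLog()
    for i in range(50000):
        hll.add(f"user-{i}")
    print(round(hll.count()))
```

With 4096 registers the expected relative error is roughly 1.04 / sqrt(4096), or about 1.6%, which is the kind of trade-off (a small, bounded error in exchange for far less data access and memory) that the slide lists as a design choice.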

Existing best practices for developing MicroStrategy applications apply

Make sure to take advantage of database features designed for analytical workloads

Look for best practices to take advantage of data source strengths in

(6)

Identifying Value in Data Requires Utmost Flexibility

Static data models get in the way of analysis at the speed of thought

Digital exhaust from interactions
• Source: Online click-stream, Application logs, Call/service records, ID scans, Security cameras
• Value: New revenue sources, Consumer promotions, Risk management, Fraud detection

Technical Characteristics:
• Unknown data sources are analyzed for potential new business value.
• Analysis is necessary to support the development of new business models.
• Data models don’t exist (yet).

(7)

MicroStrategy Supports All Analytic Needs

[Figure: user groups arranged by Analytical Complexity and User Scale, from Back Office to Front Line]

Data Scientists
• Trained in modeling and coding
• Use a variety of tools
• Want their favorite tools
• Look for the truth

Business Analysts
• Analytical amateurs
• Power users of BI tools
• Want to use the right tool
• Look for the business edge

Business Users
• Make the daily decisions
• Some may be power users
• Most need simple tools
• Look for actionable information

(8)

MicroStrategy Provides Flexible Data Modeling Options

Choose how to access and analyze data

[Diagram: sources (Online click-stream, Application logs, Call/service records, ID scans) reach Report, Dashboard, and Visual Insight through a Direct path or a Modeled path]

Unified MicroStrategy Metadata
• Reusable Data
• Reusable Objects
• Reusable Design

Flexible data access
• Schema on read
• Supports quick iterations
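As a rough illustration of "schema on read", the Python sketch below keeps click-stream records as raw text and applies a field list only at query time, so a new question needs a new field list rather than a reloaded table. The sample records, field names, and function names are hypothetical and chosen for this example; they do not represent MicroStrategy functionality.

```python
import json
from collections import Counter

# Hypothetical raw click-stream data: one JSON document per line.
# Note that records do not all carry the same fields.
RAW_CLICKSTREAM = """\
{"ts": "2015-01-27T10:00:01", "user": "u1", "page": "/home"}
{"ts": "2015-01-27T10:00:05", "user": "u2", "page": "/pricing", "referrer": "ad-42"}
{"ts": "2015-01-27T10:00:09", "user": "u1", "page": "/pricing"}
"""


def read_with_schema(raw_text, fields):
    """Yield one dict per line, projecting only the requested fields.

    The 'schema' is just the list of fields the caller asks for at read
    time; fields missing from a record come back as None.
    """
    for line in raw_text.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        yield {name: record.get(name) for name in fields}


def page_views(raw_text):
    """Count views per page using a one-field schema."""
    rows = read_with_schema(raw_text, ["page"])
    return Counter(row["page"] for row in rows)


if __name__ == "__main__":
    print(page_views(RAW_CLICKSTREAM))
    # A second iteration asks a different question of the same raw data.
    print(list(read_with_schema(RAW_CLICKSTREAM, ["user", "referrer"])))
```

Because nothing is fixed at load time, each analysis iteration only changes the field list, which is what makes quick iterations cheap when no data model exists yet.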

(10)

The Web 2.0 Phenomenon Introduces Specific Challenges

Data access, data structure, and data meshing

Web 2.0 phenomenon
• Source: Content generated from social media posts, tweets, blogs, pictures, videos, ratings
• Value: Customer engagement, Customer service, Brand management, Viral marketing

Access data where it exists
• Web 2.0 data stored in relational data sources
• Online services that also provide data services (e.g. Salesforce.com)
• Online services that provide data (Social, Government, Weather)

MicroStrategy offers three ways to access Web 2.0 data

Data often requires structuring or flattening for analysis

For optimal value, data from multiple sources needs to be put in context
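The flattening step mentioned above can be sketched in a few lines of Python. The example turns a nested record, such as a social media post returned by a web API, into a single flat row that a reporting tool can treat as columns. The `flatten` helper, the dotted column naming, and the sample post are assumptions made for illustration only.

```python
def flatten(record, parent_key="", sep="."):
    """Flatten a nested dict into one level with dotted column names.

    Lists are collapsed into a comma-separated string; that is a
    simplification chosen for this sketch, not the only option.
    """
    flat = {}
    for key, value in record.items():
        name = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        elif isinstance(value, list):
            flat[name] = ", ".join(str(v) for v in value)
        else:
            flat[name] = value
    return flat


# Hypothetical nested record, e.g. one post from a social media feed.
post = {
    "id": 17,
    "user": {"name": "ana", "followers": 250},
    "tags": ["bi", "cloud"],
    "rating": 4,
}

if __name__ == "__main__":
    print(flatten(post))
```

Collapsing lists into a string keeps one row per post; if each tag should become its own row instead, the list branch would need to emit multiple rows, which is a modeling decision rather than a technical one.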

(11)

No Data Left Behind

Bring All Relevant Data to Decision Makers

[Diagram: data source categories]
• User / Departmental Data
• Data Warehouse Appliances
• Big Data & NoSQL
• Relational Databases
• Multidimensional Databases
• Columnar Databases
• SaaS-Based App Data

Labels shown in the diagram: HANA, BigInsights, Parallel Data Warehouse, Elastic Map Reduce, Analysis Services, Redshift, Distribution

(12)

DATA PROCESSING, ANALYTICS & DELIVERY

MicroStrategy Analytics Platform: Dashboards, Self-Service Analytics, Reports and Statements, OLAP Analysis

1. Direct connection to source
• Parse structure with lightweight “Schema-on-read” functions
• Import data or create a modeled environment

2. Using Web Services
• Requires data to be exposed as a Web Service
• Data will need to be structured prior to access

3. Offline “Process and Store”
• Data is processed with specialty analytics (text, streaming, image processing) and stored in structured form
• Text Analytics Module

DATA STORAGE

Semi-Structured Data and Unstructured Data: Web Logs, Social media posts, Surveys, Server Logs, Geo-spatial, E-mail, Image, Audio, Video, Sensor + Machine Data, Documents

(13)

MicroStrategy Offers Several Paths to Mesh Data For Analysis

Integrating Modeled BI and Self-Service BI

Structured BI Content Consumption
• Structured Data: Architect
• Structured Join: Multi-Source Model
• Multi-Source Pushdown Joins
• Corporate Data Sources
• Cubes from Model
• Dashboards and MicroApps

Self Service BI Content Creation
• Self Service Data: Data Import
• Self Service Join: Document Data
• Join Datasets in Documents
• Ad Hoc / Visual Insight
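Conceptually, meshing two datasets means joining them on a shared attribute so that one source puts the other in context. The Python sketch below shows a plain left join of a hypothetical sales extract with a hypothetical weather feed on a common `date` key. It only illustrates the idea of a join; it does not describe how MicroStrategy performs multi-source or document-level joins, and all names and values are invented for the example.

```python
def left_join(left_rows, right_rows, key):
    """Join two lists of dicts on a shared key, keeping every left row."""
    # Index the right-hand dataset by key for constant-time lookups.
    index = {}
    for row in right_rows:
        index.setdefault(row[key], []).append(row)

    joined = []
    for row in left_rows:
        matches = index.get(row[key])
        if not matches:
            joined.append(dict(row))        # no match: keep the left row as-is
            continue
        for match in matches:
            merged = dict(match)
            merged.update(row)              # left values win on name clashes
            joined.append(merged)
    return joined


# Hypothetical datasets from two different sources.
sales = [
    {"date": "2015-01-26", "store": "NYC", "revenue": 1200},
    {"date": "2015-01-27", "store": "NYC", "revenue": 400},
]
weather = [
    {"date": "2015-01-27", "condition": "blizzard"},
]

if __name__ == "__main__":
    for row in left_join(sales, weather, "date"):
        print(row)
```

A left join is used here so that corporate data is never dropped when the external source has no matching row; an inner join would be the choice if only fully matched rows were of interest.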

(15)

Find Insights in Vast Amounts of Machine Generated Data

Machine generated data often does not lend itself to traditional OLAP analysis

Internet of things
• Source: Machine-generated sensor data and machine-to-machine communication
• Value: Operational efficiency, Cost control, Risk avoidance

Apply the methods of predictive analytics and data mining to machine generated data
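One of the simplest predictive methods that can be applied to sensor data is an ordinary least-squares trend line, which is used to extrapolate the next reading. The Python sketch below is a minimal example under that assumption; the function names and the sample readings are hypothetical and the code is independent of any MicroStrategy function.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Evaluate the fitted line at x."""
    slope, intercept = model
    return slope * x + intercept


if __name__ == "__main__":
    # Hypothetical sensor readings: hours of operation vs. temperature.
    hours = [0, 1, 2, 3]
    temps = [1.0, 3.0, 5.0, 7.0]
    model = fit_line(hours, temps)
    print(model, predict(model, 5))
```

Real machine data is noisier than this sample, so a fitted line is better read as a first indication of drift than as a forecast to act on without further validation.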

(16)

MicroStrategy Support for Predictive Analytics

Primary Work Horses of Data Mining

[Chart: survey responses to “Which Techniques Do You Use Most”, with each technique marked as supported MicroStrategy Native, via PMML, or via R]

Source: 2013 Rexer Data Miner Surveys, www.RexerAnalytics.com. Over 1,250 Data Miners from 75 Countries.

(17)

Predictive Analytics Are Part of MicroStrategy Function Library

Reporting: Average, Mean, Count, Sum, Maximum, Minimum, Median, Mode, Product, Rank, Percentile, “N”-Tile, N-tile by Step, N-tile by Value, N-tile by Step and Value

Date and Time: Add Days, Add Months, Current Date, Current Date & Time, Current Time, Day of Month, Day of Week, Day of Year, Days Between, Month Start Date, Month End Date, Months Between, Year Start Date, Year End Date

Statistical Aggregate: Standard Deviation, Standard Deviation of a Population, Variance, Variance of a Population, Geometric Mean, Average Deviation, Kurtosis, Skew

OLAP Functions: Running Total, Running Std Deviation, Running Std Deviation of Population, Running Minimum, Running Maximum, Running Count, Moving Difference, Moving Maximum, Moving Minimum, Moving Average, Moving Sum, Moving Count, Moving Std Deviation, Moving Std Deviation of Population, First or Last Value in Range, Exponential Weight Moving Avg, Exponential Weight Running Avg

Statistical: Beta Distribution, Beta Inverse, Binomial Distribution Probability, Chi Distribution, Chi Inverse, Confidence, Correlation Coefficient, Covariance, Critical Binomial Distribution, Chi Test (Independence), Cumulative Binomial Distribution, Exponent Distribution, F-Probability Distribution, F-Test, Fisher Transformation, Gamma Distribution, Gamma Inverse, Gamma Logarithm, Homoscedastic Ttest, Heteroscedastic Ttest, Hypergeometric Distribution, Intercept Point, Inverse of Lognormal Cumulative Distribution, Inverse of F Probability Distribution, Inverse of Fisher, Inverse of the Std Normal Cumulative Distribution, Inverse of the T-Distribution, Lognormal Cumulative Distribution, Mean T-Test, Negative Binomial Distribution, Normal Cumulative Distribution, Normal Distribution Inverse, Number of Permutations for a Given Object, Paired T-test, Poisson Distribution (Predict Number of Events), Pearson Product Moment Correlation Coefficient, RSQ (Square of Pearson), Slope of Linear Regression, STEYX (Standard Error of Predicted “y” Value), Standardize, Standard Normal Cumulative Distribution, T-Distribution, Variance Test, Weibull Distribution (Reliability Analysis)

Financial: Accrued Interest, Accrued Interest Maturity, Amount Received at Maturity, Bond-equivalent Yield for T-BILL, Convert Dollar Price from Fraction to Decimal, Convert Dollar Price from Decimal to Fraction, Cumulative Interest Paid on Loan, Cumulative Principal Paid on Loan, Depreciation for each Accounting Period, Days In Coupon Period to Settlement Date, Days In Coupon Period with Settlement Date, Days from Settlement Date to Next Coupon, Double-Declining Balance Method, Interest Rates, Interest Rate, Interest Payment, Internal Rate of Return, Interest Rate per Annuity, Macaulay Duration, Modified Duration, Modified Internal Rate of Return, Next Coupon Date After Settlement Date, No of Coupons Settlement and Maturity Date, Nominal Annual Interest Rate, No of Investment Periods, Net Present Value, Payment on Principal, Price, Price Discount, Price at Maturity, Present Value, Prorated Depreciation for each Period, Straight Line Depreciation, Sum-Of-Years' Digits Depreciation, T-BILL Price, T-BILL Yield, Variable Declining Balance, Yield, Yield for Discounted Security

Math Functions: Absolute, A-cosine, Hyp A-cos, A-sine, Hyp A-sine, A-tan, A-tan2, Hyp A-tan, Ceiling, Combine, Cosine, Hyp Cosine, Degrees, Integer, Ln, Log, Log10, Mod, Power, Quotient, Radians, Randbetween, Round, Sine, Hyp Sine, Square Root

Association Rules, Clustering, General Regression, Mining, Neural Network, Regression, Rule Set, Support Vector Machine, Time Series, Train Association, Train Clustering, Train Decision Tree, Train Regression, Train Time Series, Tree Model, Variants
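For readers unfamiliar with the OLAP-style functions named in this library, the Python sketch below shows the textbook definitions of a moving average and an exponentially weighted moving average. These are generic formulas written for illustration; the exact windowing and seeding rules of the MicroStrategy functions with similar names are not given in the source and may differ, and the function names here are hypothetical.

```python
def moving_average(values, window):
    """Trailing moving average; early positions use the values available so far."""
    if window < 1:
        raise ValueError("window must be at least 1")
    result = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        result.append(sum(chunk) / len(chunk))
    return result


def exponential_moving_average(values, alpha):
    """Exponentially weighted average: s_t = alpha * x_t + (1 - alpha) * s_(t-1).

    The series is seeded with the first observation, which is one common
    convention among several.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    result = []
    for value in values:
        previous = result[-1] if result else value
        result.append(alpha * value + (1 - alpha) * previous)
    return result


if __name__ == "__main__":
    readings = [1, 2, 3, 4]
    print(moving_average(readings, 2))
    print(exponential_moving_average([2, 4, 8], 0.5))
```

A larger `alpha` makes the exponential average react faster to recent readings, while a larger `window` makes the simple moving average smoother; both are tuning choices that depend on how noisy the underlying data is.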

(18)

As a MicroStrategy metric, use models and functions in any report or dashboard

• Deploy Any of 5000+ Open Source R Analytics: MicroStrategy R Integration Pack
• Create Your Own Custom Functions: MicroStrategy Custom Function Plug-in
• Import Predictive Models from Popular Packages: PMML Model

(19)

Analytical Maturity (lowest to highest)
• Data Summarization: What is happening in the aggregate?
• Trend Analysis: What direction are we headed in?
• Benchmarking: How are we doing versus comparables?
• Relationship Analysis: What factors influence activity or behavior?
• Predictions: What is likely to happen based on past history?
• Optimization: What do we want to happen?

Tools
• Industry’s most powerful SQL Engine and 300+ native analytical functions
• World’s most popular advanced analytics tool. Free, open source.
• More Specialty Tools

(20)

MicroStrategy Supports All Use Cases for Big Data in the Cloud
