
(1)

3 Case Studies of NoSQL and Java Apps in the Real World

Eugene Ciurana [email protected] - pr3d4t0r ##java, irc.freenode.net

This presentation is available from:

(2)

About Eugene...

15+ years building mission-critical, high-availability systems

15+ years of Java work

Open source evangelist

MapReduce + Hadoop early adopter

VP of R&D at badoo.com - largest social network in Europe (120M subscribers worldwide!)

State-of-the-art main line of business at the largest companies in the world - not a web guy!

(3)

Very Important!

Please Ask Questions!

(4)

What Is NoSQL?

Database...

Horizontally scalable

Non-relational

Built-in application support

Custom file system designed for supporting NoSQL operations

Best for non-OLTP applications

Unstructured data

(5)

NoSQL Topology

Diagram layers, top to bottom:

Virtual File System - logical table management, load balancing, garbage collection (HDFS, GridFS, Hypertable)

Tablet Servers - Tablet Server 0, Tablet Server 1, ... Tablet Server n

Distributed File System - FS 0, FS 1, FS 2, ... FS n

Nodes - Node, Node, Node, Node

(6)

Areas of Application

Document storage and management

Object databases

Graph databases

Key/value stores

Eventually consistent key/value stores

Financial modeling

Click stream analytics

Simulations

Protein folding

(7)

Brewer’s CAP Theorem

Pick any two!

C - Consistency

A - Availability

P - Partition tolerance

Data models in the diagram legend: Relational, Key-Value, Column-Oriented, Document-Oriented

Systems shown along the three sides of the triangle:

RDBMSs (Oracle, MySQL), Aster Data, Greenplum, Vertica

Dynamo, Voldemort, Tokyo Cabinet, KAI, Cassandra, SimpleDB, CouchDB, Riak

mongoDB, Terrastore, Datastore, Hypertable, HBase, Redis, Berkeley DB, MemcacheDB, Scalaris

(8)

Three NoSQL Systems

mongoDB

Horizontally scalable

Document-oriented database

No JOIN operations, no row level locking

GigaSpaces XAP

Data grid for replacing application servers

Event processing model

Front-end to various data stores (SQL and NoSQL)

Hadoop/Hive/HBase

MapReduce framework foundation

Optimized for fast search and retrieval

(9)

mongoDB

Document-oriented storage

Querying via JavaScript or custom APIs for all major programming languages

In-place updates for atomicity

Any attribute in a document can be indexed

Built-in MapReduce

Built-in caching
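
To make "any attribute in a document can be indexed" concrete, here is a minimal in-memory sketch in plain Java. It is not the mongoDB driver API: the class `DocumentCollection`, its methods, and the sample attributes are all hypothetical and only illustrate the idea of a secondary index over schemaless documents. Updates and deletes are left out, so the index never goes stale in this toy version.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy in-memory document collection. This is NOT the mongoDB driver API. */
final class DocumentCollection {
    private final List<Map<String, Object>> documents = new ArrayList<>();
    // attribute name -> (attribute value -> positions of the documents holding it)
    private final Map<String, Map<Object, List<Integer>>> indexes = new HashMap<>();

    /** Builds a schemaless document from alternating keys and values. */
    static Map<String, Object> doc(Object... keyValues) {
        Map<String, Object> document = new HashMap<>();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            document.put((String) keyValues[i], keyValues[i + 1]);
        }
        return document;
    }

    /** Indexes any attribute, including one that only some documents carry. */
    void ensureIndex(String attribute) {
        Map<Object, List<Integer>> index = new HashMap<>();
        indexes.put(attribute, index);
        for (int position = 0; position < documents.size(); position++) {
            addToIndex(index, attribute, position);
        }
    }

    /** Stores the document and registers it in every existing index. */
    void insert(Map<String, Object> document) {
        documents.add(document);
        int position = documents.size() - 1;
        for (Map.Entry<String, Map<Object, List<Integer>>> entry : indexes.entrySet()) {
            addToIndex(entry.getValue(), entry.getKey(), position);
        }
    }

    /** Uses the index when one exists; otherwise scans every document. */
    List<Map<String, Object>> find(String attribute, Object value) {
        List<Map<String, Object>> matches = new ArrayList<>();
        Map<Object, List<Integer>> index = indexes.get(attribute);
        if (index != null) {
            List<Integer> positions = index.getOrDefault(value, Collections.<Integer>emptyList());
            for (int position : positions) {
                matches.add(documents.get(position));
            }
            return matches;
        }
        for (Map<String, Object> document : documents) {
            if (value.equals(document.get(attribute))) {
                matches.add(document);
            }
        }
        return matches;
    }

    private void addToIndex(Map<Object, List<Integer>> index, String attribute, int position) {
        Object value = documents.get(position).get(attribute);
        if (value != null) {
            index.computeIfAbsent(value, key -> new ArrayList<>()).add(position);
        }
    }

    public static void main(String[] args) {
        DocumentCollection people = new DocumentCollection();
        people.insert(doc("name", "ann", "city", "Krakow"));
        people.insert(doc("name", "bob")); // no "city" attribute at all
        people.ensureIndex("city");
        people.insert(doc("name", "eve", "city", "Krakow"));
        System.out.println(people.find("city", "Krakow").size());
    }
}
```

Documents that lack the indexed attribute are simply absent from the index, which is what lets documents of different shapes share one collection.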

(10)

mongoDB

mongoDB Server (master): Data Storage, mongod (database daemon), mongos (sharding daemon)

mongoDB Server (slave): Data Storage, mongod (database daemon), mongos (sharding daemon)

Consumer

(11)

GigaSpaces XAP

Data persistence

Distributed processing

Caching

Multi-language support

NoSQL operations:

SQLQuery - SQL-like syntax

Persistency - RDBMS through wrapper

memcached

(12)

GigaSpaces XAP

Application Frameworks: Jetty, JEE, Spring, Mule, Groovy, .Net, C++, Java

XAP Management and Monitoring

XAP Deployment Virtualization

XAP Middleware Virtualization (Virtualized Clustering Layer)

(13)

Hadoop and HBase

HDFS - distributed high performance file system

Runs on top of ext3, HFS+, whatever

Alternatives: AWS S3, CloudStore, others

MapReduce - framework for running jobs

Java or anything that works with stdin, stdout

Chukwa - large log analysis framework (not very popular)

Hive - Data warehousing, ETL, and SQL-like language

HBase - Column-oriented NoSQL database
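
The "anything that works with stdin, stdout" point can be sketched as a streaming-style mapper: read one record per line from standard input and write tab-separated key/value pairs to standard output. The class name `LevelCountMapper` and the input format (log lines that start with a level such as `ERROR`) are assumptions made for illustration; job submission and the matching reducer are not shown.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;

/** Streaming-style mapper sketch: lines in, "key<TAB>value" lines out. */
final class LevelCountMapper {
    /** Emits "<first token><TAB>1" for every non-empty input line. */
    static void map(Reader input, Writer output) {
        BufferedReader reader = new BufferedReader(input);
        PrintWriter writer = new PrintWriter(output);
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                String trimmed = line.trim();
                if (trimmed.isEmpty()) {
                    continue; // skip blank records
                }
                // Hypothetical record layout: the key is the first whitespace-separated token.
                String level = trimmed.split("\\s+", 2)[0];
                writer.print(level);
                writer.print('\t');
                writer.print(1);
                writer.print('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        writer.flush();
    }

    public static void main(String[] args) {
        // Wire the mapper to the process's standard streams.
        map(new InputStreamReader(System.in), new OutputStreamWriter(System.out));
    }
}
```

Because the contract is only lines on stdin and stdout, the same mapper logic could be written in any language and tested locally by piping a file through it.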

(14)

Hadoop and HBase

Diagram components:

HDFS

Disk, Disk, Disk, Disk

MapReduce, HBase

Sqoop, Chukwa

Hive, Pig

ZooKeeper

(15)
(16)

Case Study 1: Large FI Stock Trades

Stock trading system is based on large commercial database

It can store only up to 4 weeks of trades

Otherwise it’s too expensive

Inability to run long-term forecasting or trend analysis

Robust, Java-based

Mule-based - all messaging going through ESB

(17)

Case Study 1: Large FI Stock Trades

Syphon trades as they fly by through the ESB

Copy every trade to HDFS

Use MapReduce to break the data down for analysis

Commit initial analysis to HBase

Run queries and further mine data through HBase and MapReduce

Data mining and presentation using WEKA

Forecasting accuracy increased by 11.3% in the first 180
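
The "use MapReduce to break the data down" step can be sketched in plain Java, without the Hadoop API, as the three phases a job goes through: map, shuffle, reduce. Everything here is hypothetical: the `Trade` fields, the `symbol|day` key, and the choice of daily volume as the aggregate are assumptions, since the slides do not give the real message schema or the analysis that was run.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** In-memory sketch of map -> shuffle -> reduce over copied trades. */
final class TradeVolumeJob {
    /** Hypothetical trade record; the real schema is not given in the slides. */
    static final class Trade {
        final String symbol;
        final String day;
        final long quantity;

        Trade(String symbol, String day, long quantity) {
            this.symbol = symbol;
            this.day = day;
            this.quantity = quantity;
        }
    }

    /** Map phase: one ("symbol|day", quantity) pair per trade. */
    static Map.Entry<String, Long> map(Trade trade) {
        return new AbstractMap.SimpleEntry<>(trade.symbol + "|" + trade.day, trade.quantity);
    }

    /** Shuffle: group values by key, as the framework would between the phases. */
    static Map<String, List<Long>> shuffle(List<Map.Entry<String, Long>> pairs) {
        Map<String, List<Long>> grouped = new TreeMap<>();
        for (Map.Entry<String, Long> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), key -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    /** Reduce phase: total quantity for one key. */
    static long reduce(List<Long> quantities) {
        long total = 0;
        for (long quantity : quantities) {
            total += quantity;
        }
        return total;
    }

    /** Runs all three phases and returns the volume per "symbol|day" key. */
    static Map<String, Long> run(List<Trade> trades) {
        List<Map.Entry<String, Long>> pairs = new ArrayList<>();
        for (Trade trade : trades) {
            pairs.add(map(trade));
        }
        Map<String, Long> volumes = new TreeMap<>();
        for (Map.Entry<String, List<Long>> group : shuffle(pairs).entrySet()) {
            volumes.put(group.getKey(), reduce(group.getValue()));
        }
        return volumes;
    }

    public static void main(String[] args) {
        List<Trade> trades = Arrays.asList(
                new Trade("ACME", "2011-05-12", 100),
                new Trade("ACME", "2011-05-12", 250),
                new Trade("INIT", "2011-05-12", 40));
        System.out.println(run(trades));
    }
}
```

In the pipeline described above, the per-key totals are the kind of "initial analysis" that would be committed to HBase for later queries.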

(18)
(19)

Large SaaS

Diagram components:

Client Relationships App, Dispatcher, CRM, Custom Queuing System, Main App, Search, Queue, Static Files (S3), Reporting, Rich Docs (GridFS), Netezza, Lucene

Message flows: query, reply, update

Service Providers - various service providers throughout the Internet; some are public, some are partners

End Users / Service Consumers - Browser, RSS, Outlook, CWS, EWS

Internal Service Providers - heavy web services; some XML, some custom

Internal End Users

Firewall

Legend: HTTP, SOAP, Custom RPC, ODBC/JDBC, Direct/API

(20)

Large SaaS

Diagram components:

Search, Static Files (S3), Reporting

Service Providers - various service providers throughout the Internet; some are public, some are partners

End Users / Service Consumers - Browser, RSS, Outlook, CWS, EWS

Internal Service Providers

Mule ESB Container: Services, Message Routing, and Transformations - Client Relations Services, Dispatcher Services, Main App Services, OpenMQ, Other New Services

Tomcat App Container - Main App (zone instance), Client Relations (Zone Manager), Dispatcher, New Apps

New System Acquisition (.Net, PHP, etc.)

cron Services, Rich Documents (GridFS), memcached

Local DBs, Other Resource

Cloud Firewall, Enterprise Services, Corporate Firewall

(21)

Large SaaS

Diagram components:

Databases, Search, Hive, Static Files (S3), Reporting, Pig

Mule ESB Container: Services, Message Routing, and Transformations - Client Relations Services, Dispatcher Services, Main App Services, OpenMQ, Other New Services

Tomcat App Container - Main App (zone instance), Client Relations (Zone Manager), Dispatcher, New Apps

cron Services, Rich Documents (GridFS), memcached

Internal Services - HDFS, GridFS, Data Warehouse, Hadoop, DB cluster, computational network

External Service or Consumer

Cloud-based MapReduce/NoSQL Infrastructure - expand and contract

(22)
(23)

SOBA Labs

Diagram components:

sobaDB (192.168.0.42), Other Consumer (192.168.0.42), sobaEngine (localhost), Ubuntu Landscape

REST SOBA interface - implementation is transparent to caller! http://soba.myserver.com/manage/resource

Firewall

Oracle, vm_uuid: b220c8db, Xen Host, SOBA Agent

Xen XML-RPC API

REST SOBA interface

Xen Python, SOBA Python

Amazon EC2 - End-user App (ami-322ec65b), End-user App (ami-322ec65b)

(24)

SOBA Labs

Mule-based SOBA Engine abstracts provisioning, configuration, and monitoring through web services

Diagram components:

Java and Python Web Services Interface

Canonical Landscape

Other Application - easy integration! JSON over REST web services

Native Application - easy integration! Python dict, Python API

SOBA Engine

Amazon EC2 API, Xen Server API, Rackspace Cloud Servers API

SOBA Agent - Python dict, EC2 web services API, Xen XML-RPC API, REST, JSON

SOBA Data, mongoDB, EC2 Data, XML, EC2 Query, XML, Config Data (Puppet?)

Ubuntu Server, Ubuntu Server, Ubuntu Server

DRY Interface - Don't Repeat Yourself!

Provisioning, configuration, or monitoring via SOBA is the same regardless of target: same API call, same data payload, same data format, etc.

Implementation is abstracted from the

dict

SOBA
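
The DRY idea on this slide (same API call, same payload, same format, regardless of target) can be sketched as a single facade that dispatches to target-specific back ends. This is not the actual SOBA Engine code: `SobaEngineSketch`, the `Provider` interface, the request key `name`, and the response fields are all hypothetical, and the back ends are stubs that do not call the real Xen XML-RPC or EC2 APIs.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of a DRY provisioning facade; not the actual SOBA Engine code. */
final class SobaEngineSketch {
    /** Single contract that every target-specific back end implements. */
    interface Provider {
        String provision(Map<String, String> request);
    }

    private final Map<String, Provider> providers = new HashMap<>();

    void register(String target, Provider provider) {
        providers.put(target, provider);
    }

    /** Same call, same payload, same response shape, whatever the target. */
    Map<String, String> provision(String target, Map<String, String> request) {
        Provider provider = providers.get(target);
        if (provider == null) {
            throw new IllegalArgumentException("Unknown target: " + target);
        }
        Map<String, String> response = new HashMap<>();
        response.put("target", target);
        response.put("resource", provider.provision(request));
        response.put("status", "provisioned");
        return response;
    }

    /** Stub back ends: real ones would call the Xen XML-RPC or EC2 web services APIs. */
    static SobaEngineSketch withStubProviders() {
        SobaEngineSketch engine = new SobaEngineSketch();
        engine.register("xen", request -> "xen-vm:" + request.get("name"));
        engine.register("ec2", request -> "ec2-instance:" + request.get("name"));
        return engine;
    }

    public static void main(String[] args) {
        Map<String, String> request = new HashMap<>();
        request.put("name", "web-01");
        SobaEngineSketch engine = withStubProviders();
        // The caller changes only the target name; payload and response shape stay the same.
        System.out.println(engine.provision("xen", request));
        System.out.println(engine.provision("ec2", request));
    }
}
```

Adding another target, such as the Rackspace Cloud Servers API named on the slide, would mean registering one more `Provider` without touching any caller.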

(25)

Plug - Know Any High Caliber Coders?

badoo.com is hiring!

Top talent - we’re very demanding

PHP, MySQL developers and sr. developers

Java with a Business Intelligence twist for Pentaho and Hadoop

Mobile: Android, iOS, Blackberry, WAP, JME

QA sr. lead - highly technical, web, web services, and mobile

€2,000 referral bonus for you if we hire your friend!

Paid 90 days after hiring (trial period ends)

If your friend can legally work in Russia or the UK, but doesn’t live in Moscow or London, we’ll work out relocation

Contact: [email protected]

(26)

Q&A

Comments?

Anything else?

Eugene Ciurana [email protected] - pr3d4t0r ##java, irc.freenode.net

http://ciurana.eu/scalablesystems

This presentation is available from:

http://ciurana.eu/GeeCON-2011
