WSO2 Message Broker
Outline
Messaging
Scalable Messaging
Distributed Message Brokers WSO2 MB Architecture
o Distributed Pub/sub architecture o Distributed Queues architecture
User Story Conclusion
What is Messaging ?
We often program and design distributed systems with RPC style communication
o E.g. Web Services, Thrift, REST
RPC communication is
o Request/Response (there is always a response) o Synchronous (client waits for response)
o Non-persistent (message is lot if something failed)
But there are other 7 possibilities
o Under messaging we support those
o Build on top of single message, with flexibility (users can choose) in other dimensions
Messaging Systems in Real World
There are many types of message systems in the real word
o Sensor networks
o Monitoring/ Surveillance
o Business Activity Monitoring o Job Scheduling systems
o Social Networks
Why Messaging?
More reliability
o E.g. via persistence, transactions
Decupling
o Space o Time
o Synchronization
Messaging Server Models
Messaging is implemented with a broker (or brokers in the middle)
Participants send messages, and broker delivers them to recipients
There are two main models
o Queues - A message is delivered only once to a single consumer.
o Publish/Subscribe: Broadcast a message to many message consumers
Distributed Queues
A queue in the “Cloud” Supports Operations
o Put(M) – put a message
o Get() – get a message (dqueue)
o Subscribe() – send me a message when there is one
E.g. SQS (Amazon Queuing Service) Usecases
o Job Queues
Publish/ Subscribe
There is a topic space based on interest Publishers send messages to brokers
Subscribers registers their interest
Brokers matches events (messages) and delivers to all interested parties
Usecases
o Surveillance
What is JMS ?
JMS – Java Message Service
A specification that define a standard API for java
programmer to perform messaging by interacting with a message broker
Support both
o Distributed Queue o Publish/Subscribe
It does not define the message format or how java
API interacts with the message broker
What is AMQP ?
Advanced Message Queuing Protocol (AMQP) Open standard for passing business messages
between applications or organizations.
JMS does not define the message format, and AMQP fills that gap
AMQP let different systems (e.g. .NET and
Java) to interact with each other by agreeing the message format at the wire level just like Web Services.
Brokers
Message broker support messaging
Some brokers can be setup as a network or a cluster
Some of well known brokers
o Apache Qpid - http://qpid.apache.org/ o Storm MQ - http://stormmq.com/ o Active MQ - http://activemq.apache.org/ o HornetQ - http://www.jboss.org/hornetq o Rabbit MQ - http://www.rabbitmq.com/ o IBM WebSphere MQ - http://www-01.ibm.com/software/integration/wmq/ WSO2 Inc. 11
Scaling
There a several dimensions of Scale Number of messages
Number of Queues Size of messages
Scaling Pub/Sub is relatively easy
E.g. Consider cluster of brokers. If all node know about all subscriptions, all publish messages can be delivered
E.g. Narada Broker, Padres
Scaling Distributed Queues is harder
Scaling Distributed Queues
Scaling Distributed Queues (Contd.)
Topology Pros Cons Supporting Systems
Master Salve Support HA No Scalability Qpid, ActiveMQ, RabbitMQ
Queue Distribution Scale to large
number of Queues
Does not scale for large number of messages for a queue
RabbitMQ
Cluster Connections Support HA Might not support in-order delivery Logic runs in the client side takes local decisions.
HorentMQ
Broker/Queue Networks
Load balancing and distribution
Fair load balancing is hard
ActiveMQ
Alternative Message Broker Design
Most persistent message brokers use a per-node DB to store messages with message routing.
But with large messages, cost of routing messages over the network is very high
With availability of scalable storage and
distributed coordination middleware we propose an alternative architecture for scalable message brokers
Main idea
o Avoid message routing
o Use scalable storage to share messages between nodes o Use distributed coordination to control the behavior
Cassandra and Zookeeper
Cassandra
o NoSQL Highly scalable new data model (column family)
o Highly scalable (multiple Nodes), available and no Single Point of Failure.
o SQL like query language (from 0.8) and support search through secondary indexes (well no JOINs, Group By etc. ..).
o Tunable consistency and replication
o Very high write throughput and good read throughput. It is pretty fast.
Zookeeper
o Scalable, fault tolerant distributed coordination framework
WSO2 Message Broker
Use Apache Zookeeper for coordination when
needed
Support for AMQP JMS and WS-Eventing while
enabling interoperability between protocols
Built by extending Apache Qpid Code base
WSO2 MB Architecture
How Distributed Queues Works ?
How Distributed Queues Works
Contd..
How Distributed Queues Works
Contd..
Each node contains a node queue. Message
meta data are stored in this queue. A Queue Delivery Worker running in each node and
consume messages in the above node queue. Destination is extracted from this consumed message and delivered to the endpoint.
MB stores message content separately
Delivery logic works with message IDs written
to queue representation in Cassandra and it only reads the messages at delivery
Distributed Queues
Strict ordering means there can be one message being delivered at a give time.
o Say we receive messages m1, m2 for Queue Q.
o Say we deliver messages m1 and m2 to client c1 and c2 for Queue Q in parallel
o Say m1->c1 failed, but by then m2->c2 is done. o If there is no other subscribers, now m1 has to be
delivered out of order.
Two implementation
o Strict ordering support - using a distributed shared lock with Zookeeper
How Pub/Sub Works ?
How Pub/Sub Works Contd…
There is a node queue for each of the brokers. When published message to a topic, broker get
the list of nodes where subscriptions available for the topic and write the message id to each of the node queue connected to brokers.
A worker thread running in each of these
brokers to consume messages from the above node queue and deliver the message to
subscriber.
MB2 JMS Support
Feature Yes No Pub / Sub √ Durable Subscriptions √ Hierarchical Topics √ Queues √ Message Selectors √ Transactions √ WSO2 Inc. 25How does it Make a difference?
Scale up in all 3 dimensions
Create only one copy of message while delivery High Availability and Fault Tolerance
Large message transfers in pub/sub (asynchronous style)
Let users choose between strict and best effort messages
Conclusion and Future Work
Provides an alternative architecture for scalable message brokers using Cassandra and Zookeeper It provides
o A publish/subscribe model that does not need any coordination between broker nodes
o A strict mode for distributed queues that provides in order delivery
o A best-effort mode for distributed queue
Future work
o Further Scalability Tests
o Testing with large messages o Fault Tolerance Tests
Integrating with WSO2 ESB
JMS Transport
o JMS endpoints and JMS proxy services Message Stores and Processors
Integrating with WSO2 DSS
JMS Transport
o JMS transport enabled data services
User Story
An SAP system need to distribute IDOCs to
their point of sales which are distributed Island wide. IDOCs are sending out from SAP as
batches issued within small amount of time period. These IDOCs need to transform in to SOAP messages and need to inject some
properties. Finally these messages need to update the data bases in Point of Sales.