Managing Grid and Web Services
and their exchanged messages
OGF19 Workshop on Reliability and Robustness
Friday Center, Chapel Hill NC
January 31, 2007
Authors
Harshawardhan Gadgil (his PhD topic), Geoffrey Fox,
Shrideep Pallickara, Marlon Pierce
Community Grids Lab, Indiana University
Presented by
Geoffrey Fox
Management Problem I
• Characteristics of today’s (Grid) applications
  – Increasing complexity
  – Components widely dispersed and disparate in nature and access
    § Span different administrative domains
    § Under differing network / security policies
    § Limited access to resources due to the presence of firewalls, NATs etc. (a major focus in the prototype)
  – Dynamic
    § Components (nodes, network, processes) may fail
• Services must meet
  – General QoS and life-cycle features
  – (User-defined) application-specific criteria
• Need to “manage” services to provide these capabilities
  – Dynamic monitoring and recovery
  – Static configuration and composition of systems from components
Management Problem II
• Management operations* include
  – Configuration and lifecycle operations (CREATE, DELETE)
  – Handling RUNTIME events
  – Monitoring status and performance
  – Maintaining system state (according to user-defined criteria)
• Protocols like WS-Management / WS-DM define inter-service negotiation and how to transfer metadata
• We are designing/prototyping a system that will manage a general world-wide collection of services and their network links
  – Need to address fault tolerance, scalability, performance, interoperability, generality and usability
• We are starting with our messaging infrastructure, as
  – we need it to be robust in the Grids where we use it (sensor and material science),
  – we use it in the management system itself, and
  – it has critical network requirements
* From WS-Distributed Management
Core Features of Management Architecture
• Remote Management
  – Allow management irrespective of the location of the resource (as long as that resource is reachable via some means)
• Traverse firewalls and NATs
  – Firewalls complicate management by disabling access to some transports and to internal resources
  – Utilize the tunneling capabilities and multi-protocol support of the messaging infrastructure
• Extensible
  – Management capabilities evolve with time; we use a service-oriented architecture to provide extensibility and interoperability
• Scalable
  – The management architecture should scale as the number of managees increases
• Fault-tolerant
  – Management itself must be fault-tolerant; failure of transports OR management components must be tolerated
Management Architecture is built in terms of
• Hierarchical Bootstrap System
  – Itself made robust by replication
  – Managees in different domains can be managed with separate policies for each domain
  – Periodically spawns a System Health Check that ensures components are up and running
• Registry for metadata (a distributed database; sketched below)
  – Made robust by standard database techniques, and by our system itself for the service interfaces
  – Stores managee-specific information (user-defined configuration / policies, and the external state required to properly manage a managee)
  – Generates a unique ID per instance of registered component
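A minimal sketch, with hypothetical names, of the registry behavior just described: issuing monotonically increasing instance IDs and holding managee-specific external state. The real registry is a replicated database exposed through Web Service interfaces; this in-memory class only illustrates the contract.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.atomic.AtomicLong;

    // Hypothetical in-memory stand-in for the registry service.
    public class Registry {
        private final AtomicLong nextInstanceId = new AtomicLong(1);
        private final Map<String, String> externalState = new ConcurrentHashMap<String, String>();

        // Issue a monotonically increasing Instance ID (IID) so that newer
        // instances of a manager or managee always compare greater than old ones.
        public long register(String manageeName) {
            return nextInstanceId.getAndIncrement();
        }

        // Store the modest external state needed to restore a managee.
        public void writeState(String manageeName, String state) {
            externalState.put(manageeName, state);
        }

        public String readState(String manageeName) {
            return externalState.get(manageeName);
        }
    }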
Architecture: Scalability via Hierarchical Distribution
[Figure: bootstrap hierarchy with a replicated ROOT; children US and EUROPE; leaf domains FSU, CARDIFF and CGL, e.g. /ROOT/EUROPE/CARDIFF]
• Active Bootstrap Nodes (e.g. /ROOT/EUROPE/CARDIFF)
  – Responsible for maintaining a working set of management components in the domain
  – Always the leaf nodes in the hierarchy
• Passive Bootstrap Nodes
  – Only ensure that all child bootstrap nodes are always up and running, spawning them if not present
• Messaging Nodes form a scalable messaging substrate
  – Message delivery between managers and managees
  – Provide transport-protocol-independent messaging between distributed entities
  – Can provide secure delivery of messages
• Managers
  – Active stateless agents that manage managees
  – Both general and managee-specific management threads perform the actual management
  – Multi-threaded to improve scalability with many managees
• Managees – what you are managing (the managee / service to manage); our system makes them robust
  – There is NO assumption that the managed system uses Messaging Nodes
  – Wrapped by a Service Adapter which provides a Web Service interface (see the interface sketch after this list)
  – Assumed that ONLY modest state needs to be stored/restored externally; a managee could, e.g., front-end a huge database and restore itself from it
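A purely illustrative sketch of what a Service Adapter's Web Service interface might look like; the operation set mirrors the management operations named on these slides, but the method names and signatures are assumptions, not the actual prototype API.

    // Hypothetical management interface a Service Adapter exposes over a managee.
    public interface ServiceAdapter {
        void create(String configurationXml);            // CREATE lifecycle operation
        void delete();                                   // DELETE lifecycle operation
        void setConfiguration(String configurationXml);  // runtime reconfiguration
        String getStatus();                              // monitor status and performance
    }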
Architecture: Conceptual Idea (Internals)
[Figure: the components interact as follows]
• Managers and managees connect to a Messaging Node for sending and receiving messages
• The user writes the system configuration to the registry
• Manager processes periodically check for available managees to manage, and read/write managee-specific external state from/to the registry (a manager-loop sketch follows)
• Bootstrap nodes always ensure the other components are up and running, periodically spawning them as needed
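The manager side of this flow as a minimal sketch; the class name, the period and the Registry type (from the earlier sketch) are hypothetical.

    // Illustrative manager loop: periodically check for managees to manage
    // and synchronize their external state with the registry.
    public class ManagerLoop implements Runnable {
        private final Registry registry;          // see the earlier registry sketch
        private final long periodMillis = 60000;  // e.g. once a minute, per policy

        public ManagerLoop(Registry registry) { this.registry = registry; }

        public void run() {
            while (!Thread.currentThread().isInterrupted()) {
                // 1. Discover managees in the registry that currently lack a manager.
                // 2. For each one, read its external state, check its health,
                //    and recover it if it has failed.
                // 3. Write any updated external state back to the registry.
                try {
                    Thread.sleep(periodMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }
    }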
Architecture: User Component
• “Managee characteristics” are determined by the user
• Events generated by the managees are handled by the manager
• Event processing is determined via WS-Policy constructs (see the policy sketch after this list)
  – E.g. wait for the user’s decision on handling specific conditions
  – E.g. “auto-instantiate” a failed service; the service responsible for doing this must act consistently even when the “failed” service has not actually failed but is merely unreachable
• Administrators can set up services (managees) by defining their characteristics
  – Writing this information to the registry can be used to start up the system
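A minimal sketch of policy-driven event handling, with a plain enum standing in for the WS-Policy constructs; all names are hypothetical.

    // Hypothetical policy choices a user might attach to a managee.
    enum FailurePolicy { WAIT_FOR_USER, AUTO_INSTANTIATE }

    class ManageeEventHandler {
        // Dispatch a managee-failure event according to the user-defined policy.
        void onManageeFailure(String manageeName, FailurePolicy policy) {
            switch (policy) {
                case WAIT_FOR_USER:
                    System.out.println(manageeName + " failed; awaiting user decision");
                    break;
                case AUTO_INSTANTIATE:
                    // Must be done consistently: the managee may merely be
                    // unreachable, so the obsolete instance is fenced off via
                    // its registry-issued IID (next slide) before a
                    // replacement starts.
                    restart(manageeName);
                    break;
            }
        }

        private void restart(String manageeName) { /* spawn a new instance */ }
    }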
Issues in the Distributed System: Consistency
• Examples of inconsistent behavior
  – Two or more managers managing the same managee
  – Old messages / requests arriving after new requests
  – Multiple copies of a managee existing at the same time / orphan managees, leading to inconsistent system state
• Use a registry-generated, monotonically increasing unique Instance ID (IID) to distinguish between new and old instances of Managers, Managees and Messages (see the sketch after this list)
  – Requests from manager thread A are considered obsolete IF IID(A) < IID(B)
  – The Service Adapter stores the last known MessageID (IID:seqNo), allowing it to differentiate between duplicate AND obsolete messages
  – Periodic renewal with the registry:
    § IF IID(manageeInstance_1) < IID(manageeInstance_2)
    § THEN manageeInstance_1 is deemed OBSOLETE
    § SO EXECUTE policy (e.g. instruct manageeInstance_1 to shut down)
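A sketch of the IID:seqNo filtering described above, assuming both numbers arrive as longs in the message headers; the types and names are illustrative.

    // MessageID = (instance ID of the sender, per-instance sequence number).
    final class MessageId {
        final long iid;    // registry-issued, monotonically increasing
        final long seqNo;  // increases within one sender instance

        MessageId(long iid, long seqNo) { this.iid = iid; this.seqNo = seqNo; }
    }

    class DuplicateFilter {
        private MessageId lastKnown = new MessageId(0, 0);

        // Accept a request only if it is newer than anything seen so far.
        synchronized boolean accept(MessageId incoming) {
            if (incoming.iid < lastKnown.iid) {
                return false; // sent by an obsolete manager instance
            }
            if (incoming.iid == lastKnown.iid && incoming.seqNo <= lastKnown.seqNo) {
                return false; // duplicate or out-of-order message
            }
            lastKnown = incoming;
            return true;
        }
    }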
Issues in the Distributed System: Security
• Provide secure communication between communicating parties (e.g. Manager <-> Managee)
  – Publish/Subscribe: provenance, lifetime, unique topics
  – Secure discovery of endpoints
  – Prevent unauthorized users from accessing the Managers or Managees
  – Prevent malicious users from modifying messages (thus message interactions are secure even when passing through insecure intermediaries)
• Utilize NaradaBrokering’s Topic Creation and Discovery* and Security# schemes
* NB Topic Creation and Discovery (Grid2005): http://grids.ucs.indiana.edu/ptliupages/publications/NB-TopicDiscovery-IJHPCN.pdf
# NB Security (Grid2006)
Implemented:
• WS specifications
  – WS-Management (June 2005) in part (WS-Transfer [Sep 2004], WS-Enumeration [Sep 2004] and WS-Eventing); could use WS-DM instead
  – WS-Eventing (leveraged from the WS-Eventing capability implemented in OMII)
  – WS-Addressing [Aug 2004] and SOAP v1.2 used (needed for WS-Management)
• Used XmlBeans 2.0.0 for manipulating XML in a custom container
• Currently implemented using JDK 1.4.2 but will switch to JDK 1.5
• Released on http://www.naradabrokering.org
Performance Evaluation: Results
• Extreme case with many catastrophic failures
• Response time increases with an increasing number of concurrent requests
• Response time is MANAGEE-DEPENDENT; the times shown are typical
• MAY involve 1 or more registry accesses, which increase the overall response time
• Response time increases rapidly as the number of managees exceeds roughly 150-200
Performance Evaluation
How much infrastructure is required to manage N managees?
• N = number of managees to manage
• M = max. no. of entities connected to a single messaging node
• D = max. no. of managees managed by a single manager process
• R = min. no. of registry service instances required to provide fault-tolerance
• Assume every leaf domain has 1 messaging node; hence we have N/M leaf domains
• Further, the no. of managers required per leaf domain is M/D
• Total components in the lowest level
    = (R registry + 1 bootstrap service + 1 messaging node + M/D managers) * (N/M such leaf domains)
    = (2 + R + M/D) * (N/M)
• Thus, dividing by N, the percentage of additional infrastructure is [(2 + R)/M + 1/D] * 100%
Performance Evaluation
Research Question: How much infrastructure is required to manage N managees?
• Additional infrastructure = [(2 + R)/M + 1/D] * 100%
• A few cases (a small calculation sketch follows below):
  – Typical values of D and M are 200 and 800; assuming R = 4, Additional infrastructure = [(2+4)/800 + 1/200] * 100% ≈ 1.2%
  – Shared registry (one registry interface per domain, R = 1): Additional infrastructure = [(2+1)/800 + 1/200] * 100% ≈ 0.87%
  – If NO messaging node is used (assume D = 200): Additional infrastructure = [(R registry + 1 bootstrap node + N/D managers)/N] * 100% = [(1+R)/N + 1/D] * 100% ≈ 100/D % (for N >> R)
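For concreteness, a small calculation sketch of the formula above using the slide's example values; the class name is hypothetical.

    // Percentage of additional infrastructure needed to manage N managees,
    // assuming one messaging node per leaf domain: [(2 + R)/M + 1/D] * 100.
    public class InfrastructureCost {
        static double additionalInfrastructurePercent(double r, double m, double d) {
            return ((2 + r) / m + 1 / d) * 100;
        }

        public static void main(String[] args) {
            System.out.println(additionalInfrastructurePercent(4, 800, 200)); // ~1.2 %
            System.out.println(additionalInfrastructurePercent(1, 800, 200)); // ~0.87 %
        }
    }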
Performance Evaluation: XML Processing Overhead
• XML processing overhead is measured as the total marshalling and un-marshalling time required
• For broker management interactions, the typical processing time (including validation against the schema) is ≈ 5 ms (a parse-and-validate sketch follows below)
  – Broker management operations are invoked only during initialization and recovery from failure
  – Reading the broker state using a GET operation involves the 5 ms overhead and is invoked periodically (e.g. every 1 minute, depending on policy)
  – Further, for most operations dealing with changing broker state, …
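A minimal sketch, using the generic XmlBeans 2.0 XmlObject API rather than the prototype's schema-compiled types, of the parse-and-validate step whose ≈5 ms cost is quoted above; the payload is a placeholder.

    import org.apache.xmlbeans.XmlException;
    import org.apache.xmlbeans.XmlObject;

    public class XmlOverheadDemo {
        public static void main(String[] args) throws XmlException {
            String request = "<BrokerConfiguration/>"; // placeholder payload

            long start = System.currentTimeMillis();
            XmlObject doc = XmlObject.Factory.parse(request); // un-marshalling
            boolean valid = doc.validate();                   // schema validation
            long elapsed = System.currentTimeMillis() - start;

            System.out.println("valid=" + valid + ", took " + elapsed + " ms");
        }
    }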
Prototype: Managing Grid Messaging Middleware
• We illustrate the architecture by managing the distributed messaging middleware: NaradaBrokering
• This example is motivated by the presence of a large number of dynamic peers (brokers) that need configuration and deployment in specific topologies
• Runtime metrics provide dynamic hints on improving routing, which leads to redeployment of the messaging system, possibly using a different configuration and topology
• Can use (dynamically) optimized protocols (UDP vs TCP vs Parallel TCP) and go through firewalls
• Broker Service Adapter (sketched below)
  – Note that NB illustrates an electronic entity that didn’t start off with an administrative Service interface
  – So we add a wrapper over the basic NB BrokerNode object that provides a WS-Management front-end
  – Allows CREATION, CONFIGURATION and MODIFICATION of brokers and their links
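A sketch of such a wrapper; BrokerNode is NaradaBrokering's broker object, but the adapter methods and signatures shown here merely illustrate the CREATE/CONFIGURE/MODIFY operations and are not the actual NB or WS-Management API.

    // Hypothetical adapter giving an existing broker object a Web Service
    // management front-end; a WS container would map WS-Management
    // Create/Put/Delete requests onto these calls.
    public class BrokerServiceAdapter {
        private Object brokerNode; // stands in for NB's BrokerNode

        // CREATE: instantiate and start a broker from registry configuration.
        public void createBroker(String configurationXml) { /* ... */ }

        // CONFIGURE: apply new settings (transports, ports, etc.).
        public void setConfiguration(String configurationXml) { /* ... */ }

        // MODIFY: manage links to other brokers in the topology.
        public void createLink(String targetBrokerAddress) { /* ... */ }
        public void deleteLink(String targetBrokerAddress) { /* ... */ }

        // DELETE: shut the broker down cleanly.
        public void deleteBroker() { /* ... */ }
    }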
Messaging (NaradaBrokering) Architecture
[Figure: typical use of Grid messaging in NASA: a Sensor Grid implemented using NB feeds a Datamining Grid and a GIS Grid through NaradaBrokering]
NaradaBrokering Management Needs
• The NaradaBrokering distributed messaging system consists of peers (brokers) that collectively form a scalable messaging substrate. Optimizations and configurations include:
  – Where should brokers be placed and how should they be connected? E.g. RING, BUS, TREE, HYPERCUBE etc.; each TOPOLOGY has varying resource-utilization, routing-cost and fault-tolerance characteristics
  – Static topologies, or topologies created using static rules, may be inefficient in some cases
    § E.g. in CAN and Chord, a new incoming peer joins random nodes in the network; network distances are not taken into account, and hence some lookup queries may span the entire diameter of the network
  – Runtime metrics provide dynamic hints on improving routing, which leads to redeployment of the messaging system, possibly using a different configuration and topology
  – Can use (dynamically) optimized protocols (UDP vs TCP vs Parallel TCP) and go through firewalls, but there is no good way to make these choices dynamically
Prototype: Costs (individual managees are NaradaBrokering brokers)

Time (msec, average values):

  Operation           Un-Initialized (first time)   Initialized (later modifications)
  Set Configuration   778 ± 5                       33 ± 3
  Create Broker       610 ± 6                       57 ± 2
  Create Link         160 ± 2                       27 ± 2
  Delete Link         104 ± 2                       20 ± 1
  Delete Broker       142 ± 1                       129 ± 2

Recovery Time
  = T(Read State From Registry) + T(Bring managee up to speed)
  = T(Read State) + T[SetConfig + Create Broker + CreateLink(s)]
(a sketch reproducing this arithmetic follows below)

Typical recovery times by topology (the number of managee-specific configuration entries determines the read-state term):

  Ring (N nodes, N links, i.e. 1 outgoing link per node; 2 managee objects per node):
    10 + (778.0 + 610.1 + 160.5) ≈ 1548 msec
  Cluster (N nodes, links per broker vary from 0 to 3; 1-4 managee objects per node):
    Max: 20 + 778.0 + 610.1 + 160.5 + 2 * 26.67 ≈ 1622 msec
    Min: 5 + 778.0 + 610.1 ≈ 1393 msec
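A small sketch reproducing the recovery-time arithmetic above; the constants come from the cost table, and the split of link cost into one first-time link plus cheaper subsequent links is an assumption consistent with the Max formula.

    // Recovery time model: read state, then SetConfig + CreateBroker + links.
    public class RecoveryTimeModel {
        static final double SET_CONFIG = 778.0;   // first-time cost, msec
        static final double CREATE_BROKER = 610.1;
        static final double FIRST_LINK = 160.5;
        static final double EXTRA_LINK = 26.67;   // later-modification cost

        static double recoveryMsec(double readStateMsec, int links) {
            double linkCost = links == 0 ? 0 : FIRST_LINK + (links - 1) * EXTRA_LINK;
            return readStateMsec + SET_CONFIG + CREATE_BROKER + linkCost;
        }

        public static void main(String[] args) {
            System.out.println(recoveryMsec(20, 3)); // cluster max, ~1622
            System.out.println(recoveryMsec(5, 0));  // cluster min, ~1393
        }
    }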
Prototype: Observed Recovery Cost per Managee

  Operation                       Average (msec)
  Spawn Process*                  2362 ± 18
  Read State                      8 ± 1
  Restore (1 Broker + 1 Link)     1421 ± 9
  Restore (1 Broker + 3 Links)    1616 ± 82

* The time for Create Broker depends on the number & type of transports opened by the broker; e.g. an SSL transport requires negotiation of keys and therefore takes more time than simply establishing a TCP connection.
Management Console
[Screenshots of the management console]
Conclusion
• We have presented a scalable, fault-tolerant management framework that
  – Adds acceptable cost in terms of extra resources required (about 1%)
  – Provides a general framework for management of distributed entities
  – Is compatible with existing Web Service specifications
• We have applied our framework to manage managees that are loosely coupled and have modest external state (important to improve the scalability of the management process)