connect • communicate • collaborate
perfSONAR MDM Deployment
PERT workshop, TNC2012
Szymon Trocha, Poznań Supercomputing and Networking Centre
Reykjavik, 21 May 2012
The research leading to these results has received funding from the European Community’s Seventh Framework Programme (FP7/2007-2013) under grant agreement no 238875 (GÉANT)
Credits to D. Vicinanza (DANTE) for parts of slides
Internet measurements
„Internet measurements is fun”
„We need to share our data”
N.Brownlee, „Measuring Internet Evolution - If We Don't Measure, We Don't Know What's Happening!”, APAN26
Agenda
perfSONAR MDM overview
Diagnostic Service
Diagnostic Service deployment
Diagnostic Service deployment scenarios
Deployment status
Development status and roadmap
Demonstration of installation and usage
perfSONAR MDM overview
What’s In A Name?
perfSONAR stands for “Performance Service Oriented Network Monitoring Architecture”
Four Aspects to perfSONAR
Architecture and protocols
Define web services based on roles
Define their communication syntax and semantics
– Protocols based on the Open Grid Forum (OGF) Network Measurement Working Group (NM-WG) schemas
Allow anyone to develop web service implementations
A set of interoperable software implementations
Java, Perl, Python etc.
A collaboration by many organisations
A deployed measurement infrastructure
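To make "communication syntax" a little more concrete, the sketch below builds a minimal echo-style request in the message / metadata / data layout used by NM-WG-based messages. It is an illustration only: the namespace URI and the echo event type are written from memory and should be checked against the NM-WG schemas, and the helper name `build_echo_request` and the element ids are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: NM-WG base namespace, written from memory; verify against the schemas.
NMWG_NS = "http://ggf.org/ns/nmwg/base/2.0/"
# Assumption: event type used for echo (ping-like) requests to a service.
ECHO_EVENT = "http://schemas.perfsonar.net/tools/admin/echo/2.0"


def _q(tag):
    """Qualify a tag name with the NM-WG namespace."""
    return f"{{{NMWG_NS}}}{tag}"


def build_echo_request(message_id="msg1"):
    """Return an echo-style request as an XML string (hypothetical helper)."""
    ET.register_namespace("nmwg", NMWG_NS)
    message = ET.Element(_q("message"), {"type": "EchoRequest", "id": message_id})
    # The metadata block says what is being asked for.
    metadata = ET.SubElement(message, _q("metadata"), {"id": "meta1"})
    ET.SubElement(metadata, _q("eventType")).text = ECHO_EVENT
    # The data block acts as the trigger and points back at its metadata.
    ET.SubElement(message, _q("data"), {"id": "data1", "metadataIdRef": "meta1"})
    return ET.tostring(message, encoding="unicode")


if __name__ == "__main__":
    print(build_echo_request())
```

In a real exchange such a message would travel inside a SOAP envelope to the service endpoint; that part is left out here.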
Across Network Boundaries
Performance data fragmented and hard to access Difficult to find measurement capability
Multi-domain problem diagnoses difficult and slow
[Figure: a path between two users crosses two campus networks, two national networks and a backbone, each run by its own network administrator. Key: X = locally held performance data]
perfSONAR MDM Diagnostic Service
What is the perfSONAR MDM service?
perfSONAR MDM (Multi-Domain Monitoring) is the multi-domain diagnostic service for the GÉANT Service Area
Part of the GÉANT portfolio
Based on the perfSONAR protocol
Interoperable with other deployments around the world
(from the perfsonar.geant.net website)
perfSONAR MDM Service
Access to a set of monitoring functionalities (e.g. accessing metrics or performing tests) offered to a group of users, accessible directly through an XML interface (perfSONAR protocol) or through visualisation tools.
Based on an underlying set of perfSONAR web services
[Figure: users reach the BWCTL MP, OWD MA and Lookup services of Domains A, B and C over perfSONAR SOAP XML + JRA5 AA, either through the GN2 visualisation or through their own visualisation tools]
A real answer to real use cases
Services built around real use cases:
Network performance monitoring
Circuit monitoring (both statically and dynamically provisioned)
SLA verification
Federated deployment facilitated by a dedicated deployment team
Each NREN provides at least one MP (and possibly an MA)
In turn, it gets access to the MPs and MAs of GÉANT and the other participating NRENs
Supported
Shared Responsibility
A domain has two roles
Data supplier
– Deploys the service instances
– Provides the required data and functionalities
– Administers and minimizes unavailability
Data user
– Uses the infrastructure, solves issues
– Updates operational procedures to take the MDM service into account
– Raises awareness internally
Simplicity for concrete benefits
Simplicity:
Revised list of metrics to meet network engineers’ requirements
No hardware-dependent software
No need for a GPS antenna
Running on Debian or RHEL
Benefits:
Immediate access to the complete picture:
– No more waiting for other NRENs to provide their network monitoring data that affects your users’ experience.
Improving multi-domain network performance troubleshooting
– Enabling consistent monitoring across multiple domains
– Tackling potential problems which may adversely impact the researchers’ voice, video or data communications
A new perfSONAR MDM:
Compatible, open, interoperable
Restructuring of perfSONAR MDM
Progressively endorsed by EU NRENs and other R&E networks in the world
Using direct user feedback to meet user expectations
perfSONAR User Panel to gather requirements and constantly listen to the user community
Simplifying installation procedure
Revised documentation (lightweight and modular)
Interoperable with perfSONAR-PS
Use-case scenario
See TNC session on …
Service deployment overview
Deployment process
The main goal is to deploy the perfSONAR MDM service on the GÉANT backbone and the connected NRENs
In each participating domain, the focus is on deploying measurement infrastructure at the domain boundaries (inter-domain), although domains are encouraged to extend deployment across their own infrastructure (intra-domain)
Starts with becoming a perfSONAR Early Adopter
Deployment team…
Is responsible for ensuring that the domain has all the means and tools it needs to fully deploy the components and operate the service
Follows the entire deployment cycle and provides the required support throughout the deployment process
The user is responsible for the day-to-day operation of the service and its components within their domain
Deployment process stages
Definition
• Use case presented
• Infrastructure evaluated
• Environment understood
• Initial plan ready
Deployment
• Deploy main components in the few main intra-domain paths and inter-domain boundaries
• Start developing monitoring
Transition
• Spread all components to most of the desired PoPs
• Monitoring fully deployed
• Visualisation tools deployed
• Tools integrated into the NOC and/or PERT
Operation
• Ensure all components are operational
• Ensure elements provide consistent delivery to the GÉANT Service Area
• Monitoring fully operational to operate the service
• Transfer to MDSD
Diagnostic service deployment scenarios
Service architecture
Service architecture cont.
perfSONAR MDM Service Architecture and Scenarios document
This document ensures common understanding of elements of the monitoring infrastructure and their configurations to provide consistent perfSONAR MDM monitoring service support and delivery to the GÉANT Service Area users by participating domains.
Where to find?
https://intranet.geant.net/sites/Services/SA2/T3/Documents/perfSONAR%20Deployment%20Subtask%20documents/perfSONAR%20MDM%20Service%20Architecture%20and%20Scenarios.pdf
Or easier
– GN3 SA2 T3 Documents -> perfSONAR Deployment Subtask documents -> perfSONAR Deployment Subtask documents.pdf
Early adopters deployment scenarios
3 tiers of the deployment are considered: Upper Tier (e.g. GEANT), Domain (e.g. NREN), Lower Tier (e.g. MAN or RAN)
The difference in deployment scenario depends on the scope of deployment within a domain and the depth of deployment in the hierarchy
Minimal case: a domain deploys MP(s) located near the domain’s interconnection points to upper tier domains (upstream) and enables on-demand measurements. It allows the upper tiers as well as the other domains at the same level to run on-demand multidomain measurements between their interconnection points
Recommended case: a domain in addition deploys MPs within its infrastructure, next to at least lower tier (downstream) interconnection points, neighboring domains and possibly also other points of interest (intra-domain)
Domains are encouraged to extend the basic scenario by enabling scheduled measurements between deployed MPs
Domain deploys a Measurement Archive (MA) to store data from intra-domain measurements
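The scenarios above can be read as a simple decision rule. The sketch below is one possible encoding, not part of the perfSONAR MDM software: the class `DomainDeployment`, its field names and the returned labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DomainDeployment:
    """Hypothetical summary of what a domain has deployed."""
    mps_at_upstream: int = 0             # MPs near interconnections to upper tier domains
    mps_at_downstream: int = 0           # MPs next to lower tier interconnection points
    scheduled_measurements: bool = False  # scheduled tests between deployed MPs
    has_measurement_archive: bool = False  # MA storing intra-domain measurement data


def classify(dep: DomainDeployment) -> str:
    """Map a deployment onto the scenario names used on the slide."""
    if dep.mps_at_upstream == 0:
        return "not deployed"
    if dep.mps_at_downstream == 0:
        # Only upstream MPs with on-demand measurements.
        return "minimal"
    if dep.scheduled_measurements and dep.has_measurement_archive:
        return "recommended, extended with scheduled measurements and MA"
    return "recommended"


if __name__ == "__main__":
    print(classify(DomainDeployment(mps_at_upstream=2, mps_at_downstream=3)))
```

Other points of interest and neighbouring domains are left out of the rule to keep it short.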
Tier model
Modified tier model
Comments on service architecture
perfSONAR MDM Product Manager
Domenico Vicinanza
GN2 SA2 T3 leader
Jan Hertzberg
Service architecture cont.
„perfSONAR MDM Service Configuration Recommendations” document
Recommends configuration of the proper elements of the monitoring infrastructure to provide consistent perfSONAR MDM monitoring service support and delivery to the GÉANT Service Area users by participating domains
Where to find?
http://downloads.perfsonar.eu/repositories/documents/perfSONAR%20MDM%20Service%20Configuration%20Recommendations.pdf
Or easier
– perfSONAR MDM Forge -> Documentation -> perfSONAR MDM Service Configuration Recommendations
Diagnostic Service software components
perfSONAR MDM Monitoring Service
Achievable bandwidth
One-way delay
Delay variation
One-way packet loss
Traceroute
On-demand
Scheduled
Visualization
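As a worked example of how three of these metrics relate, the sketch below derives one-way delay, delay variation and one-way loss from per-packet send and receive timestamps. It is a simplified illustration and not how HADES or any other perfSONAR component computes them: the function name `one_way_metrics` is hypothetical, clocks at both ends are assumed to be synchronised, and delay variation is taken as the absolute difference between the delays of consecutive received packets.

```python
from statistics import mean


def one_way_metrics(sent, received):
    """Compute simple one-way metrics.

    sent:     {sequence number: send time in seconds}
    received: {sequence number: receive time in seconds}
    Assumes sender and receiver clocks are synchronised.
    """
    # One-way delay for every packet that arrived, in sequence order.
    delays = [received[seq] - sent[seq] for seq in sorted(sent) if seq in received]
    # Delay variation: change in delay between consecutive received packets.
    variation = [abs(b - a) for a, b in zip(delays, delays[1:])]
    # One-way loss: share of sent packets that never arrived.
    loss = 1 - len(delays) / len(sent) if sent else 0.0
    return {
        "mean_owd": mean(delays) if delays else None,
        "mean_delay_variation": mean(variation) if variation else None,
        "loss_ratio": loss,
    }


if __name__ == "__main__":
    sent = {1: 0.0, 2: 1.0, 3: 2.0, 4: 3.0}
    received = {1: 0.010, 2: 1.012, 4: 3.011}  # packet 3 was lost
    print(one_way_metrics(sent, received))
```

Without synchronised clocks the delay values absorb the clock offset, which is why one-way measurements depend on good time sources at both ends.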
Deployment scope
BWCTL MP
HADES
HADES MA
SQL MA*
* 3.3 release
Current version(s) of perfsonarUI
Location of the tool reflects users’ privacy concerns
Public version for general users
– Access to MAs for link utilisation, one-way delay, jitter, packet loss and traceroute. Does not allow any on-demand bandwidth tests.
– perfsonar.geant.net -> Resources -> Stable version, for general users (perfsonar-0.22-pub.jnlp)
Partners’ version for NOC/PERT engineers
– Access to MAs for link utilisation, one-way delay, jitter, packet loss, traceroute, bandwidth. It allows on-demand bandwidth tests.
– GÉANT Partner Portal -> perfSONAR User Interface (perfsonar-0.22-prod.jnlp)
Partners’ experimental version for developers
– In addition includes Playground, Lookup Service Playground, Qflow MA client, Looking Glass and both LinkStatus and Links tabs.
– GÉANT Partner Portal -> perfSONAR User Interface (perfsonar-0.22-exp.jnlp)
LIVE DEMO
New version of perfsonarUI
Has all essential functionalities of the existing desktop perfsonarUI (RRD MA, HADES MA, SQL MA and BWCTL MP)
Developed as a web application
Allows the user to compare two interfaces (overlaid graphs and comparison of numerical values)
Compare two routes which were active in the selected time period
Service status check
TODO: configuration panel, which will allow the user to configure the UI to use a global service list, a local service list (kept in a DB) or a combination of both
Give it a try
Go to:
NOT PUBLICLY AVAILABLE
Hardware possibilities
Handover of existing servers
There are servers coming from GN/GN2
Used as HADES and BWCTL MP
Practical approach to handover
Acknowledge you’d like to manage them
Give us details
We will
– remove the servers from the central HADES MDM server and our configuration management repository
– delete the "labor" user
– deregister the servers from the DANTE Red Hat Enterprise Linux subscriptions
– add an SSH key securely supplied by the domain to the server's root account
– change the eRIC remote management super/admin account to a shared password
Handover of existing servers cont.
It will be the NREN's responsibility to (assuming you want to keep the existing installation):
change the root and eRIC card passwords
review the authorized SSH keys
register the installation with Red Hat under their own subscription
set up their own HADES domain (BWCTL will likely simply keep working)
or boot the system from an installation DVD and start fresh (e.g. choosing Debian)
– You then only need access to the eRIC remote management
– On how to reset a remote management card back to factory defaults, see https://forms.raritan.com/support/eric-express/user-guide/
Let us know if paperwork is needed
Reference hardware
DELL Server Hardware Compatibility Testing with HADES / BWCTL (by DFN-Labor)
Test System 1
DELL PowerEdge R415, AMD Opteron 4176HE, 8GB RAM, 250GB SATA disk
NICs: Internal Broadcom, Intel Gigabit ET Quad Port Cu PCIe x4, Myricom Myri-10G
Red Hat Enterprise Linux 6.2, 64 bit
Test System 2
DELL PowerEdge R710, Intel Xeon E5645, 8GB RAM, 250GB SATA disk
NICs, OS and measurement software: As above
OWAMP + BWCTL tests
There were some minor issues with delay on the R710 model, while no problems were noticed with the R415
R415 will be deployed over GÉANT
RIPE TTM servers
RIPE NCC announced the end of life for the Test Traffic Measurement service (TTM)
What does this mean?
They will not extend any existing TTM contracts
They will continue to operate TT measurements as long as necessary, but no later than 1 October 2012
On 1 October 2012, they will shut down the central server and website
If one wants to continue operating the test boxes, they will support that on a best-effort basis. In particular, they will provide the root password and best-effort parts replacement for clock hardware as long as they have spare parts
Lower tier domains/projects - use them for LHC Tier 2s
Deployment status
Early Deployers
HEAnet (IE), RedIRIS (ES), RENATER (FR), DFN (DE), PIONIER (PL), SWITCH (CH), BREN (BG), IUCC (IL), GRNET (GR), CYNET (CY), FCCN (PT), GARR (IT), Janet (UK)
GÉANT deployment example
8 sites
Geneva (CH), Paris (FR), Madrid (ES), Milan (IT), Amsterdam (NL), Frankfurt (DE)
London (GB), Vienna (AT) for Internet2 and ESNet interdomain measurements
Components
BWCTL MP
HADES
HADES MA
SQL MA
On-demand and scheduled
Includes early OWAMP functionality
GÉANT deployment example cont.
BWCTL MP (scheduled)
Development status
perfSONAR MDM latest deliveries (1/2)
RRD MA 3.4 in unstable repository
.deb and .rpm packages
The backend has changed to a relational database
To support the continuity of the service, tools are available that allow users to convert their old database to the new one
Serves all the requests as the previous one
Installation guide covers both the installation process and the conversion for both packages, and provides examples of using the new database
BWCTL MP 0.53-4 in unstable repository
Bug fixes reported after training
Internal performance improvements
Lookup Service 1.6 in unstable repository
Simplified WebAdmin
Internal review of code and improvements
Embedded eXist
perfSONAR MDM latest deliveries (2/2)
OWAMP controller (not released, internal testing)
Allows measurement requests to the HADES system over the OWAMP protocol
HADES appears as an OWAMP server
OWAMP MP 0.53 in unstable repository
Service to request on-demand OWAMP measurements
Data stored in the SQL MA
perfsonarUI (web-based)
Usability modifications
Service status window
Bug fixing
HADES new UI for configuration
Friendly interface for the installation and configuration of HADES.
The process of installation of HADES is easy to understand.
Reduction of the errors made in places where no changes are needed
Adjustment of the specific resources that are needed for a HADES installation
Other changes
Pending
Changes from perfSONAR MDM training comments
– Installation documentation improvement and clarifications
– BWCTL scheduler scripts
– Service configuration documentation improvements
– Fixing repositories for RHEL
– Fixing perfsonarUI
– Clarification of OWAMP components concept
Lookup service improvements
– Interoperability
– UI cooperation
Target supported operating systems: Debian 6, RHEL 5
perfSONAR MDM roadmap
perfSONAR MDM roadmap (1/2)
perfSONAR MDM roadmap (2/2)
Information sources
Multi-Domain Service Desk
The Multi-Domain Service Desk provides first level support for MDS related incidents raised by NREN NOC staff. The calls placed to the MDSD by NRENs are logged into the incident management system
If the calls can be resolved by the MDSD, the ticket will indicate the steps taken to resolve the incident.
MDSD provides first point of contact for perfSONAR MDM service users
Thank you
Live deployment of BWCTL MP
From stable repository
Debian 6
Prerequisites
Repository
Installation
Installation check
Tests
– Local script
– perfsonarUI
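For the "Installation check" step, one quick thing to verify from another host is that the measurement point accepts TCP connections on its control port. The sketch below is a generic reachability check, not a tool shipped with perfSONAR MDM: the function name `port_open` is hypothetical, and 4823 is assumed to be the default BWCTL control port, so adjust it if your configuration differs.

```python
import socket

# Assumption: default BWCTL control port; change it to match your configuration.
BWCTL_DEFAULT_PORT = 4823


def port_open(host, port=BWCTL_DEFAULT_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Example use (mp.example.org is a placeholder host name):
#   port_open("mp.example.org")
```

A successful connection only shows that something is listening on the port; an actual on-demand test through the local script or perfsonarUI is still needed to confirm that the MP works end to end.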