UVieCo – University of Vienna Condor Pool
2.2 UVieCo and High Throughput Computing using Condor
2.2.3.1 Firewall within Condor
Condor was originally designed to run in a network environment that is both ‘symmetric’ (i.e. one in which any machine can initiate a connection to any other machine) and free of restrictions on the types of network traffic (e.g. firewalls blocking UDP). In modern computing environments such an ‘open’ network is increasingly rare. It can therefore be quite difficult to deploy Condor in many current network environments because of firewalls, private networks (i.e. networks of machines with IP addresses in a specified range) and other circumstances that break the symmetry of the network [118]. Firewall issues related to the Condor system have been studied intensively by O’Donnell [119], Son et al. [120, 121], Lodygensky et al. [122], Beckles et al. [118], Calleja et al. [123] and Scherp et al. [124].
There are currently two main mechanisms for dealing with firewalls within Condor [125]: (1) restrict Condor to a specific range of port numbers and allow connections through the firewall on any port within that range, and (2) use Condor Connection Brokering (CCB) or Generic Connection Brokering (GCB). CCB allows Condor components to communicate with each other when one side is in a private network or behind a firewall. GCB is a system for managing network connections across private network and firewall boundaries. The functionality of GCB is currently being replaced by CCB, even though GCB provides communication between two different private networks whereas CCB only supports communication between nodes with one-directional connectivity. The main reasons why CCB is preferable are support for all platforms (including Windows), easier configuration and troubleshooting, and the ability to restart and reconfigure on the fly. Although GCB provides numerous advantages over restricting Condor to a range of ports that are then opened on the firewall, it is also a very complicated system, with major implications for Condor’s networking and security functionality.
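The difference between the two mechanisms can be illustrated with a small toy model. The sketch below is not Condor code: the `Host` class, the function names, the host names and the port numbers are all hypothetical, and the model only captures which side is able to accept an inbound connection. It assumes that the firewalled target already holds an outbound connection to a broker, so that the broker can ask it to connect back to the client.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Host:
    """A machine in a toy network model (all fields are illustrative)."""
    name: str
    firewalled: bool = False                      # True: inbound traffic is blocked by default
    open_ports: Optional[Tuple[int, int]] = None  # inclusive inbound range opened on the firewall


def accepts_inbound(host: Host, port: int) -> bool:
    """Mechanism 1: a firewalled host accepts inbound traffic only on opened ports."""
    if not host.firewalled:
        return True
    if host.open_ports is None:
        return False
    low, high = host.open_ports
    return low <= port <= high


def brokered_connect(client: Host, target: Host, port: int) -> str:
    """Mechanism 2: try a direct connection, then fall back to a reversed one.

    For simplicity the same port number is reused for the reversed connection.
    """
    if accepts_inbound(target, port):
        return "direct"
    if accepts_inbound(client, port):
        return "reversed"    # the target dials out to the client instead
    return "unreachable"     # neither side accepts inbound connections


# Hypothetical hosts and port numbers, chosen only for illustration.
submit = Host("submit.example.org")
worker_ranged = Host("worker1", firewalled=True, open_ports=(9600, 9700))
worker_closed = Host("worker2", firewalled=True)
submit_closed = Host("laptop", firewalled=True)

print(brokered_connect(submit, worker_ranged, 9618))         # direct
print(brokered_connect(submit, worker_ranged, 5000))         # reversed
print(brokered_connect(submit, worker_closed, 9618))         # reversed
print(brokered_connect(submit_closed, worker_closed, 9618))  # unreachable
```

The last case corresponds to two nodes that are both behind firewalls: a reversed connection needs at least one side to accept inbound traffic, which is the one-directional connectivity requirement attributed to CCB above.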
O’Donnell was the first to address Condor’s inability to function through a firewall, in the context of the Wisconsin Computer Science network (cs.wisc.edu) [119]. He considered two approaches, writing a custom proxy server and using an existing standard proxy system, bearing in mind that the best solution should not negate the original purpose of a firewall, i.e. security. He finally settled on the SOCKS proxy system, which at that time provided security, auditing, management, fault tolerance, and alarm notification.
Son and Livny [120] observed that in grid computing, pools of hundreds or thousands of machines do not necessarily have world-addressable IP addresses, and that the administrators of those pools prefer a private network configuration because it makes their clusters easier to manage and reduces cost: only a few public IP addresses are needed for the head nodes instead of hundreds or thousands. According to them, these private networks and firewalls damage Internet connectivity, making it asymmetric and making peer-to-peer computing difficult or even impossible. They related this problem to the Condor system and proposed two different approaches, DPF (Dynamic Port Forwarding) and GCB (Generic Connection Brokering), which have different characteristics in terms of clusters supported, security, and performance, and suggested that users choose between them depending on their policies and situations.
In another approach, Son et al. [121] presented a middleware firewall traversal system called CODO (Cooperative On-Demand Opening), which provides applications with secure end-to-end connectivity across firewalls/NATs: applications authorized through strong security mechanisms are allowed to traverse the firewall/NAT and communicate through it, while unauthorized applications are blocked.
Lodygensky et al. proposed a lightweight grid solution for the deployment of multi-parameter applications, using the XtremWeb coordinator to solve problems related to domain administration and firewalls when connecting different Condor pools [122]. They demonstrated the usefulness of this approach by measuring the performance of a multi-parameter biochemistry application deployed on two sites: the University of Wisconsin-Madison and Paris South University.
Beckles et al. [118] raised several issues related to Condor’s pattern of network communication, such as the roles of machines, the direction of network communication, network protocols and port usage, administrative overhead, private firewalls, inadequate documentation, and unresolved bugs relating to network communication. They then explained why these are unfriendly to firewalls and private networks, and described the solutions and techniques that have been developed to address or mitigate these problems. These include mitigating the effects of firewalls; altering the pattern of network communication, e.g. reducing it to ‘few-to-many’ or even ‘one-to-many’; and firewall/NAT traversal, i.e. traversing the security boundary, using Generic Connection Brokering (GCB) and Dynamic Port Forwarding (DPF).
Scherp et al. [124] created a grid infrastructure, WISENT, using the Globus Toolkit and Condor to handle the large heterogeneous datasets generated in energy meteorology research. Because the partner sites are at six different locations, construction of the grid infrastructure was hindered by strict firewall policies and the use of network address translation (NAT). To tackle this problem, they considered possible solutions such as an extra grid zone, a virtual private network (VPN) tunnel established with each external project partner, and an application-level gateway (ALG), and arrived at an approach that uses a connection broker, which can perform hole punching to traverse firewalls and NAT systems.
Similarly, Calleja et al. [123] implemented an experimental solution by constructing a dedicated virtual private network (VPN) and flocking small Condor pools across it. They reported that the VPN made it possible to tunnel through departmental firewalls and to encrypt traffic across interdepartmental links, allowing nodes with private IP addresses to join a grid that crosses institutional boundaries. Jobs migrated seamlessly across the flocked Condor pools, and there was no noticeable degradation in performance due to the overhead of running across the VPN.