Oracle® Grid Engine
Installation and Upgrade Guide
Release 6.2 Update 7
E21973-02
Copyright © 2000, 2012, Oracle and/or its affiliates. All rights reserved.
Primary Author: Uma Shankar
Contributor: Andy Schwierskott
This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform, publish, or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is prohibited.
The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please report them to us in writing.
If this is software or related documentation that is delivered to the U.S. Government or anyone licensing it on behalf of the U.S. Government, the following notice is applicable:
U.S. GOVERNMENT RIGHTS Programs, software, databases, and related documentation and technical data delivered to U.S. Government customers are "commercial computer software" or "commercial technical data" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As such, the use, duplication, disclosure, modification, and adaptation shall be subject to the restrictions and license terms set forth in the applicable Government contract, and, to the extent applicable by the terms of the Government contract, the additional rights set forth in FAR 52.227-19, Commercial Computer Software License (December 2007). Oracle America, Inc., 500 Oracle Parkway, Redwood City, CA 94065.
This software or hardware is developed for general use in a variety of information management applications. It is not developed or intended for use in any inherently dangerous applications, including applications that may create a risk of personal injury. If you use this software or hardware in dangerous applications, then you shall be responsible to take all appropriate fail-safe, backup, redundancy, and other measures to ensure its safe use. Oracle Corporation and its affiliates disclaim any liability for any damages caused by use of this software or hardware in dangerous applications.
Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.
Intel and Intel Xeon are trademarks or registered trademarks of Intel Corporation. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. AMD, Opteron, the AMD logo, and the AMD Opteron logo are trademarks or registered trademarks of Advanced Micro Devices. UNIX is a registered trademark of The Open Group.
This software or hardware and documentation may provide access to or information on content, products, and services from third parties. Oracle Corporation and its affiliates are not responsible for and expressly disclaim all warranties of any kind with respect to third-party content, products, and services. Oracle Corporation and its affiliates will not be responsible for any loss, costs, or damages incurred due to your access to or use of third-party content, products, or services.
Contents
Preface
  Audience
  Documentation Accessibility
  Related Documents
  Conventions
1 Planning the Installation
  System Requirements
  Disk Space Requirements
  Supported Operating Platforms
  Planning Checklist
  Cluster Design
  Cells
  Cluster Name
  Queue Structure
  Host System Requirements
  Master Host
  Shadow Master Hosts
  Execution Hosts
  Administration Hosts
  Submit Hosts
  User Account Considerations
  User Names
  Installation Accounts
  File Access Permissions
  Network Services
  Installation Methods
  Directory Organization
  Spool Directories under the Root Directory
  Choosing Between Classic Spooling and Database Spooling
  $SGE_ROOT Directory
  Spooling Options
  Database Server and Spooling Host
  Scheduler Profiles
  Getting the Software
  Electronic Download
  CD-ROM Distribution
2 Installing Grid Engine
  Loading the Distribution Files on a Workstation
  How to Load the Distribution Files on a Workstation
  pkgadd Method
  tar Method
  Installing the Software With the GUI Installer
  Requirements
  Express Installation
  Using the Express Installation Mode
  Custom Installation
  Using the Custom Installation Mode
  How to Configure Password-less Access for the root User
  Configuring Password-less ssh Access for the root User
  Configuring Password-less rsh Access for the root User
  Understanding Host and Installation States
  Host Resolving
  Host Validating
  Installation States
  Tweaking start_gui_installer
  Description of start_gui_installer Options
  Using start_gui_installer Options
  Installing as a Different connect_user
  Installing Single Windows Execution Host
  Installing Multiple Windows Execution Hosts
  Troubleshooting the GUI Installer
  FAQs
  Known issues and workarounds
  Installing the Software From the Command Line
  Installation Overview
  Performing an Installation
  How to Install the Master Host
  Installing the Master Host
  Example Master Host Installation
  How to Install Shadow Master Host
  Starting a Shadow Master Host Manually
  Configuring Shadow Master Host Environment Variables
  Example Shadow Master Host Installation
  How to Install Execution Hosts
  Example Execution Host Installation
  How to Register Administration Hosts
  How to Register Submit Hosts
  How to Install the Berkeley DB Spooling Server
  Installing the Increased Security Features
  Why Install the Increased Security Features?
  Additional Setup Required
  How to Install a CSP-Secured System
  How to Generate Certificates and Private Keys for Users
  How to Renew Certificates
  How to Check Certificates
  Displaying a Certificate
  Check Issuer
  Check Subject
  Show Email of Certificate
  Show Validity
  Show Fingerprint
  Verifying the Installation
  How to Verify That the Daemon is Running on the Master Host
  How to Verify That the Daemons Are Running on the Execution Hosts
  How to Run Simple Commands
  How to Submit Test Jobs
  Automating the Installation Process
  Automatic Installation
  Special Considerations
  Using the inst_sge Utility and a Configuration Template
  How to Automate Installation With Increased Security (CSP)
  How to Automate Other Installations Through a Configuration File
  How to Automate the Master Host Installation
  Automating Other Installations Through a Configuration File
  Automatic Uninstallation
  How to Uninstall Execution Hosts Automatically
  How to Uninstall the Master Host Automatically
  How to Uninstall the Shadow Master Host
  How to Start the Automatic Backup
  Troubleshooting Automatic Installation and Uninstallation
  Installing SMF Services
  Why Install SMF Services?
  Additional Setup Required
  How Do SMF Services Compare to the Normal Services?
  qmaster Daemon
  shadowd Daemon
  execd Daemon
  Berkeley RPC Server
  dbwriter Software
  Installing a JMX-Enabled System
  Additional Setup Required
  How to Install a JMX Agent-Enabled System
  How to Generate Certificates, Private Keys and Keystores for Users
  How to Check Certificates, Private Keys and Keystores for Users
  JMX Configuration Files
  jaas.config
  java.policy
  management.properties
  jmx.access
  jmx.password
  logging.properties
  Testing and Troubleshooting
  Removing the Software
  How to Remove the Software Interactively
  How to Remove the Software Using the inst_sge Utility and a Configuration Template
  Additional Software for the Microsoft Operating System
  Additional Software
  Microsoft Services for UNIX
  Unsupported Grid Engine Functionality
  Configuring User Name Mapping
  How to Install Microsoft Services for Unix
  System Requirements
  Services for UNIX Installation
  Post SFU Installation Tasks
  Troubleshooting SFU
  Microsoft Subsystem for UNIX-based Applications
  Unsupported Grid Engine Functionality
  How to Install a Microsoft Subsystem for UNIX-based Applications
  System Requirements
  Installing Subsystem for UNIX-based Applications
  Post Installation Tasks
  Troubleshooting Microsoft Subsystem for UNIX-based Applications
  Changing Default Behavior to Case Sensitivity
  Disabling DEP
  How to Disable DEP for Windows XP Professional, Windows Server 2000 and Windows Server 2003
  How to Disable DEP for Windows Vista (Enterprise and Ultimate) and Windows Server 2008
  Enabling suid Behavior for Interix Programs
  User Management on Windows Hosts
  Managing Users on Windows Hosts
  Windows User Example
  UNIX User Management
  Using Grid Engine in a Microsoft Windows Environment
  Registering Windows User Passwords
  Using the sgepasswd Command
  Adding Windows Hosts to Existing Grid Engine Systems
  How to Add Windows Hosts Later
  Other Installation Issues
  How to Verify and Install Linux Motif Libraries
  How to Install the Software on a System with IPMP
  What Is IP Multipathing?
  Issues Between IPMP and Grid Engine
  Installing the Grid Engine Master Node With IPMP
  Ignoring the Error Messages
  Temporarily Disabling IPMP
  Installing a Grid Engine on an Execution Host With IPMP
  Enabling Administrative and Submit Hosts With IPMP
3 Upgrading Grid Engine
  About Upgrading the Software
  Before You Upgrade
  Constraints
  How to Back Up the Configuration of the Old Cluster
  What the Backup Contains
  How to Back Up the Cluster
  How to Install the 6.2 Software Using the Cloned Configuration Method
  Additional Constraints for the New 6.2 Installation with Cloned Configuration
  Example Upgrade for Cloned Cluster Configuration
  How to Upgrade the Original Cluster to the 6.2 Software (Real Upgrade)
  How to Upgrade from 5.3 to 6.0
A Configuration File Templates
Preface
Oracle Grid Engine Installation and Upgrade Guide describes how to install Grid Engine and how to upgrade Grid Engine from a previous version.
Audience
This document is intended for system administrators who install or upgrade Oracle Grid Engine.
Documentation Accessibility
For information about Oracle's commitment to accessibility, visit the Oracle Accessibility Program website at http://www.oracle.com/pls/topic/lookup?ctx=acc&id=docacc.
Access to Oracle Support
Oracle customers have access to electronic support through My Oracle Support. For information, visit http://www.oracle.com/pls/topic/lookup?ctx=acc&id=info or visit http://www.oracle.com/pls/topic/lookup?ctx=acc&id=trs if you are hearing impaired.
Related Documents
For more information, see the following documents in the Oracle Grid Engine Release 6.2 documentation set:
■ Oracle Grid Engine Release Notes
■ Oracle Grid Engine User Guide
■ Oracle Grid Engine Administration Guide
Conventions
The following text conventions are used in this document:
Convention   Meaning
italic       Italic type indicates book titles, emphasis, or placeholder variables for which you supply particular values.
monospace    Monospace type indicates commands within a paragraph, URLs, code in examples, text that appears on the screen, or text that you enter.
1 Planning the Installation
Before you install the Grid Engine software, you must plan how to achieve the results that fit your environment. This section helps you make the decisions that affect the rest of the procedure.
System Requirements
To verify that the systems on which you intend to install Grid Engine conform to required hardware and software specifications, review the system requirements listed below.
Disk Space Requirements
The Grid Engine software directory tree has the following fixed disk space requirements:
■ 50 Mbytes for the installation files without any binaries
■ Between 60 and 100 Mbytes for each set of binaries
The ideal disk space for Grid Engine system spool directories is as follows:
■ 50-200 Mbytes for the master host spool directories
■ 50-200 Mbytes for the Berkeley DB spool directories
The spool directories of the master host and of the execution hosts are configurable and need not reside under the default location, sge-root.
Note: You must satisfy several Windows platform-specific prerequisites before you can install Grid Engine on hosts that are running the Windows operating system. You might need to install additional software on your computer which might require additional disk space. See Microsoft Services for UNIX and Microsoft Subsystem for UNIX-based Applications.
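The fixed requirements for the software tree above can be turned into a quick estimate. The following is a minimal sketch that uses only the figures stated in this section; estimate_software_mb is a hypothetical helper name, not a Grid Engine command, and the result excludes the spool directories and any additional Windows software.

```sh
#!/bin/sh
# Hypothetical helper (not part of Grid Engine): estimate the disk space
# needed for the software directory tree, using the figures in this section.
estimate_software_mb() {
    sets=$1                       # number of binary sets to install
    base=50                       # Mbytes for installation files without binaries
    min=$((base + 60 * sets))     # each set of binaries needs at least 60 Mbytes
    max=$((base + 100 * sets))    # and at most 100 Mbytes
    echo "$min $max"
}

# Example: two sets of binaries
estimate_software_mb 2            # prints "170 250"
```

Add 50-200 Mbytes for the master host spool directories, and the same again for the Berkeley DB spool directories if you plan to use them.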
Supported Operating Platforms
The Grid Engine 6.2 software supports the following operating systems and platforms:
Master Host
■ Solaris 11, 10, 9, and 8 Operating Systems (SPARC Platform Edition)
■ Solaris 9 Operating System (x86 Platform Edition)
■ Solaris 11 and 10 Operating Systems (x64 Platform Edition)
■ Linux x86, kernel 2.4 and higher, glibc >= 2.3.2
■ Linux x64, kernel 2.4 and higher, glibc >= 2.3.2
Compute Host
■ Solaris 11, 10, 9, and 8 Operating Systems (SPARC Platform Edition)
■ Solaris 9 Operating System (x86 Platform Edition)
■ Solaris 11 and 10 Operating Systems (x64 Platform Edition)
■ Linux x86, kernel 2.4 and higher, glibc >= 2.3.2
■ Linux x64, kernel 2.4 and higher, glibc >= 2.3.2
■ Linux IA64, kernel 2.4, 2.6, glibc >= 2.3.2
■ Apple Mac OS X 10.4 (Tiger), PPC platform
■ Apple Mac OS X 10.4 (Tiger), x86 platform
■ Apple Mac OS X 10.5 (Leopard), x86 platform
■ Hewlett Packard HP-UX 11.00 or higher, 32 bit
■ Hewlett Packard HP-UX 11.00 or higher, 64 bit (including HP-UX on IA64)
■ IBM AIX 5.1, 5.3, 6.1
■ Microsoft Windows:
  ■ Server 2003
  ■ XP Professional with at least Service Pack 1
  ■ 2000 Server with at least Service Pack 3
  ■ 2000 Professional with at least Service Pack 3
  ■ Server 2003 Release 2
  ■ Server 2008
  ■ Vista Enterprise
  ■ Vista Ultimate
Planning Checklist
Before you install the Grid Engine software, you must plan how to achieve the results that fit your environment. This section helps you make the decisions that affect the rest of the procedure. Write down your installation plan in a table similar to the following example.
If you are going to install Grid Engine 6.2 on Microsoft Windows Server 2003, Windows XP Professional with at least Service Pack 1, Windows 2000 Server with at least Service Pack 3, or Windows 2000 Professional with at least Service Pack 3, acquire and install Microsoft Services For UNIX. See Microsoft Services for UNIX for more information.
If you are going to install Grid Engine 6.2 on Microsoft Windows Server 2003 Release 2, Windows Server 2008, Windows Vista Enterprise or Windows Vista Ultimate, acquire and install Microsoft Subsystem for UNIX-based Applications. See Microsoft Subsystem for UNIX-based Applications for more information.
If you are going to install Grid Engine 6.2 on a Windows system, create the required Certificate Security Protocol (CSP) certificates before installing Grid Engine. See How to Install a CSP-Secured System for information about CSP certificates.
Check Other Installation Issues for applicability.
Table 1–1 Planning Checklist
Parameter                                                          Value
$SGE_ROOT directory                                                ___________________________
Cell name                                                          ___________________________
$SGE_CLUSTER_NAME                                                  ___________________________
Administrative user                                                ___________________________
sge_qmaster port number (6444 is recommended)                      ___________________________
sge_execd port number (6445 is recommended)                        ___________________________
Master host                                                        ___________________________
Shadow master hosts                                                ___________________________
Execution hosts                                                    ___________________________
Spooling for each execution host (global or local)                 ___________________________
Windows execution hosts (yes or no)                                ___________________________
Administration hosts                                               ___________________________
Submit hosts                                                       ___________________________
Group ID range for jobs                                            ___________________________
Spooling mechanism (Berkeley DB or Classic spooling)               ___________________________
Berkeley DB server host (the master or another host)               ___________________________
Berkeley DB spooling directory on the database server              ___________________________
Scheduler tuning profile (Normal, High, Max)                       ___________________________
Installation method (interactive, secure, automated, or upgrade)   ___________________________
Cluster Design
Cells
You can set up the Grid Engine system as a single cluster or as a collection of loosely coupled clusters called cells. The $SGE_CELL environment variable indicates the cluster being referenced. When the Grid Engine system is installed as a single cluster, $SGE_CELL is not set, and the value default is assumed for the cell value.
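Scripts that work with more than one cell usually need the same fallback rule. The following is a minimal sketch of it; effective_cell and cell_common_dir are hypothetical helper names, and the common directory layout is the $SGE_ROOT/$SGE_CELL/common path used elsewhere in this guide.

```sh
#!/bin/sh
# Hypothetical helpers (not part of Grid Engine).

# Print the cell name in effect: an unset or empty SGE_CELL means "default".
effective_cell() {
    echo "${SGE_CELL:-default}"
}

# Print the common directory of the current cell below the given root directory.
cell_common_dir() {
    echo "$1/$(effective_cell)/common"
}
```

With $SGE_CELL unset, cell_common_dir /opt/sge yields /opt/sge/default/common. The path /opt/sge is only an example root directory.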
Cluster Name
The $SGE_CLUSTER_NAME environment variable supports unique naming of the cluster. Unlike the $SGE_CELL variable, there are restrictions on $SGE_CLUSTER_NAME. If you decide to use Grid Engine SMF services on Solaris 10 or later hosts, you must select a new $SGE_CLUSTER_NAME. This name becomes part of the name of the Grid Engine SMF services. The $SGE_CLUSTER_NAME is also used to distinguish multiple rc files for different clusters.
Note: If your $SGE_CELL name already reflects the desired cluster name and also satisfies $SGE_CLUSTER_NAME restrictions, set the cluster name to the $SGE_CELL value. Otherwise, the proposed default value is p$SGE_QMASTER_PORT (the letter p followed by the qmaster port number, for example p6444), which uniquely identifies the running cluster by the port on which its qmaster daemon is running. See Installing SMF Services for more information.
Queue Structure
The installation procedure creates a default cluster queue structure, which is suitable for getting acquainted with the system. The default queue can be removed after installation.
Note: No matter what directory is used for the installation of the software, the administrator can change most settings that were created by the installation procedure. This change can be made while the system is running.
Consider the following when determining a queue structure:
■ Whether you need cluster queues for sequential, interactive, parallel, and other job types
■ Which queue instances to put on which execution hosts
■ How many job slots are needed in each queue
For more detailed information on administering cluster queues, see Oracle Grid Engine Administration Guide.
Host System Requirements
Master Host
The master host controls the Grid Engine system. This host runs the master daemon sge_qmaster.
The master host must comply with the following requirements:
■ The host must be a stable platform.
■ The host must not be excessively busy with other processing.
■ At least 60 to 120 Mbytes of unused main memory must be available to run the Grid Engine system daemons. For very large clusters that include many hundreds or thousands of hosts and tens of thousands of jobs in the system at any time, 1 GByte or more of unused main memory might be required and 2 CPUs might be beneficial.
■ The master host must be installed before shadow master, execution, administration, or submit hosts.
■ (Optional) The Grid Engine software directory, $SGE_ROOT, should be installed locally to cut down on network traffic.
For more information, see How to Install the Master Host.
Shadow Master Hosts
These hosts back up the functionality of sge_qmaster in case the master host or the master daemon fails. To be a shadow master host, a machine must have the following characteristics:
■ It must run sge_shadowd.
■ It must share sge_qmaster status, job information, and queue configuration information that is logged to disk. In particular, the shadow master hosts need read/write root or administration user access to the sge_qmaster spool directory and to the $SGE_ROOT/$SGE_CELL/common directory.
■ The $SGE_ROOT/$SGE_CELL/common/shadow_masters file must contain a line defining the host as a shadow master host.
The shadow master host facility is activated for a host as soon as these conditions are met. You do not need to restart the Grid Engine system daemons to make a host into a shadow master host.
For more information, see How to Install Shadow Master Host.
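The third condition can be checked from a script. The following is a minimal sketch, assuming the shadow_masters file lists one host name per line and that the name is spelled exactly as it appears in the file (short versus fully qualified names are not reconciled). is_shadow_master is a hypothetical helper, not a Grid Engine command.

```sh
#!/bin/sh
# Hypothetical helper (not part of Grid Engine).
# is_shadow_master HOST FILE
#   Succeeds if FILE is readable and contains a line that is exactly HOST.
#   Assumes one host name per line in FILE.
is_shadow_master() {
    host=$1
    file=$2
    [ -r "$file" ] && grep -Fxq -- "$host" "$file"
}

# Typical call, with the file location taken from the list above:
#   is_shadow_master "$(hostname)" "$SGE_ROOT/${SGE_CELL:-default}/common/shadow_masters"
```

The check covers only the file entry. It does not verify that sge_shadowd is running or that the host can write to the sge_qmaster spool directory.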
Execution Hosts
Execution hosts run the jobs that users submit to the Grid Engine system. An execution host must first be set up as an administration host. You run an installation script on each execution host. For more information, see How to Install Execution Hosts.
Note: Windows hosts cannot act as master hosts.
Note: If no cell name is specified during installation, the value of $SGE_CELL is default.
Administration Hosts
Operators and managers of the Grid Engine system use administration hosts to perform administrative tasks such as reconfiguring queues or adding Grid Engine users.
The master host installation script automatically makes the master host an administration host. During the master host installation process, you can add other administration hosts. You can also manually add administration hosts on the master host at any time after installation.
Submit Hosts
Jobs can be submitted and controlled from submit hosts. The master host installation script automatically makes the master host a submit host.
User Account Considerations
User Names
For the Grid Engine system to verify that users submitting jobs have permission to submit them on the desired execution hosts, users' names must be identical on the submit and execution hosts. You might therefore have to change user names on some machines, because Grid Engine user names map directly to system user accounts.
Note: User names on the master host are not relevant for permission checking. These user names do not have to match or even exist.
Installation Accounts
You can install the Grid Engine software either as the root user or as an unprivileged user, for example, your own user account. However, if you install the software when you are logged in as an unprivileged user, the installation allows only that user to run Grid Engine jobs. Access is denied to all other accounts. Installing the software when you are logged in as root resolves this restriction. However, root permission is required for the complete installation procedure. Also, if you install as an unprivileged user, you are not allowed to use the qrsh, qtcsh, or qmake commands, nor can you run tightly integrated parallel jobs.
To use SMF on Solaris 10 or later hosts and run the Grid Engine software as an unprivileged user, perform the following additional steps as root user (or user with appropriate permissions):
1. Create the new role sgeadmin for the local user:
   roleadd -c "Grid Engine SMF Administrator" -g <group> -d <home_dir> -u <UID> -s <profile_shell> -P "solaris.smf.manage.sge" "sgeadmin"
2. Assign the just-created role sgeadmin to the user:
   usermod -R "sgeadmin" <login>
For a distributed name service, such as NIS, NIS+, or LDAP, create the new role sgeadmin and assign it to the user:
   /usr/sadm/bin/smrole add -D <domain_name> -- -n "sgeadmin" -a "normal_user" -d <home_dir> -c "Grid Engine SMF Administrator" -p "solaris.smf.manage.sge"
File Access Permissions
If you install the software logged in as root, you might have a problem configuring root read/write access for all hosts on a shared file system. Therefore, you might have problems putting the $SGE_ROOT files onto a network-wide file system.
You can force Grid Engine software to run all Grid Engine system components through a non-root administrative user account, for example sgeadmin. With this setup, this particular user needs only read/write access to the shared $SGE_ROOT file system.
The installation procedure asks whether files should be created and owned by an administrative user account. If you answer "Yes" and provide a valid user name, files are created by this user. Otherwise, the user name under which you run the installation procedure is used. Create an administrative user, and answer "Yes" to this question.
Make sure in all cases that the account used for file handling on all hosts has read/write access to the $SGE_ROOT directory. Also, the installation procedure assumes that the host from which you access the Grid Engine software distribution media can write to the $SGE_ROOT directory.
If your Windows host is a member of a Windows domain, only the local Administrator is the root user. Neither the members of the Administrators group, nor the domain Administrator, nor a member of the Domain Admins group is the root user. See User Management on Windows Hosts for more information about users on Windows hosts.
Network Services
Determine whether your site's network services are defined in an NIS database or in an /etc/services file that is local to each workstation. If your site uses NIS, determine the host name of your NIS server so that you can add entries to the NIS services map. The Grid Engine system services are sge_execd and sge_qmaster. To add the services to your NIS map, choose reserved, unused port numbers. The following examples show sge_qmaster and sge_execd entries.
sge_qmaster 6444/tcp
sge_execd 6445/tcp
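Before installing, it can help to confirm that both entries are actually present on a host. The following is a minimal sketch for the local-file case, assuming the usual services file layout of a name followed by port/protocol; service_port is a hypothetical helper name.

```sh
#!/bin/sh
# Hypothetical helper (not part of Grid Engine).
# service_port NAME [FILE]
#   Prints the TCP port registered for NAME in a services-format file
#   (default /etc/services) and fails if there is no such entry.
service_port() {
    name=$1
    file=${2:-/etc/services}
    awk -v name="$name" '
        { sub(/#.*/, "") }                 # drop comments
        $1 == name && $2 ~ /\/tcp$/ {      # match "NAME port/tcp"
            split($2, parts, "/")
            print parts[1]
            found = 1
            exit
        }
        END { exit found ? 0 : 1 }
    ' "$file"
}

# Report any Grid Engine service that is missing from the local file.
for svc in sge_qmaster sge_execd; do
    service_port "$svc" >/dev/null 2>&1 || echo "no tcp entry for $svc in /etc/services"
done
```

This sketch reads only a local file. Where the services map comes from NIS, a lookup through the name service switch, for example getent services sge_qmaster on systems that provide getent, is the closer equivalent.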
Note: The name of the root user on Windows hosts depends on the system language of the Windows operating system. You can even change the name of the root user. The default name for many languages is the name Administrator.
Installation Methods
Several methods are available for installing the Grid Engine software:
■ Interactive
■ Interactive, with increased security
■ Automated, using the inst_sge script and a configuration file
■ Upgrade
To decide which installation method you should use, consider the following factors.
■ Do you already have the Grid Engine software installed and running?
  ■ If so, you will probably want to upgrade. The upgrade process is described in Upgrading Grid Engine.
  ■ If not, the master host installation is only done once. The master host is typically installed interactively, as described in Installing the Software With the GUI Installer or Installing the Software From the Command Line.
■ Do you need to install just a few execution hosts? If so, then you will probably want to install them interactively, as described in Installing the Software With the GUI Installer or Installing the Software From the Command Line.
■ Do you need to install a large number of execution hosts? If so, then you might want to perform automated installation, using the inst_sge script and a configuration file. See Using the inst_sge Utility and a Configuration Template.
■ Do you require your grid to use encryption? If so, you have to perform an interactive installation with increased security. See Installing the Increased Security Features.
Directory Organization
When determining the directory organization, you must decide the following:
■ The directory organization, for example, whether you will install a complete software tree on each workstation, cross-mounted directories, or a partial directory tree on some workstations.
■ Where to locate each $SGE_ROOT root directory.
By default, the installation procedure installs the Grid Engine software, man pages, spool areas, and the configuration files in a directory hierarchy under the installation directory as shown in the following figure. If you accept this default behavior, you should install or select a directory with the access permissions that are described in File Access Permissions.
Figure 1–1 Sample Directory Hierarchy
You can choose to put the spool areas in other locations during the primary installation. See Oracle Grid Engine Administration Guide for more detailed instructions about configuring queues.
Note: Because changing the installation directory or the spool directories requires a new installation of the system, use extra care to select a suitable installation directory. You can preserve all important information from a previous installation.
Spool Directories under the Root Directory
During the installation of the master host, you must specify the location of a spooling directory. This directory is used to spool jobs from execution hosts that do not have a local spooling directory.
■ On the master host, spool directories are maintained under qmaster-spool-dir. The location of qmaster-spool-dir is defined during the master host installation process. The default value of qmaster-spool-dir is $SGE_ROOT/$SGE_CELL/spool/qmaster.
■ On each execution host, a spool directory called execd-spool-dir is defined during the execution host installation process. The default value of execd-spool-dir is $SGE_ROOT/$SGE_CELL/spool/exec-host. Execution hosts with local spooling directories perform better than execution hosts that NFS-mount the master host's spooling directory.
Note: If you are using a Windows execution host, you must use the local spooling directory.
You do not need to export these directories to other machines. However, exporting the entire $SGE_ROOT tree and making it write-accessible for the master host and all execution hosts makes administration easier.
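To see where the defaults would land for a given root directory, the two default values can be expanded as follows. This is a minimal sketch under two assumptions: an unset $SGE_CELL falls back to default, and exec-host in the default path is a placeholder that is filled in with the execution host's name. Both function names are hypothetical.

```sh
#!/bin/sh
# Hypothetical helpers (not part of Grid Engine). They only print paths
# and do not create or inspect anything on disk.

# default_qmaster_spool_dir ROOT
default_qmaster_spool_dir() {
    echo "$1/${SGE_CELL:-default}/spool/qmaster"
}

# default_execd_spool_dir ROOT HOST
#   HOST stands in for the exec-host placeholder (an assumption of this sketch).
default_execd_spool_dir() {
    echo "$1/${SGE_CELL:-default}/spool/$2"
}
```

For example, with $SGE_CELL unset, default_qmaster_spool_dir /opt/sge expands to /opt/sge/default/spool/qmaster. The root /opt/sge is only an illustration.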
Choosing Between Classic Spooling and Database Spooling
During the installation, you are given the option to choose between classic spooling and Berkeley DB spooling. If you choose Berkeley DB spooling, you are then given the option to spool to a local directory or to a separate host, known as a Berkeley DB spooling server.
Using a Berkeley DB spooling server might provide better performance than classic spooling. Part of this performance increase is because the master host can make non-blocking writes to the database, but has to make blocking writes to the text file used by classic spooling. Also consider file format and data integrity. Writing to the Berkeley DB provides a greater level of data integrity than writing to a text file. However, a text file stores data in a format that you can read and edit. Normally, you do not need to read these files, but the spooling directory contains the messages from the system daemons, which can be useful for debugging.
$SGE_ROOT Directory
You must create a directory into which to load the contents of the distribution media. This directory is called the root directory, or $SGE_ROOT. When the Grid Engine system is running, this directory stores the current cluster configuration and all other data that must be spooled to disk.
Use a valid path name for the directory that is network-accessible on all hosts. For example, if the file system is mounted using automounter, set $SGE_ROOT to /usr/SGE6, not to /tmp_mnt/usr/SGE6.
The $SGE_ROOT directory is the top level of the Grid Engine software directory tree. On startup, each Grid Engine software component in a cell needs read access to the $SGE_ROOT/$SGE_CELL/common directory. When Grid Engine software is installed as a single cluster, the value of $SGE_CELL is default.
Note: If no cell name is specified during installation, the value of $SGE_CELL is default.
Note: If you use a Lustre fileshare as the spool directory, you should disable file striping for these directories. For information about how to disable file striping, refer to the Lustre operation manual located at http://wiki.lustre.org/index.php/Lustre_Documentation.
Note: For efficient spooling, place the spooling directories somewhere other than within $SGE_ROOT.
Note: Throughout this information space, the $SGE_ROOT environment variable is used to refer to the directory into which the Grid Engine software is installed.
For ease of installation and administration, this directory should be readable on all hosts on which you intend to run the Grid Engine software installation procedure. For example, you can select a directory that is available across a network file system, such as NFS. If you choose to select file systems that are local to the hosts, you must copy the installation directory to each host before you start the installation procedure for the particular machine. See File Access Permissions for a description of required permissions.
Database Server and Spooling Host
The master host can store its configuration and state to a Berkeley DB spooling database. The spooling database can be installed on the master server or on a separate host. When the Berkeley DB spools into a local directory on the master host, the performance is better. If you want to set up a shadow master host, you need to use a separate Berkeley DB spooling server (host). In this case, you have to choose a host with a configured RPC service. The master host connects through RPC to the Berkeley DB.
With the introduction of NFS4 software available with the Solaris 10 operating system, you can use Berkeley DB spooling on a network file system. You could not use Berkeley DB spooling on previous NFS versions. This circumstance allows a shadow host installation spooled on Berkeley DB without setting up an additional Berkeley DB Spooling Server.
Note: This configuration does not provide a High-Availability (HA) solution. For example, scripts of pending jobs are not spooled through BDB spool server and thus are not available for a shadow master.
Caution: Although using a shadow master host is more reliable, using a separate Berkeley DB spooling host results in a potential security hole. RPC communication as used by the Berkeley DB can be easily compromised. Only use this alternative if your site is secure and if users can be trusted to access the Berkeley DB spooling host by means of TCP/IP communication.
If you choose to use Berkeley DB spooling without a shadow master, you do not need to set up a separate spooling server. Likewise, if you choose not to use Berkeley DB spooling, you can set up a shadow master host without setting up a separate spooling server.
Once you determine whether you need a separate spooling server, you will also need to determine the location for the spooling directory. The spooling directory must be local to the spooling server. A default value for the location of the spooling directory is recommended during installation, but this default value is not suitable when the file server is different from the master host.
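The spooling decision described in this section can be summarized as a small decision function. This is only a sketch of the reasoning in the text: the function name choose_spooling and the result labels (classic, bdb-local, bdb-nfs4, bdb-server) are hypothetical and are not installer options or Grid Engine commands.

```shell
#!/bin/sh
# Hypothetical decision helper; arguments are "yes" or "no".
#   $1  want Berkeley DB spooling
#   $2  want a shadow master host
#   $3  spool directory can live on an NFS4 file system
choose_spooling() {
    want_bdb=$1
    want_shadow=$2
    have_nfs4=$3
    if [ "$want_bdb" != yes ]; then
        echo classic       # classic spooling never needs a spooling server
    elif [ "$want_shadow" != yes ]; then
        echo bdb-local     # local directory on the master host
    elif [ "$have_nfs4" = yes ]; then
        echo bdb-nfs4      # Berkeley DB on NFS4, no separate server needed
    else
        echo bdb-server    # separate Berkeley DB spooling server (RPC)
    fi
}
```

For example, `choose_spooling yes yes no` prints bdb-server, which is the case where the Caution about RPC security applies.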
The requirements for the Berkeley DB spooling host are similar to the requirements for the master host:
■ The host must be a stable platform.
■ The host must not be excessively busy with other processing.
■ At least 60 to 120 Mbytes of unused main memory must be available to run the Grid Engine system daemons. For very large clusters that include many hundreds or thousands of hosts and tens of thousands of jobs in the system at any time, one GByte or more of unused main memory might be required and two CPUs might be beneficial.
■ (Optional) A separate spooling host must be installed before the master host.
■ (Optional) The $SGE_ROOT directory should be installed locally, to cut down on network traffic.
Scheduler Profiles
You can choose from three scheduler profiles during the installation process: normal, high, and max. You can use these predefined profiles as a starting point for Grid Engine tuning.
Using these profiles, you can optimize the scheduler for one or more of the following:
■ The amount of information that is tracked about a scheduling run
■ The load adjustment during a scheduling run
■ Interval scheduling (the default) or immediate scheduling
You can choose from three scheduler profiles:
■ normal - This profile uses load adaptation and interval scheduling, and reports all the information that the scheduler gathers during the dispatch cycle. This profile is the starting point for most grids. Use this profile if your highest priority is gathering and reporting information about a scheduling run.
■ high - This profile is more appropriate for a large cluster, where throughput is more important than gathering and reporting all the information from the scheduler. This profile also uses interval scheduling. Use this profile if you want to get better performance at the cost of getting less information about your scheduling runs.
■ max - This profile disables all information gathering and reporting, enables immediate scheduling, and disables load adaptation. Immediate scheduling is very useful for sites with high throughput and very short running jobs. The advantage of immediate scheduling decreases as the runtime of the jobs increases. This profile can be used in clusters of any size where only throughput is important and everything else is a lower priority.
For more information on how to configure scheduling, see Oracle Grid Engine Administration Guide.
Getting the Software
The software is distributed through electronic download and on CD-ROM.
Electronic Download
To electronically download a copy of the Grid Engine software, visit SUN.COM. The product distribution is in pkgadd format for the Solaris Operating System (Solaris OS). If you would like to download a copy of the open source Grid Engine software, visit the download center.
CD-ROM Distribution
For information on how to access CD-ROMs, ask your system administrator or refer to your local system documentation. For instructions, see Loading the Distribution Files on a Workstation.
2 Installing Grid Engine
To effectively install Grid Engine, perform the following tasks in the order that they are listed:
Table 2–1 Installation Tasks
Topic | Description
Planning the Installation | Strategically plan your installation to achieve results that fit your environment.
Loading the Distribution Files on a Workstation | Unpack and load the distribution files onto a workstation.
Installing the Software With the GUI Installer | Learn how to run the new GUI installer and install a whole cluster.
Installing the Software From the Command Line | Learn how to run an installation script on the master host and on every execution host in the Grid Engine system, and how to register information about administration hosts and submit hosts.
Installing the Increased Security Features | Set up your system more securely.
Oracle Grid Engine User’s Guide | Install the Accounting and Reporting Console, an optional feature that enables you to gather live reporting data from the Grid Engine system.
Verifying the Installation | Verify that the daemons are running on the master host and on the execution hosts, and learn how to run simple commands and submit test jobs.
In addition, you might need to perform one or more related tasks:
Table 2–2 Additional Installation Tasks
Topic | Description
Automating the Installation Process | Learn how to automate the Grid Engine installation process.
Installing SMF Services | Learn how to install the Service Management Facility (SMF) services.
Installing a JMX-Enabled System | Learn how to install a JMX-enabled system.
Removing the Software | Learn how to remove the Grid Engine software.
Additional Software for the Microsoft Operating System | Learn how to install Grid Engine on the Microsoft Windows operating system.
User Management on Windows Hosts | Learn how to manage user accounts on Windows hosts.
Other Installation Issues | Learn how to identify additional considerations for installing Grid Engine software.
Loading the Distribution Files on a Workstation
The Grid Engine 6.2 software is distributed on CD-ROM and through electronic download. The CD-ROM distribution contains a directory named Sun_Grid_Engine_6_2. The product distribution is in this directory, in both tar.gz format and the pkgadd format. The pkgadd format is provided for the Solaris Operating System (Solaris OS). For all supported operating systems, the software is distributed in tar.gz format. For more on how to obtain the distribution files, see Getting the Software.
How to Load the Distribution Files on a Workstation
Ensure that the file systems and directories that are to contain the Grid Engine software distribution and the spool and configuration files are set up properly by setting the access permissions as defined in File Access Permissions.
1. Provide access to the distribution media. If you downloaded the software, rather than getting it on CD-ROM, just unzip the files into a directory. This directory must be located on a file system that has at least 350 MBytes of free space.
2. Log in to a system. Log in preferably on a system that has a direct connection to a file server.
3. Create the installation directory. Create an installation directory as described in $SGE_ROOT Directory.
# mkdir /opt/sge6-2
In these instructions, the installation directory is abbreviated as sge-root.
4. Install the binaries for all binary architectures that are to be used by any of your master, execution, and submit hosts in your Grid Engine system cluster. You can use either the pkgadd Method or the tar Method.
pkgadd Method
The pkgadd format is provided for the Solaris Operating System. To facilitate remote installation, the pkgadd directories are also provided in zip files.
You can install the following packages:
Table 2–3 Installing Packages Using Pkgadd Method
Package | Description
SUNWsgeec | Architecture independent files
SUNWsgeex | Solaris (SPARC platform) 64-bit binaries for Solaris 8, Solaris 9, and Solaris 10 Operating Systems
SUNWsgeei | Solaris (x86 platform) binaries for Solaris 8, Solaris 9, and Solaris 10 Operating Systems
SUNWsgeeax | Solaris (x64 platform) binaries for Solaris 10 Operating System
As you type the following commands, you must be prepared to respond to script questions about your base directory, sge-root, and the administrative user. The script requests the choices that you made during the planning steps of this installation. See Planning the Installation for further details.
At the command prompt, type the following commands, responding to the script questions.
# cd cdrom_mount_point/Sun_Grid_Engine_6_2
# pkgadd -d ./Common/Packages SUNWsgeec
Depending on the Solaris binary that you need, type one of the following commands:
# pkgadd -d ./Solaris_sparc/Packages SUNWsgee
# pkgadd -d ./Solaris_sparc/Packages SUNWsgeex
# pkgadd -d ./Solaris_x86/Packages SUNWsgeei
# pkgadd -d ./Solaris_x64/Packages SUNWsgeeax
Table 2–3 (Cont.) Installing Packages Using Pkgadd Method
Package | Description
SUNWsgeea | Accounting and Reporting Console (ARCo) packages for the Solaris and Linux operating systems.
tar Method
For all supported operating systems, the software is distributed in tar.gz format. Regardless of platform, install the architecture independent file Common/tar/sge-6_2-common.tar.gz.
The tar files that contain platform-specific binaries use the naming convention of sge-6_2-bin-architecture.tar.gz.
The following table lists the platform-specific binaries. Install the file for each platform that you need to support. Note that each platform has its own directory under Sun_Grid_Engine_6_2.
Table 2–4 Installing Binaries Using Tar Method
Platform-Specific File | Platform
Solaris_sparc/tar/sge-6_2-bin-solaris-sparcv9.tar.gz | Solaris (SPARC platform) 64-bit binaries for Solaris 8, Solaris 9, and Solaris 10 Operating Systems
Solaris_x86/tar/sge-6_2-bin-solaris-i586.tar.gz | Solaris (x86 platform) binaries for Solaris 8, Solaris 9, and Solaris 10 Operating Systems
Solaris_x64/tar/sge-6_2-bin-solaris-x64.tar.gz | Solaris (x64 platform) 64-bit binaries for Solaris 10
Windows/tar/sge-6_2-bin-windows-x86.tar.gz | Microsoft Windows (x86 platform) 32-bit binaries for Windows 2000, XP and Windows Server 2003
Linux24_i586/tar/sge-6_2-bin-linux24-i586.tar.gz | Linux (x86 platform) binaries for the 2.4 and 2.6 kernel
Linux24_amd64/tar/sge-6_2-bin-linux24-ia64.tar.gz | Linux (Itanium platform) binaries for the 2.4 and 2.6 kernel
Linux24_amd64/tar/sge-6_2-bin-linux24-x64.tar.gz | Linux binaries for the 2.4 and 2.6 kernel
MacOSX/tar/sge-6_2-bin-darwin-ppc.tar.gz | Apple Mac OS/X (PowerPC platform)
MacOSX/tar/sge-6_2-bin-darwin-x64.tar.gz | Apple Mac OS/X (Intel-based platform)
HPUX11/tar/sge-6_2-bin-hp11.tar.gz | Hewlett-Packard HP-UX 11 or higher
HPUX11/tar/sge-6_2-bin-hp11-64.tar.gz | 64-bit binaries for Hewlett-Packard HP-UX 11 or higher
Aix43/tar/n1ge-6_1-bin-aix51.tar.gz | IBM AIX 5.1 and 5.3
Type the following commands at the command prompt. In the example, <basedir> is the abbreviation for the full directory, cdrom-mount-point/Sun_Grid_Engine_6_2.
% su
# cd <sge-root>
# gzip -dc <basedir>/Common/tar/sge-6_2-common.tar.gz | tar xvpf -
# gzip -dc <basedir>/Solaris_sparc/tar/sge6_2binsolsparc32.tar.gz | tar xvpf -
# gzip -dc <basedir>/Solaris_sparc/tar/sge6_2binsolsparc64.tar.gz | tar xvpf -
# SGE_ROOT=<sge-root>; export SGE_ROOT
# util/setfileperm.sh $SGE_ROOT
Installing the Software With the GUI Installer
A GUI installer that simplifies the installation process has been available since Grid Engine 6.2u2. The GUI installer enables you to easily install a whole cluster interactively. To install a cluster, you need to set up the environment in a similar way to an automatic installation.
Requirements
■ The GUI installer requires at least Version 5 of the Java platform.
■ Screen resolution of 1024x768 or larger.
■ (Optional) Password-less ssh or rsh access as root user to all remote hosts that you want to install. If this requirement is not met, you can only install Grid Engine components on a local host. For more information, see How to Configure Password-less Access for the root User. You can still use the GUI installer by starting it locally from each remote host.
■ Start the installer as root user.
■ Ensure that you start the installation from the qmaster host when password-less root access is available.
For information on installation modes supported by the GUI installer, see these topics:
Topic | Description
Express Installation | Enables first-time users to install the software easily. Provides a significantly reduced set of parameters that need to be configured. Requires password-less ssh access as root user to all remote hosts that you want to install.
Custom Installation | Enables you to configure almost all existing options that are available during the command-line installation. Offers more advanced features for the cluster host selection. Requires password-less ssh or rsh access as root user to all remote hosts that you want to install.
For additional reference information, see these topics:
Topic | Description
How to Configure Password-less Access for the root User | Procedure for configuring password-less ssh or rsh access for the root user to install a whole Grid Engine (SGE) cluster by using the GUI installer.
Understanding Host and Installation States | Describes the different installation states that you might encounter while using the GUI installer.
Tweaking start_gui_installer | Describes the command-line options of the start_gui_installer command and how to use them to fine-tune the performance of the installer.
Troubleshooting the GUI Installer | Contains known issues and their workarounds.
Express Installation
The express installation mode is targeted at first-time users and provides a significantly reduced set of parameters to configure. This mode also provides reasonable default values for most of the parameters. You must have password-less ssh or scp access if you are planning to install Grid Engine on remote hosts. The following steps describe a complete cluster installation and assume that the password-less access is configured.
Using the Express Installation Mode
The express installation steps are as follows.
1. Start the GUI installer. On the welcome screen, click Next.
Note: Ensure that you start the GUI installer on the qmaster host. As root, run the start_gui_installer command in your sge-root directory. For example:
master:/sge# ./start_gui_installer
Starting Installer ...
2. Choose components to install. Click Next. See the following table for a brief explanation of options displayed on this screen.
Host type | Description
Qmaster host | Main component in Grid Engine software. You must install exactly one qmaster component per Grid Engine cluster installation.
Execution host(s) | Hosts that execute the tasks (jobs).
Shadow host(s) | Hosts that provide a high availability feature to the cluster. In case the qmaster fails (for example, due to a crash or network issue), one of the shadow hosts takes over the qmaster responsibility.
Berkeley db host | Host that implies a Berkeley db host spooling option. Grid Engine then spools data to a remote server. Not recommended as the default option.
If you are not sure what you want to install, keep the components selected by default.
3. Modify the main configuration details. Click Next.
Figure 2–1 Main Configuration Information
Option | Description
Admin user | Grid Engine processes will be executed under this user name, and certain directories will be owned by this user.
Qmaster host | Host that will run the qmaster daemon (main component). It can be changed later in the host selection.
Grid Engine root directory | Directory where you unpacked the Grid Engine tar.gz archive or installed a package (for example, rpm, pkg). It must not contain an automounter prefix.
Cell name | Name of this Grid Engine cell, a value that identifies an instance of Grid Engine when several instances run simultaneously.
Cluster name | Name of this Grid Engine instance used by SMF on Solaris machines. In express installation mode, this field is hidden and has a default value of p6444. The following naming restrictions apply to this field: The cluster name must start with a letter ([A-Za-z]), followed by letters, digits ([0-9]), dashes ("-"), or underscores ("_").
Qmaster port | Port that will be used by the qmaster daemon. Default value is 6444.
Execd port | Port that will be used by the execution daemon. Default value is 6445.
Administrator mail | Email address used by Grid Engine to report issues to the grid administrator. Default value is none (no emails will be sent).
Automatically start service(s) at machine boot | Component (service) will be automatically started at machine boot. By default, this is selected.
Typically, one would provide a valid administrator email and click Next.
4. Select hosts to be installed and fix reported problems. Click Install to start the installation on the reachable hosts.
Figure 2–2 Selecting Hosts
This screen allows you to select the hosts and components that you would like to install. Express installation mode has a slightly simplified selection model. Custom installation mode enables you to change the components that will be selected once new hosts are added. The qmaster host is added based on the qmaster host value from the main configuration screen by default. You can select the hosts in one of two different ways:
■ By a host name, host name pattern, or by an IP address or IP address pattern
■ From a file that you create using the installer's save action
The patterns do not support regular expressions. The supported expressions are lists and numeric ranges. For more information, see the following table:
Description | Input | Resolved Value
Host name | grid00 | grid00
IP address | 192.168.0.1 | 192.168.0.1
List of hosts | grid00 grid01 grid03 | grid00 grid01 grid03
List of IP addresses | 192.168.0.1 192.168.0.2 192.168.0.5 | 192.168.0.1 192.168.0.2 192.168.0.5
Host ranges | grid[00-03] | grid00 grid01 grid02 grid03
Range of IP addresses | 192.[168-169].0.[50-60] | 192.168.0.50 ... 192.168.0.60, 192.169.0.50 ... 192.169.0.60
In the following screen sequence, hosts grid00 to grid10 are added as execution and submit hosts. However, host grid11 has an error. See Understanding Host and Installation States for a complete list of errors and possible solutions. Note that each state has a tooltip that displays a more detailed error message. Once the errors are resolved on the problematic hosts, select the hosts that you want to verify and right-click. A pop-up menu enables you to refresh the selected hosts. Optionally, invalid hosts can be removed. Once the states have been refreshed, a different error state or the reachable state will be displayed.
Figure 2–3 Adding Hosts grid00 to grid10
Figure 2–4 Unreachable Host State
5. (Optional) Modify the host configuration. Click OK. Select a host in the Select hosts screen, right-click on the host and click Configure to modify the host configuration.
Figure 2–5 Modifying Host Configuration
Table 2–5 Host Configuration Information
Option | Description
Local execd spool directory | Directory for local execd spooling data.
JVM library path | Path to the JVM library on the qmaster and/or shadow hosts.
Additional JVM args | Additional arguments to be used when starting the JVM in qmaster.
Connect user | The user that will be used to connect to the remote host using ssh or scp.
Install timeout (sec) | Timeout value for any installation task.
6. (Optional) Fix problems reported during pre-install validation, then click Install. When you click the Install button as described in Step 4, the installation does not start immediately. First, the installer executes a series of advanced checks for each host to verify that there is no misconfiguration. If the validation fails, host states are updated and you are presented with an option to return to the host selection or to continue with the installation.
Note: Continuing the installation after the installer reports errors will likely result in a failed installation. Before restarting the installation, you should return to the host selection and either resolve the reported problems or remove the hosts that have configuration errors.
In the following screen, one host has a configuration error. See Understanding Host and Installation States for a complete list of errors and possible solutions. Notice that each state includes a tooltip that displays an error message.
Figure 2–6 Configuration Warning Message
7. Monitor the progress of the installation, then click Next.
Figure 2–7 Grid Engine Installing Status
If there were any failures during the installation, the Failed tab is selected. See Understanding Host and Installation States for a complete list of installation states. Click the Log button for each failed installation for more information.
Figure 2–9 Error Message for Existing Cluster
This error is displayed because the cluster name p6444 already exists on this host (installation was not attempted).
Figure 2–10 Reviewing Grid Engine Installation Results
Optionally, print or save the information about the Grid Engine configuration for future reference. The page is also automatically saved to the $SGE_ROOT/$SGE_CELL/Readme_TIMESTAMP.html file. If the page could not be saved there, due to root being mapped to nobody on an NFS shared file system, it is saved to /tmp/Readme_TIMESTAMP.html. To verify the installation, go to Verifying the Installation.
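Two of the input rules from the steps above, the cluster name restrictions and the numeric host ranges, can be tried out on the command line before you type values into the installer. The following is a sketch under stated assumptions: valid_cluster_name and expand_hosts are hypothetical helper names, not part of the installer, and the expansion only mimics the behavior shown in the pattern table (numeric ranges in square brackets, zero-padded to the width of the lower bound).

```shell
#!/bin/sh
# Hypothetical helpers; they imitate the documented input rules only.

# Succeeds if the name starts with a letter and otherwise contains only
# letters, digits, dashes, or underscores.
valid_cluster_name() {
    case $1 in
        ''|[!A-Za-z]*)    return 1 ;;  # empty, or first character not a letter
        *[!A-Za-z0-9_-]*) return 1 ;;  # contains a character that is not allowed
        *)                return 0 ;;
    esac
}

# Prints one resolved name per line for a pattern such as grid[00-03] or
# 192.[168-169].0.[50-60]. The body runs in a subshell, so the variables
# are private to each (recursive) call.
expand_hosts() (
    pattern=$1
    case $pattern in
        *'['*-*']'*) ;;                          # contains a [lo-hi] range
        *) printf '%s\n' "$pattern"; exit 0 ;;   # nothing left to expand
    esac
    head=${pattern%%'['*}     # text before the first '['
    rest=${pattern#*'['}      # text after the first '['
    range=${rest%%']'*}       # the lo-hi part
    tail=${rest#*']'}         # text after the closing ']'
    lo=${range%%-*}
    hi=${range#*-}
    for n in "$lo" "$hi"; do
        case $n in
            ''|*[!0-9]*) echo "unsupported range in: $pattern" >&2; exit 1 ;;
        esac
    done
    width=${#lo}              # zero-pad to the width of the lower bound
    # Strip leading zeros so the values are not interpreted as octal.
    lo_n=${lo#"${lo%%[!0]*}"}; lo_n=${lo_n:-0}
    hi_n=${hi#"${hi%%[!0]*}"}; hi_n=${hi_n:-0}
    i=$lo_n
    while [ "$i" -le "$hi_n" ]; do
        # Recurse so that any further ranges in the tail are expanded too.
        expand_hosts "$(printf "%s%0${width}d%s" "$head" "$i" "$tail")"
        i=$((i + 1))
    done
)
```

For example, `expand_hosts 'grid[00-03]'` prints grid00 through grid03, one per line, and `valid_cluster_name p6444` succeeds because the default name starts with a letter.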
Custom Installation
The custom installation mode is targeted at experienced users. It offers more advanced customization of the Grid Engine installation than the Express Installation. It provides default values for most of the parameters. You must have password-less ssh or rsh access if you plan to install Grid Engine on remote hosts. The following steps assume that the password-less access is configured and describe a cluster installation consisting of:
■ Qmaster host with JMX feature enabled
■ Three execution hosts on various architectures
■ One shadow host
■ One administrative host
■ Four submit hosts
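Because the steps that follow assume password-less access, it can help to confirm that access from the qmaster host first. The sketch below is an assumption-laden illustration, not the procedure from How to Configure Password-less Access for the root User: check_passwordless, report_hosts, and the SSH_CMD variable are hypothetical names, and only the ssh case is covered (BatchMode and ConnectTimeout are standard OpenSSH client options).

```shell
#!/bin/sh
# Hypothetical check, run on the qmaster host. SSH_CMD is an illustrative
# override that lets the function be exercised without a real ssh client.
SSH_CMD=${SSH_CMD:-ssh}

# Succeeds if USER (default root) can run a command on HOST without being
# prompted. BatchMode=yes makes ssh fail instead of asking for a password.
check_passwordless() {
    host=$1
    user=${2:-root}
    "$SSH_CMD" -o BatchMode=yes -o ConnectTimeout=5 "$user@$host" true \
        >/dev/null 2>&1
}

# Prints one line per host: "<host> ok" or "<host> needs-setup".
report_hosts() {
    for h in "$@"; do
        if check_passwordless "$h"; then
            echo "$h ok"
        else
            echo "$h needs-setup"
        fi
    done
}
```

A typical call would be `report_hosts grid00 grid01 grid02`; hosts reported as needs-setup can still be installed by starting the GUI installer locally on them.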
Using the Custom Installation Mode
The custom installation steps are as follows.
1. Start the GUI installer. On the welcome screen, click Next.
Note: Ensure that you start the GUI installer on the qmaster host. As root, run the start_gui_installer command in your sge-root directory. For example:
master:/sge# ./start_gui_installer
Starting Installer ...
2. Choose components to install, including a shadow host and the custom installation option, and click Next. See the following table for a brief explanation of options displayed on this screen.
Host type | Description
Qmaster host | Main component in Grid Engine software. Exactly one qmaster component must be installed per Grid Engine cluster installation.
Execution host(s) | Hosts that execute the tasks (jobs).
Shadow host(s) | Shadow hosts provide a high availability feature to the cluster. If the qmaster fails (crash, network issue), one of the shadow hosts takes over the qmaster responsibility.
Berkeley db host | Selecting it implies a Berkeley db host spooling option. Grid Engine then spools data to a remote server. Not recommended as the default option.
3. Modify the main configuration details. Click Next.
Figure 2–11 Modifying Main Configuration Information
Option | Description
Admin user | Grid Engine processes will be executed under this user name, and certain directories will be owned by this user.
Qmaster host | Host that will run the qmaster daemon (main component). It can be changed later in the host selection.
Grid Engine root directory | Directory where you unpacked the Grid Engine tar.gz archive or installed a package (for example, rpm, pkg). It must not contain an automounter prefix.
Cell name | Name of this Grid Engine cell, a value that identifies an instance of Grid Engine when several instances run simultaneously.
Cluster name | Name of this Grid Engine instance used by SMF on Solaris machines. The following naming restrictions apply to this field: The cluster name must start with a letter ([A-Za-z]), followed by letters, digits ([0-9]), dashes ("-"), or underscores ("_").
Qmaster port | Port that will be used by the qmaster daemon. Default value is 6444.
Execd port | Port that will be used by the execution daemon. Default value is 6445.
Group id range | Range of additional group IDs. The group IDs in this range must not be used anywhere else. The size of the range determines how many concurrent jobs can run in Grid Engine. Choose a large value.
Shell name | Shell to be used while connecting to remote hosts (with ssh or rsh syntax). Expected values for this field are ssh or rsh.
Copy command | Command to be used while copying files to remote hosts (with scp or rcp syntax). Expected values for this field are scp or rcp.
Administrator mail | Email address used by Grid Engine to report issues to the grid administrator. Default value is none (no emails will be sent).
Automatically start service(s) at machine boot | Component (service) will be automatically started at machine boot. By default, this is selected.
Use JMX | Triggers installation of a JVM thread in qmaster. Currently only needed when you plan to install Service Domain Manager. By default, this is selected.
Ignore domain names | Grid Engine will ignore domain names when comparing host names. By default, this is selected.
Use CSP product mode | Grid Engine will be installed with certificate security protocol (CSP). Communication between Grid Engine daemons will be protected by an SSL certificate. Has impact on cluster throughput. By default, this is not selected.
Typically, one would customize the default values and click Next.
4. Modify the JMX configuration details. Click Next.
Figure 2–12 Modifying JMX Configuration Details
Option | Description
JMX port | Port number to be used by the JVM thread in the qmaster process.
Enable SSL server authentication | Once enabled, SSL certificate configuration will be presented later. The server certificate will be used for authentication and encryption.
Enable SSL client authentication | Client authentication will be used.
Path to the keystore | Path to the Java keystore file that will be created during the qmaster installation.
Keystore password | Keystore password. Default value is changeit.
Retype password | Password to retype. Default value is changeit.
5. Modify the spooling configuration. Click Next.
Figure 2–13 Modifying the Spooling Configuration
Option | Description
Qmaster spool directory | Directory for qmaster spooling data.
Global execd spool directory | Directory for execution daemon spooling, used by default for all execution hosts. Unless overridden in the host selection screen, each execution host creates a subdirectory in the global execd spool directory.
Classic spooling method | Spooling is done in human readable format.
Berkeley db spooling method | Spooling is done to a local Berkeley db.
Berkeley db spooling server spooling method | Spooling is done to a Berkeley db server.
Berkeley db host | Host for the Berkeley db server, enabled only when the Berkeley db spooling server method is selected.
Db directory | Berkeley db spooling directory, either on the local host or on the Berkeley db host, if the Berkeley db spooling server method is selected.
6. (Optional) Provide SSL certificate information. Click Next.
Figure 2–14 Providing SSL Configuration Details
This screen is displayed only when you have previously selected the JMX or CSP features. An SSL certificate will be generated as part of the qmaster installation. This certificate will then be used throughout Grid Engine.
Option | Description
Country code | Two-character country code. Default value is DE.
State | State. Default value is GERMANY.
Location | Location. Default value is Building.
Organization | Organization. Default value is Organisation.
Organization unit | Organization unit. Default value is Organisation_unit.
Email address | Email address. Default value is [email protected].
7. Select hosts to be installed and fix reported problems. Click Install to start the installation on the reachable hosts.
This screen allows you to select the hosts and components that you would like to install. The qmaster host is added based on the qmaster host value from the main configuration screen by default. You can select the hosts in one of two different ways:
■ By a host name, host name pattern, or by an IP address or IP address pattern
■ From a file that you create using the installer's save action
The patterns do not support regular expressions. The supported expressions are lists and numeric ranges. For more information, see the following table:
Description | Input | Resolved Values
Host name | grid00 | grid00
IP address | 192.168.0.1 | 192.168.0.1
List of hosts | grid00 grid01 grid05 | grid00 grid01 grid05
List of IP addresses | 192.168.0.1 192.168.0.2 192.168.0.5 | 192.168.0.1 192.168.0.2 192.168.0.5
Host ranges | grid[00-10] | grid00, grid01, ..., grid10
Range of IP addresses | 192.[168-169].0.[50-60] | 192.168.0.50 ... 192.168.0.60, 192.169.0.50 ... 192.169.0.60
In the following screen sequence, six shadow, execution, admin, and submit hosts are added from a file, and later seven execution and admin hosts are added by host name. Two hosts (grid11, grid12) have errors; they are unreachable. See Understanding Host and Installation States for a complete list of errors and possible solutions. Note that each state has a tooltip that displays a more detailed error message. Hosts can be refreshed or removed using a context menu. The default component selection may be changed from execution and submit host to also include shadow and admin host before pressing the Add button. The selected components will be applied to any newly added hosts.
Figure 2–15 Selecting Hosts from File
Figure 2–16 Selecting Hosts Using Hostname
Figure 2–17 Unreachable Hosts in Selected Host List
8. (Optional) Modify the host configuration. Click OK. Select a host in the Select hosts screen, right-click on the host and click Configure to modify the host configuration.