Enabling High Availability on a Cluster

High availability is disabled by default.

Currently, only Quorum Journal-based storage is supported for high availability.

PCC 2.1 is the first version to support HA. If you are running an earlier version, download and import the latest version of Pivotal Command Center (PCC); see Installing PHD Using the CLI for details.

Before you enable HA for any cluster:

Make sure you take into consideration our recommended HA Best Practices.

Checkpoint your NameNode:

1. Stop all incoming data traffic.

2. Then force a checkpoint by running the following command:

   sudo -u hdfs hdfs secondarynamenode -checkpoint force

To enable HA for a new cluster, follow the instructions provided in the High Availability section of Installing PHD Using the CLI.

To enable HA for an existing cluster, follow the steps below.

1. Stop the cluster:

   icm_client stop -l CLUSTERNAME

2. For HAWQ users, stop HAWQ. From the HAWQ master, as gpadmin, run the following:

/etc/init.d/hawq stop

3. Back up the NameNode data: copy {dfs.namenode.name.dir}/current to a backup directory (see the sketch after step 4).

4. Fetch the configurations for the cluster into a local directory:

   icm_client fetch-configuration -l CLUSTERNAME -o LOCALDIR
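For step 3, a minimal backup sketch, assuming dfs.namenode.name.dir is /data/nn (both the data path and the backup location are illustrative; substitute the actual value from your hdfs-site.xml):

   # Run on the NameNode host, as root or a user with access to the
   # NameNode data directory; copy the fsimage/edits directory aside.
   mkdir -p /data/nn-backup
   cp -a /data/nn/current /data/nn-backup/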

5. Edit clusterConfig.xml as follows:

   Comment out the secondarynamenode role in the hdfs service.

   Uncomment the standbynamenode and journalnode roles in the hdfs service.

   Uncomment the nameservices, namenode1id, namenode2id, journalpath, and journalport entries in serviceConfigGlobals.
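Once uncommented, the relevant serviceConfigGlobals entries might look like the following hypothetical sketch. The entry names come from the list above, but the values (nameservice name, NameNode IDs, journal path and port) are illustrative and must match your own topology:

   <serviceConfigGlobals>
     ...
     <nameservices>phdcluster</nameservices>
     <namenode1id>nn1</namenode1id>
     <namenode2id>nn2</namenode2id>
     <journalpath>/data/journal</journalpath>
     <journalport>8485</journalport>
     ...
   </serviceConfigGlobals>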

6. Edit hdfs/hdfs-site.xml as follows:

   Uncomment the following properties:

   <property>
     <name>dfs.nameservices</name>
     <value>${nameservices}</value>
   </property>
   <property>
     <name>dfs.ha.namenodes.${nameservices}</name>
     <value>${namenode1id},${namenode2id}</value>
   </property>
   <property>
     <name>dfs.namenode.rpc-address.${nameservices}.${namenode1id}</name>
     <value>${namenode}:8020</value>
   </property>
   <property>
     <name>dfs.namenode.rpc-address.${nameservices}.${namenode2id}</name>
     <value>${standbynamenode}:8020</value>
   </property>
   <property>
     <name>dfs.namenode.http-address.${nameservices}.${namenode1id}</name>
     <value>${namenode}:50070</value>
   </property>
   <property>
     <name>dfs.namenode.http-address.${nameservices}.${namenode2id}</name>
     <value>${standbynamenode}:50070</value>
   </property>
   <property>
     <name>dfs.namenode.shared.edits.dir</name>
     <value>qjournal://${journalnode}/${nameservices}</value>
   </property>
   <property>
     <name>dfs.client.failover.proxy.provider.${nameservices}</name>
     <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
   </property>
   <property>
     <name>dfs.ha.fencing.methods</name>
     <value>
       sshfence
       shell(/bin/true)
     </value>
   </property>
   <property>
     <name>dfs.ha.fencing.ssh.private-key-files</name>
     <value>/home/hdfs/.ssh/id_rsa</value>
   </property>
   <property>
     <name>dfs.journalnode.edits.dir</name>
     <value>${journalpath}</value>
   </property>

   <!-- Namenode Auto HA related properties -->
   <property>
     <name>dfs.ha.automatic-failover.enabled</name>
     <value>true</value>
   </property>
   <!-- END Namenode Auto HA related properties -->

   Comment out the following property:

   <property>
     <name>dfs.namenode.secondary.http-address</name>
     <value>${secondarynamenode}:50090</value>
     <description>
       The secondary namenode http server address and port.
     </description>
   </property>

7. Edit yarn/yarn-site.xml as follows:

   Set the following property/value:

   <property>
     <name>mapreduce.job.hdfs-servers</name>
     <value>hdfs://${nameservices}</value>
   </property>

8. Edit hdfs/core-site.xml as follows:

   Set the following property/value:

   <property>
     <name>fs.defaultFS</name>
     <value>hdfs://${nameservices}</value>
     <description>The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation. The uri's scheme determines the config property (fs.SCHEME.impl) naming the FileSystem implementation class. The uri's authority is used to determine the host, port, etc. for a filesystem.</description>
   </property>

   Then uncomment the following property:

   <property>
     <name>ha.zookeeper.quorum</name>
     <value>${zookeeper-server}:${zookeeper.client.port}</value>
   </property>

9. Edit hbase/hbase-site.xml as follows:

   Set the following property/value:

   <property>
     <name>hbase.rootdir</name>
     <value>hdfs://${nameservices}/apps/hbase/data</value>
     <description>The directory shared by region servers and into which HBase persists. The URL should be 'fully-qualified' to include the filesystem scheme. For example, to specify the HDFS directory '/hbase' where the HDFS instance's namenode is running at namenode.example.org on port 9000, set this value to: hdfs://namenode.example.org:9000/hbase. By default HBase writes into /tmp. Change this configuration else all data will be lost on machine restart.
     </description>
   </property>

10. To enable HA for HAWQ, comment out the default DFS_URL property and uncomment the HA DFS_URL in hawq/gpinitsystem_config, as follows:

   #DFS_URL=${namenode}:${dfs.port}/hawq_data
   #### For HA uncomment the following line
   DFS_URL=${nameservices}/hawq_data

11. Add the following properties to hawq/hdfs-client.xml:

   <property>
     <name>dfs.nameservices</name>
     <value>${nameservices}</value>
   </property>
   <property>
     <name>dfs.ha.namenodes.${nameservices}</name>
     <value>${namenode1id},${namenode2id}</value>
   </property>
   <property>
     <name>dfs.namenode.rpc-address.${nameservices}.${namenode1id}</name>
     <value>${namenode}:8020</value>
   </property>
   <property>
     <name>dfs.namenode.rpc-address.${nameservices}.${namenode2id}</name>
     <value>${standbynamenode}:8020</value>
   </property>
   <property>
     <name>dfs.namenode.http-address.${nameservices}.${namenode1id}</name>
     <value>${namenode}:50070</value>
   </property>
   <property>
     <name>dfs.namenode.http-address.${nameservices}.${namenode2id}</name>
     <value>${standbynamenode}:50070</value>
   </property>
   <property>
     <name>dfs.client.failover.proxy.provider.${nameservices}</name>
     <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
   </property>

12. On the standby NameNode, move {dfs.namenode.name.dir}/current to a backup directory (or delete it).
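A minimal sketch of this step, again assuming dfs.namenode.name.dir is /data/nn (illustrative; substitute the value from your hdfs-site.xml):

   # Run on the standby NameNode host; keep the old metadata as a backup
   mv /data/nn/current /data/nn/current.bak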

13. Reconfigure the cluster:

   icm_client reconfigure -l CLUSTERNAME -c LOCALDIR

   Caution: Running the reconfigure command on a secure cluster disables security in PHD-1.1.0.0 and PHD-1.1.1.0.

14. Start the cluster:

   icm_client start -l CLUSTERNAME
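Once the cluster is back up, you can confirm that each NameNode has taken an HA role with the standard Hadoop haadmin tool. A hedged example; nn1 and nn2 stand for your namenode1id and namenode2id values:

   # Each command prints "active" or "standby" for the given NameNode ID
   sudo -u hdfs hdfs haadmin -getServiceState nn1
   sudo -u hdfs hdfs haadmin -getServiceState nn2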

15. Update the HIVE Metastore. The metastore must be updated to use the nameservices, so that HIVE scripts continue to work whenever a NameNode failure happens.

   Note: Make sure the metastore is not running and is backed up to a persistent store before running the update commands.

   a. Log in to the host configured as hive-metastore.

   b. Display the current NameNode and HDFS path for the HIVE warehouse directory:

      /usr/lib/gphd/hive/bin/metatool -listFSRoot

   c. Run the following command:

      /usr/lib/gphd/hive/bin/metatool -updateLocation hdfs://<nameservices> hdfs://<current_namenode>:<dfs_port>

      Where <nameservices> is the logical name used for the nameservices in an HA-enabled cluster, and <current_namenode> is the hostname of the NameNode on the cluster before reconfiguring to enable HA.

      Note: When specifying the nameservices, do not use underscores ('_'); phd_cluster, for example, is not a valid name.
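For example, assuming a nameservices name of phdcluster and a pre-HA NameNode of namenode1.example.com listening on the default RPC port 8020 (all three values are illustrative):

   # Repoint the HIVE warehouse FS root from the old NameNode to the nameservice
   /usr/lib/gphd/hive/bin/metatool -updateLocation hdfs://phdcluster hdfs://namenode1.example.com:8020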

16. Restart HAWQ services for your configuration changes to take effect. From the HAWQ master, as gpadmin, run the following:

/etc/init.d/hawq start
