
Hadoop Multi-node Cluster Installation on Centos6.6



Created: 01-12-2015
Author: Hyun Kim
Last Updated: 01-12-2015
Version Number: 0.1
Contact info: [email protected] [email protected]


Hadoop Multi Cluster Installation Guide with Centos 6

In this tutorial, we install a multi-node Hadoop cluster on CentOS 6.6.

For this tutorial, we need at least two nodes: one master node and one slave node. I’m only using two nodes to keep this guide as simple as possible. We will install the namenode and jobtracker on the master node, and the datanode, tasktracker, and secondarynamenode on the slave node. My master node’s hostname is lbb01.example.com and my slave node’s hostname is lbb02.example.com. Simple enough? Let’s get started.

Static IP Configuration

We want our servers to stay reachable at the same address even after an accidental restart. Therefore, we will configure a static IP for each server. Use the command below to open the Ethernet configuration.

Your connection might be eth0 instead of em1.

$ nano /etc/sysconfig/network-scripts/ifcfg-em1

Set BOOTPROTO="static" and add your IPADDR and NETMASK.

You can check your IP address and netmask with the “ifconfig” command. As an example:

IPADDR="192.168.23.234"
NETMASK="255.255.255.0"
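As a rough sketch of what the finished interface file might look like, the snippet below writes an example file into a scratch directory rather than into /etc/sysconfig/network-scripts, so it is safe to run as a normal user. The DEVICE and ONBOOT lines and the scratch location are assumptions added for illustration; the address values are the example ones from this section.

```shell
#!/bin/sh
# Sketch only: writes an example ifcfg file to a scratch directory,
# not to /etc/sysconfig/network-scripts.
scratch_dir=$(mktemp -d)
ifcfg_file="$scratch_dir/ifcfg-em1"   # use ifcfg-eth0 if your interface is eth0

# DEVICE and ONBOOT are assumed extras; the other keys come from the tutorial.
cat > "$ifcfg_file" <<'EOF'
DEVICE="em1"
BOOTPROTO="static"
ONBOOT="yes"
IPADDR="192.168.23.234"
NETMASK="255.255.255.0"
EOF

cat "$ifcfg_file"
```

On a real node you would make the same changes directly in the ifcfg file as root and then restart the network service.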


Configure Default Gateway

$ nano /etc/sysconfig/network

Now we configure the network. This may sound complicated, but we simply add HOSTNAME and GATEWAY. If GATEWAY or HOSTNAME already exists, simply edit it.

I’m using lbb01.example.com as my hostname.

Add your GATEWAY=XXX.XXX.XXX.X

Restart network

$ /etc/init.d/network restart

Configure DNS

$ nano /etc/resolv.conf

Add your primary and alternative nameservers. For example,

nameserver xxx.xxx.xxx.x
nameserver xxx.xxx.xxx.x

$ install yum


Download JDK

We need the JDK to install Hadoop. I’m installing jdk-7u25 in this tutorial.

www.oracle.com/technetwork/java/javase/downloads/java-archive-downloads-javase7-521261.html#jdk-7u25-oth-JPR


Download hadoop

We are installing hadoop-0.20.0 in this tutorial.

Hadoop-0.20.0 Download


I saved the file under the root folder.

Ping localhost

Repeat everything we have done so far on the slave node as well, but set the hostname to lbb02.example.com, NOT lbb01.example.com. Each node has a different IPADDR (IP address), so use the “ifconfig” command to check it and adjust the settings accordingly.

Edit /etc/hosts

On each node, edit the hosts file.

$ nano /etc/hosts

Add

XXX.XXX.XXX.XXX(ip address for your master node) lbb01.example.com(hostname for your master node)

XXX.XXX.XXX.XXX(ip address for your slave node) lbb02.example.com(hostname for your slave node)

Try to ping each host to see if they can communicate with each other. You should be able to ping each host by hostname now.

On each node,

$ ping lbb01.example.com
$ ping lbb02.example.com

nslookup

$ nslookup lbb01.example.com
$ nslookup lbb02.example.com

If these commands output the server, address, and name on each node, we have successfully configured the network settings.
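To make the expected /etc/hosts layout concrete, here is a small sketch that builds a sample hosts file in a scratch location and looks up each hostname in it. The helper name lookup_host is hypothetical, and the slave address 192.168.23.235 is a made-up value for illustration; substitute the real addresses of your nodes.

```shell
#!/bin/sh
# Sketch only: works on a scratch file, not on the real /etc/hosts.
hosts_file=$(mktemp)

# 192.168.23.235 is a hypothetical slave address used only for this example.
cat > "$hosts_file" <<'EOF'
127.0.0.1       localhost
192.168.23.234  lbb01.example.com
192.168.23.235  lbb02.example.com
EOF

# lookup_host FILE NAME: print the IP address mapped to NAME, or fail.
lookup_host() {
    awk -v name="$2" '
        $1 !~ /^#/ {
            for (i = 2; i <= NF; i++)
                if ($i == name) { print $1; found = 1; exit }
        }
        END { exit (found ? 0 : 1) }
    ' "$1"
}

for host in lbb01.example.com lbb02.example.com; do
    if ip=$(lookup_host "$hosts_file" "$host"); then
        echo "$host -> $ip"
    else
        echo "$host is missing from $hosts_file"
    fi
done
```

Pointing lookup_host at the real /etc/hosts on each node gives a quick check that both entries are present before trying ping.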

Install hadoop

As you can see, I’m logged in as the root user. However, I’m not going to extract hadoop as root. I will move the hadoop file to /home/lbbd/, since that is where the user “lbbd” can write files.

Your user/account name will be different. Be aware.

Giving lbbd permission

Although the hadoop file is extracted under /home/lbbd/, we need to give lbbd permission to play with this folder. To do this, use the command below.


$ chown -R lbbd:lbbd /home/lbbd/hadoop-0.20.0

Change hadoop-0.20.0 to hadoop

$ ln -s hadoop-0.20.0 hadoop

Why change to hadoop?

So that whenever we need to edit something in the hadoop-0.20.0 folder, we don’t have to type -0.20.0 anymore. The command creates a symbolic link, so we can simply go to the hadoop-0.20.0 folder with $ cd /home/lbbd/hadoop. It’s convenient.
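The effect of the ln -s step can be tried out safely in a scratch directory. The sketch below assumes nothing about your real home folder: the temporary directory stands in for /home/lbbd and the README.txt file is a placeholder.

```shell
#!/bin/sh
# Sketch only: reproduces the symlink step inside a scratch directory.
work_dir=$(mktemp -d)                     # stands in for /home/lbbd
mkdir "$work_dir/hadoop-0.20.0"
touch "$work_dir/hadoop-0.20.0/README.txt"   # placeholder file

cd "$work_dir" || exit 1
ln -s hadoop-0.20.0 hadoop                # same command as in the tutorial

readlink hadoop                           # prints the link target: hadoop-0.20.0
ls hadoop/                                # lists the contents of hadoop-0.20.0
```

Because hadoop is only a link, the versioned folder keeps its name, and a later upgrade could repoint the link without changing any paths that use /home/lbbd/hadoop.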

Install JDK

I saved the jdk-7u25 file in /root/hadoop_packages. You don’t have to do the same. Wherever you saved your jdk file, use that path in the command below to install the package.

$ rpm -ivh hadoop_packages/jdk-7u25-linux-x64.rpm


$ nano /home/lbbd/hadoop/conf/hadoop-env.sh

Now we need to edit hadoop-env.sh so that the hadoop scripts know where we installed the jdk and extracted hadoop.

So I added the two lines below:

export JAVA_HOME=/usr/java/jdk1.7.0_25/
export HADOOP_HOME=/home/lbbd/hadoop
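If you prefer to script this edit instead of using nano, one possible approach is sketched below. It appends each export only if the variable is not already exported in the file, so running it twice does not duplicate lines. The helper name add_export is hypothetical, and the script works on a scratch file rather than the real hadoop-env.sh.

```shell
#!/bin/sh
# Sketch only: edits a scratch file, not the real conf/hadoop-env.sh.
env_file=$(mktemp)
echo "# Set Hadoop-specific environment variables here." > "$env_file"

# add_export FILE NAME VALUE: append "export NAME=VALUE" unless NAME is already exported.
add_export() {
    if ! grep -q "^export $2=" "$1"; then
        echo "export $2=$3" >> "$1"
    fi
}

add_export "$env_file" JAVA_HOME /usr/java/jdk1.7.0_25/
add_export "$env_file" HADOOP_HOME /home/lbbd/hadoop
add_export "$env_file" JAVA_HOME /usr/java/jdk1.7.0_25/   # second call is a no-op

cat "$env_file"
```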

core-site.xml edit

$ nano /home/lbbd/hadoop/conf/core-site.xml

Edit the file by adding

<property>
<name>fs.default.name</name>
<value>hdfs://(your host name):9000</value>
</property>
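For orientation, the sketch below writes out what a complete core-site.xml might look like once the property is in place. It assumes the property sits inside the <configuration> root element that Hadoop configuration files use, and that the host name is the master node from this tutorial (lbb01.example.com), since that is where the namenode runs. The file is written to a scratch location, not to the real conf folder.

```shell
#!/bin/sh
# Sketch only: writes a complete example core-site.xml to a scratch file.
master_host="lbb01.example.com"   # master hostname used in this tutorial
site_file=$(mktemp)

# The <configuration> wrapper is assumed; the property itself is from the tutorial.
cat > "$site_file" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://${master_host}:9000</value>
  </property>
</configuration>
EOF

cat "$site_file"
```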


hdfs-site.xml

<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>/var/datastore</value>
<final>true</final>
</property>


Don’t forget to give your account permission to /var/datastore. The namenode cannot run without it.

So log in as root and create the folder shown above.

$ mkdir /var/datastore

Then give the user permission to access the folder.

$ chown -R lbbd:lbbd /var/datastore

Use the command below to see if the permission has been updated.

$ ls -l /var/

mapred-site.xml

<property>
<name>mapred.job.tracker</name>
<value>hostname:9001</value>
</property>


edit .bash_profile
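The exact lines for .bash_profile are not reproduced in this text, so the following is only a guess at a typical setup, assuming the JAVA_HOME and HADOOP_HOME locations used earlier in this guide. It adds both bin folders to PATH so that the java and hadoop commands can be found. The sketch writes to a scratch file instead of the real ~/.bash_profile.

```shell
#!/bin/sh
# Sketch only: writes to a scratch file instead of ~/.bash_profile.
# The lines below are assumptions, not the original author's exact settings.
profile_file=$(mktemp)

cat >> "$profile_file" <<'EOF'
export JAVA_HOME=/usr/java/jdk1.7.0_25
export HADOOP_HOME=/home/lbbd/hadoop
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin
EOF

# Load the file into the current shell and show the resulting PATH.
. "$profile_file"
echo "$PATH"
```

On a real node, the same three lines would go at the end of ~/.bash_profile, followed by logging in again or sourcing the file.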


Run the commands below to check that everything is installed and set up correctly on the system.

$ java

$ hadoop


Format Namenode
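The command for this step does not appear in this text. In Hadoop 0.20 the namenode is normally formatted with hadoop namenode -format, run once on the master node as the hadoop user before the first start. Since formatting wipes any existing HDFS metadata in dfs.name.dir, the sketch below wraps the command in a hypothetical helper that only prints it unless RUN_FORMAT=yes is set; the helper and the variable name are assumptions for illustration.

```shell
#!/bin/sh
# Sketch only. Formatting erases existing HDFS metadata in dfs.name.dir,
# so this wrapper prints the command unless RUN_FORMAT=yes is set.
format_namenode() {
    if [ "${RUN_FORMAT:-no}" = "yes" ]; then
        hadoop namenode -format
    else
        echo "would run: hadoop namenode -format"
    fi
}

format_namenode
```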


$ hadoop-daemon.sh start namenode
$ jps

jobtracker running

$ hadoop-daemon.sh start jobtracker
$ jps
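Reading the jps output by eye works, but the check can also be scripted. The sketch below defines a hypothetical helper, has_daemon, that scans jps-style output for a given daemon class name. To stay runnable without a cluster it uses made-up sample output; on a real node you would pipe jps into it instead.

```shell
#!/bin/sh
# has_daemon NAME: read jps-style output ("<pid> <ClassName>") on stdin and
# succeed if a process with class name NAME is listed.
has_daemon() {
    awk -v name="$1" '$2 == name { found = 1 } END { exit (found ? 0 : 1) }'
}

# Sample output standing in for a real jps run on the master node (PIDs are made up).
sample_jps_output="2817 NameNode
2953 JobTracker
3021 Jps"

for daemon in NameNode JobTracker; do
    if printf '%s\n' "$sample_jps_output" | has_daemon "$daemon"; then
        echo "$daemon is running"
    else
        echo "$daemon is NOT running"
    fi
done
```

On the master node, jps | has_daemon NameNode would perform the same check against the live process list; on the slave node the names to look for would be DataNode, TaskTracker, and SecondaryNameNode.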

Do all of the above on your slave node as well. However, when you edit the hdfs-site.xml file, use the properties below:

<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/home/data</value>
<final>true</final>
</property>

Then create the data folder with $ mkdir /home/data (as the root user) and give your user account permission to this folder, as we did with the /var/datastore folder.
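The create-and-chown pattern used for /var/datastore and /home/data can be rehearsed without root. The sketch below substitutes a scratch directory for /home/data and the current user for lbbd, then verifies the owner. The helper name check_owner is hypothetical, and stat -c %U assumes GNU coreutils, which CentOS provides.

```shell
#!/bin/sh
# Sketch only: uses a scratch directory and the current user so it runs without root.
data_dir="$(mktemp -d)/data"      # stands in for /home/data
owner=$(id -un)                   # stands in for lbbd
group=$(id -gn)

mkdir -p "$data_dir"
chown -R "$owner:$group" "$data_dir"   # on the real node, run as root with lbbd:lbbd

# check_owner DIR USER: succeed only if DIR is owned by USER (GNU stat assumed).
check_owner() {
    [ "$(stat -c %U "$1")" = "$2" ]
}

if check_owner "$data_dir" "$owner"; then
    echo "$data_dir is owned by $owner"
else
    echo "$data_dir is NOT owned by $owner"
fi
```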
