I have written a couple of blogs on setting up Hadoop as a single-node and a multi-node cluster. Deploying, configuring and running a Hadoop cluster manually is rather time- and cost-consuming. Here’s a helping hand: creating a fully distributed Hadoop cluster with Cloudera Manager. In this blog, we’ll see how fast and easy it is to install a Hadoop cluster with Cloudera Manager.

Software used:

  • CDH5
  • Cloudera Manager – 5.7
  • OS – RHEL 7
  • VirtualBox – 5.2

Prepare Servers:

For a minimal, non-production cluster, we need 3 servers.

  • CM – Cloudera Manager + other Hadoop services (minimum 8 GB)
  • DN1/DN2 – Data Nodes

Please do the following steps on one machine, the Cloudera Manager node (CM).

Disable SELinux:
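
The commands for this step are not shown here, so what follows is only a minimal sketch of one common way to do it on RHEL 7. It assumes the standard `/etc/selinux/config` file; the function name `disable_selinux` is made up for illustration.

```shell
# Sketch: persistently disable SELinux by rewriting the SELINUX= line.
# disable_selinux is a hypothetical helper. The default path is the standard
# RHEL 7 location; the optional argument lets you try it on a copy first.
disable_selinux() {
    config_file="${1:-/etc/selinux/config}"
    # Replace the configured mode (enforcing/permissive) with "disabled".
    # The pattern does not touch the SELINUXTYPE= line.
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$config_file"
}

# Usage (as root):
#   disable_selinux
#   setenforce 0    # permissive until the next reboot makes "disabled" effective
```

The change in the config file only takes full effect after a reboot.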

Setup NTP:
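
As a rough sketch (the exact commands used for this setup are not shown), installing and enabling the NTP daemon on RHEL 7 could look like this. The function name `setup_ntp` is hypothetical, and the `ntp` package with its `ntpd` service is an assumption about which time service is wanted.

```shell
# Sketch: install the NTP daemon and make it start on boot (run as root).
# setup_ntp is a hypothetical helper name.
setup_ntp() {
    yum install -y ntp &&       # the ntp package provides the ntpd service
    systemctl enable ntpd &&    # start ntpd automatically on boot
    systemctl start ntpd        # start it now
}

# Usage (as root):
#   setup_ntp
```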

Disable firewall:
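
On RHEL 7 the default firewall service is firewalld, so a minimal sketch of this step, assuming firewalld is what is running, might be the following. `disable_firewall` is a made-up helper name.

```shell
# Sketch: stop firewalld now and keep it from starting on boot (run as root).
# disable_firewall is a hypothetical helper name.
disable_firewall() {
    systemctl stop firewalld &&
    systemctl disable firewalld
}

# Usage (as root):
#   disable_firewall
```

This is acceptable for a test cluster on a private VirtualBox network; a production cluster would normally open the required ports instead.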

Distribute Authentication Key-pairs:
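
Since the machine is cloned to DN1/DN2 later, one possible approach (an assumption, not necessarily the one originally used) is to generate a key pair on CM and authorize it for the same user, so every clone inherits both halves. The helper name `setup_ssh_key` is hypothetical.

```shell
# Sketch: create a passphrase-less RSA key pair and authorize it locally.
# setup_ssh_key is a hypothetical helper; the directory argument is optional.
setup_ssh_key() {
    ssh_dir="${1:-$HOME/.ssh}"
    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir"
    # Generate a key pair only if none exists yet.
    [ -f "$ssh_dir/id_rsa" ] || ssh-keygen -q -t rsa -N '' -f "$ssh_dir/id_rsa"
    # Authorize the public key once, so re-running adds no duplicates.
    touch "$ssh_dir/authorized_keys"
    grep -qxF "$(cat "$ssh_dir/id_rsa.pub")" "$ssh_dir/authorized_keys" ||
        cat "$ssh_dir/id_rsa.pub" >> "$ssh_dir/authorized_keys"
    chmod 600 "$ssh_dir/authorized_keys"
}

# Usage (as the user that will SSH between nodes):
#   setup_ssh_key
```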

Define host names:

Edit the hosts file in the /etc/ folder on the Cloudera Manager node (CM) and specify the IP address of each system followed by its host name. Each machine needs a static IP address, and all VMs should be pingable from each other.

Now clone the machine to create DN1 and DN2. Update the IP address and hostname on each clone. Test passwordless SSH by displaying the hostname of each node.
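
The hosts entries can also be added with a small helper, sketched here. Only 192.168.1.80 (the CM address used later in this post) comes from this setup; the host names `cm`, `dn1`, `dn2`, the two data node addresses and the function name `add_host_entry` are assumptions for illustration.

```shell
# Sketch: append "IP<TAB>hostname" to a hosts file unless the name is already there.
# add_host_entry is a hypothetical helper; the file argument defaults to /etc/hosts.
add_host_entry() {
    ip="$1"
    name="$2"
    hosts_file="${3:-/etc/hosts}"
    # -w matches the name as a whole word, -F treats it as a literal string.
    grep -qwF -- "$name" "$hosts_file" 2>/dev/null ||
        printf '%s\t%s\n' "$ip" "$name" >> "$hosts_file"
}

# Usage (as root); the dn1/dn2 addresses and all host names are assumed values:
#   add_host_entry 192.168.1.80 cm
#   add_host_entry 192.168.1.81 dn1
#   add_host_entry 192.168.1.82 dn2
```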
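
A quick way to run that check from the CM node is sketched below. The host names passed in are whatever you put in the hosts file (`dn1` and `dn2` here are assumed names), and `check_nodes` is a hypothetical helper.

```shell
# Sketch: print the hostname of each node over SSH without allowing a password prompt.
# check_nodes is a hypothetical helper name.
check_nodes() {
    for host in "$@"; do
        # BatchMode=yes makes ssh fail instead of asking for a password,
        # so a missing key shows up as an error rather than a prompt.
        ssh -o BatchMode=yes "$host" hostname || return 1
    done
}

# Usage (host names are assumptions):
#   check_nodes dn1 dn2
```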

Install Cloudera Manager and Agents:

Installation can be divided into the following steps:

  • Install the MySQL database
  • Set up Java
  • Install and run the Cloudera Manager server and agents

Install MySQL Database:

Java Set up:

Hadoop is written in Java, so we need to set up Java. Install the Oracle Java Development Kit (JDK) on all nodes.

Log out and log back in, and you can see the Java version.

Copy the JDK to the other nodes, DN1/DN2, as well.

Get MySQL JDBC connector:
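
The logout/login step suggests the Java environment is set in a login profile. A minimal sketch of that, assuming the JDK has already been unpacked somewhere, is shown below. The JDK directory is not given in this post, so it is passed as an argument; `configure_java_home` and the `/etc/profile.d/java.sh` file name are assumptions.

```shell
# Sketch: write a profile script that exports JAVA_HOME and adds it to PATH.
# configure_java_home is a hypothetical helper; pass your own JDK directory.
configure_java_home() {
    jdk_dir="$1"
    profile_file="${2:-/etc/profile.d/java.sh}"
    cat > "$profile_file" <<EOF
export JAVA_HOME=$jdk_dir
export PATH=\$JAVA_HOME/bin:\$PATH
EOF
}

# Usage (as root), replacing the placeholder with the real JDK directory:
#   configure_java_home /path/to/your/jdk
# Then log out, log back in, and run:
#   java -version
```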
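
Cloudera Manager looks for the MySQL driver at `/usr/share/java/mysql-connector-java.jar`, so after downloading and unpacking the connector, the jar has to be copied there under that exact name. The sketch below assumes you already have the jar; its version and download location are not given in this post, and `install_jdbc_connector` is a made-up helper name.

```shell
# Sketch: copy a downloaded MySQL connector jar to the name Cloudera Manager expects.
# install_jdbc_connector is a hypothetical helper; the second argument is optional.
install_jdbc_connector() {
    jar="$1"
    dest_dir="${2:-/usr/share/java}"
    mkdir -p "$dest_dir" &&
        cp "$jar" "$dest_dir/mysql-connector-java.jar"
}

# Usage (as root), with the path to the jar you downloaded:
#   install_jdbc_connector /path/to/mysql-connector-java-<version>-bin.jar
```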

Install Cloudera Manager and Agents:

Now deploy the Cloudera Manager server and agent.

Setup Cloudera repository:
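
A repository definition for yum might look like the sketch below. The URLs follow the layout Cloudera's public archive used for Cloudera Manager 5 on RHEL 7 and are an assumption here; they may have moved or may now require a subscription, so check them before use. `write_cm_repo` is a hypothetical helper name.

```shell
# Sketch: write a yum repo file for Cloudera Manager 5.7 (run as root).
# write_cm_repo is a hypothetical helper; the URLs are assumed, verify them first.
write_cm_repo() {
    repo_file="${1:-/etc/yum.repos.d/cloudera-manager.repo}"
    cat > "$repo_file" <<'EOF'
[cloudera-manager]
name=Cloudera Manager
baseurl=https://archive.cloudera.com/cm5/redhat/7/x86_64/cm/5.7.0/
gpgkey=https://archive.cloudera.com/cm5/redhat/7/x86_64/cm/RPM-GPG-KEY-cloudera
gpgcheck=1
EOF
}

# Usage (as root, on every node):
#   write_cm_repo
#   yum clean all
```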

Now install the packages for Cloudera Manager on the CM node.

On the other two nodes, install the daemons and agent only.
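
The package split between the CM node and the data nodes can be sketched as follows, using the Cloudera Manager 5 package names. The wrapper `install_cm_packages` is a hypothetical helper; it assumes the Cloudera repository is already configured.

```shell
# Sketch: install the Cloudera Manager packages for a given role (run as root).
# install_cm_packages is a hypothetical helper name.
install_cm_packages() {
    case "$1" in
        server) yum install -y cloudera-manager-daemons cloudera-manager-server ;;
        agent)  yum install -y cloudera-manager-daemons cloudera-manager-agent ;;
        *)      echo "usage: install_cm_packages server|agent" >&2; return 2 ;;
    esac
}

# Usage (as root):
#   install_cm_packages server   # on CM
#   install_cm_packages agent    # on DN1 and DN2, and on CM if it should also be a managed host
```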

Prepare Cloudera Manager Database:
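
Cloudera Manager 5 ships a script, `scm_prepare_database.sh`, that writes the server's database settings. A hedged sketch of calling it for MySQL is below. The database name, user and password are values you choose (none are given in this post), the database and user are assumed to exist already, and `prepare_cm_database` plus the `SCM_PREPARE_DB` override are hypothetical additions for illustration.

```shell
# Sketch: point Cloudera Manager at a MySQL database (run as root on CM).
# prepare_cm_database is a hypothetical wrapper. SCM_PREPARE_DB may be set to
# override the script location; by default the CM 5 install path is used.
prepare_cm_database() {
    script="${SCM_PREPARE_DB:-/usr/share/cmf/schema/scm_prepare_database.sh}"
    # Arguments: database type, database name, user name, password.
    "$script" mysql "$1" "$2" "$3"
}

# Usage (as root), with your own database name, user and password:
#   prepare_cm_database <db-name> <db-user> <db-password>
#   service cloudera-scm-server start
```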

Start Cloudera Manager.

Before starting the agent, update the server host entry in the agent config files.

Hadoop cluster setup via Cloudera Manager:
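
The agent reads the server address from the `server_host` setting in `/etc/cloudera-scm-agent/config.ini`. A minimal sketch of changing it is shown here; `set_agent_server_host` is a made-up helper, and `cm` in the usage lines is an assumed host name for the Cloudera Manager node.

```shell
# Sketch: point a Cloudera Manager agent at the server host (run as root).
# set_agent_server_host is a hypothetical helper; the file argument is optional.
set_agent_server_host() {
    server_host="$1"
    config_file="${2:-/etc/cloudera-scm-agent/config.ini}"
    # Only the server_host= line is rewritten; server_port is left alone.
    sed -i "s/^server_host=.*/server_host=$server_host/" "$config_file"
}

# Usage (as root, on every node that runs an agent); "cm" is an assumed host name:
#   set_agent_server_host cm
#   service cloudera-scm-agent start
```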

Go to http://192.168.1.80:7180/cmf/ and the login page will appear. Log in with admin/admin as the login/password.

Then read and accept the license agreement and choose “Cloudera Enterprise Data Hub Edition Trial” on the next page. After that you’ll be offered to set up a new cluster.

As you have already installed the agents, you can see the list of hosts. Select all hosts.

Press “Continue”, select the CDH version, and choose the parcel installation method.

Press “Continue” and wait for the parcels to be distributed and activated.

Wait for Cluster Inspector to finish the inspection and you’ll see all installed components.

Install Hadoop cluster:

Then you can choose the cluster roles distribution across the cluster. Accept the default options. You can see the summary view via “Host view detail”.

The next part is the database setup. Please provide the database access details.

Accept default and continue.

Wait for the Cloudera Manager to set up the cluster roles.

When the cluster is installed, you can see it in Cloudera Manager and start monitoring the cluster state, adding and removing services, changing configurations, identifying problems in the cluster, and so on. The yellow signs shown near the services are warnings that can be ignored for now but should be analyzed and fixed if you are going to bring the cluster into production.

Summary

Cloudera Manager makes creating and maintaining Hadoop clusters significantly easier than managing them manually. With these instructions it is possible to create a Hadoop cluster in less than one hour, whereas manual configuration and deployment could take a few hours or even days.

 
