Hadoop 2.6.4 Standalone Mode Installation on Ubuntu 14.04

posted on Nov 20th, 2016

Apache Hadoop

Hadoop is an Apache open source framework written in Java that allows distributed processing of large datasets across clusters of computers using simple programming models.

A Hadoop application works in an environment that provides distributed storage and computation across clusters of computers. Hadoop is designed to scale up from a single server to thousands of machines, each offering local computation and storage.

Prerequisites

1) A machine with the Ubuntu 14.04 LTS operating system installed.

2) The Apache Hadoop 2.6.4 release archive, hadoop-2.6.4.tar.gz (Download Here)

Standalone Mode

By default, Hadoop is configured to run in a non-distributed or standalone mode, as a single Java process. There are no daemons running and everything runs in a single JVM instance. HDFS is not used.

Step 1 - Update the package index. Open a terminal (CTRL + ALT + T) and run the following command. It refreshes the list of available packages and their versions, so it is advisable to run it before installing any package, even if you have not added or removed any Software Sources.

$ sudo apt-get update

Step 2 - Installing Java 7.

$ sudo apt-get install openjdk-7-jdk

Step 3 - Install the OpenSSH server. OpenSSH implements SSH, a cryptographic network protocol for operating network services securely over an unsecured network. Its best known application is remote login to computer systems.

$ sudo apt-get install openssh-server

Step 4 - Create a group and a user. We will create a group named 'hadoop' and then create a user named 'hduser' whose primary group is 'hadoop'. The sudo permissions for the user are configured in the next step.

$ sudo addgroup hadoop
$ sudo adduser --ingroup hadoop hduser

Step 5 - Configure the sudo permissions for 'hduser'.

$ sudo visudo

visudo opens the sudoers file in the default Ubuntu text editor, which is nano. Add the following line to grant sudo permissions to 'hduser'.

hduser ALL=(ALL) ALL

Press CTRL + O and then Enter to write the file, then press CTRL + X to exit. If you press CTRL + X before saving, nano asks whether to save the changes; enter Y.

Step 6 - Creating hadoop directory.

$ sudo mkdir /usr/local/hadoop

Step 7 - Change the ownership and permissions of the directory /usr/local/hadoop. Here 'hduser' is an Ubuntu username.

$ sudo chown -R hduser /usr/local/hadoop
$ sudo chmod -R 755 /usr/local/hadoop
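
To confirm that Steps 6 and 7 took effect, the owner and mode of the directory can be printed in one line. This is only a small sketch: describe_dir is a helper name made up for this post, and stat -c is the GNU coreutils form that ships with Ubuntu.

```shell
# Print "owner mode path" for a directory.
# describe_dir is a hypothetical helper; stat -c is GNU coreutils syntax.
describe_dir() {
    stat -c '%U %a %n' "$1"
}

HADOOP_DIR=/usr/local/hadoop
if [ -d "$HADOOP_DIR" ]; then
    describe_dir "$HADOOP_DIR"
else
    echo "$HADOOP_DIR does not exist yet; run Step 6 first" >&2
fi
```

After Steps 6 and 7 the line should start with hduser 755.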

Step 8 - Switch user. The su command lets you execute commands with the privileges of another user account. Here we switch to 'hduser'.

$ su hduser

Step 9 - Change the directory to /home/hduser/Desktop. In my case the downloaded hadoop-2.6.4.tar.gz file is in the /home/hduser/Desktop folder. For you it might be in a downloads folder, so check where the file is and adjust the path.

$ cd /home/hduser/Desktop/

Step 10 - Untar the hadoop-2.6.4.tar.gz file.

$ tar xzf hadoop-2.6.4.tar.gz

Step 11 - Move the contents of hadoop-2.6.4 folder to /usr/local/hadoop

$ mv hadoop-2.6.4/* /usr/local/hadoop
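
A quick sanity check after the move is to look for the two paths the later steps rely on: the bin/hadoop launcher and the share/hadoop/mapreduce folder. The sketch below assumes the /usr/local/hadoop location used in this post; check_hadoop_layout is a helper name invented here, not part of Hadoop.

```shell
# Return 0 if the directory looks like an unpacked Hadoop distribution:
# an executable bin/hadoop and a share/hadoop/mapreduce folder.
# check_hadoop_layout is a hypothetical helper name.
check_hadoop_layout() {
    dir="$1"
    [ -x "$dir/bin/hadoop" ] && [ -d "$dir/share/hadoop/mapreduce" ]
}

if check_hadoop_layout /usr/local/hadoop; then
    echo "Hadoop files are in place"
else
    echo "bin/hadoop or share/hadoop/mapreduce is missing under /usr/local/hadoop" >&2
fi
```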

Step 12 - Edit the $HOME/.bashrc file to add the Java and Hadoop paths. The file belongs to 'hduser', so sudo is not needed.

$ gedit $HOME/.bashrc

Add the following lines to the end of the $HOME/.bashrc file.

# Set Hadoop-related environment variables
export HADOOP_HOME=/usr/local/hadoop
# Set JAVA_HOME
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
# Add Hadoop bin/ directory to PATH
export PATH=$PATH:$HADOOP_HOME/bin

Step 13 - Reload your changed $HOME/.bashrc settings

$ source $HOME/.bashrc
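
The JAVA_HOME path above assumes a 64-bit install; on a 32-bit system the directory ends in i386 instead of amd64. One way to check is to resolve where javac really lives and strip the trailing /bin/javac. The following is a sketch under that assumption, and derive_java_home is a helper name made up for this post.

```shell
# Derive a JAVA_HOME candidate from the full path of a javac binary
# by stripping the trailing /bin/javac component.
# derive_java_home is a hypothetical helper name.
derive_java_home() {
    printf '%s\n' "${1%/bin/javac}"
}

# Only inspect the system if javac is actually on the PATH.
if command -v javac >/dev/null 2>&1; then
    real_javac="$(readlink -f "$(command -v javac)")"
    echo "Detected JDK directory: $(derive_java_home "$real_javac")"
    echo "JAVA_HOME in this shell: ${JAVA_HOME:-<not set>}"
else
    echo "javac not found; check Step 2" >&2
fi
```

If the two printed paths differ, update the JAVA_HOME line in $HOME/.bashrc and reload it.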

Step 14 - Verify the Hadoop installation. This command displays the Hadoop version in the terminal.

$ hadoop version


Execution of WordCount Example

The following example copies the .txt files of the /usr/local/hadoop/ directory to use as input and then counts how many times each word occurs in them. Output is written to the given output directory.
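
For intuition about what a word count produces, roughly the same result can be computed on a small file with standard shell tools. This is only an approximation for cross-checking and not how Hadoop implements the job: it assumes words are separated by whitespace, and count_words is a helper name invented for this post.

```shell
# Count whitespace-separated tokens in the given files and print
# "word<TAB>count" lines sorted by word (byte order).
# count_words is a hypothetical helper, not part of Hadoop.
count_words() {
    cat -- "$@" \
        | tr -s '[:space:]' '\n' \
        | sed '/^$/d' \
        | LC_ALL=C sort \
        | uniq -c \
        | awk '{ print $2 "\t" $1 }'
}

# Small self-contained demonstration on a temporary sample file.
sample="$(mktemp)"
printf 'to be or not to be\n' > "$sample"
count_words "$sample"
rm -f "$sample"
```

For the sample line this prints be 2, not 1, or 1 and to 2, one pair per line.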

Step 1 - Creating input directory.

$ mkdir /home/hduser/Desktop/input

Step 2 - Copy all text files from $HADOOP_HOME to /home/hduser/Desktop/input.

$ cp $HADOOP_HOME/*.txt /home/hduser/Desktop/input

Step 3 - Verify copy.

$ ls -l /home/hduser/Desktop/input

Step 4 - Submit the jar file to run. The sample WordCount example jar is in the $HADOOP_HOME/share/hadoop/mapreduce/ folder.


$ hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.4.jar wordcount /home/hduser/Desktop/input /home/hduser/Desktop/output
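
Note that the job refuses to start if the output directory already exists, so running the command a second time fails until the old directory is removed. A small helper along these lines can clear it first. It is a sketch: reset_output_dir is a name made up for this post, and OUTPUT_DIR should be set to whatever path you pass as the last argument of the job.

```shell
# Remove a previous local output directory so the job can be run again.
# Refuses to act on an empty argument to avoid deleting the wrong path.
# reset_output_dir is a hypothetical helper name.
reset_output_dir() {
    out="$1"
    if [ -z "$out" ]; then
        echo "usage: reset_output_dir <directory>" >&2
        return 1
    fi
    if [ -d "$out" ]; then
        rm -r -- "$out"
    fi
}

# Assumed to match the last argument of the hadoop jar command.
OUTPUT_DIR=/home/hduser/Desktop/output
reset_output_dir "$OUTPUT_DIR"
```

After this, the hadoop jar command can be submitted again.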

Step 5 - Verify output.

$ cat /home/hduser/Desktop/output/*
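
The output lists every word with its count, separated by a tab, which gets long quickly. To look only at the most frequent words, the result can be sorted by the count column. The sketch below assumes the result files are named part-r-00000 and so on, which is what I see for this example; top_words is a helper name invented for this post.

```shell
# Print the N most frequent words from WordCount output files
# (lines of the form "word<TAB>count"), highest count first.
# top_words is a hypothetical helper name.
top_words() {
    n="$1"
    shift
    cat -- "$@" | sort -t "$(printf '\t')" -k2,2nr | head -n "$n"
}

# Assumed output location and file naming from the steps above.
OUTPUT_DIR=/home/hduser/Desktop/output
if [ -f "$OUTPUT_DIR/part-r-00000" ]; then
    top_words 10 "$OUTPUT_DIR"/part-r-*
fi
```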
