Hadoop 2.6.4 standalone mode installation on Ubuntu 14.04
Hadoop is an Apache open source framework written in Java that allows distributed processing of large datasets across clusters of computers using simple programming models.
A Hadoop application works in an environment that provides distributed storage and computation across clusters of computers. Hadoop is designed to scale up from a single server to thousands of machines, each offering local computation and storage.
1) A machine with the Ubuntu 14.04 LTS operating system installed.
2) Apache Hadoop 2.6.4 software (the hadoop-2.6.4.tar.gz archive).
By default, Hadoop is configured to run in a non-distributed or standalone mode, as a single Java process. There are no daemons running and everything runs in a single JVM instance. HDFS is not used.
Step 1 - Update. Open a terminal (CTRL + ALT + T) and type the following sudo command. It refreshes the package index, so it is advisable to run it before installing any package, even if you have not added or removed any Software Sources.
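On Ubuntu this is most likely the standard package-index refresh. The sketch below assumes apt-get is the package tool in use.

```shell
# Refresh the list of available packages (needs sudo rights)
sudo apt-get update
```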
Step 2 - Installing Java 7.
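A minimal sketch for this step, assuming the OpenJDK 7 package (openjdk-7-jdk) from the Ubuntu 14.04 repositories is acceptable. If you prefer another Java 7 distribution, the package name will differ.

```shell
# Install Java 7 (assumption: OpenJDK rather than Oracle Java)
sudo apt-get install openjdk-7-jdk

# Confirm that Java is available
java -version
```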
Step 3 - Install the OpenSSH server. OpenSSH implements SSH, a cryptographic network protocol for operating network services securely over an unsecured network. The best known example application is remote login to computer systems by users.
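On Ubuntu 14.04 the server is normally installed from the openssh-server package, so the command is probably:

```shell
# Install the OpenSSH server package
sudo apt-get install openssh-server
```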
Step 4 - Create a Group. We will create a group, configure the group sudo permissions and then add the user to the group. Here 'hadoop' is a group name and 'hduser' is a user of the group.
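One common way to do this on Ubuntu is with addgroup and adduser. The sketch below uses the names from this step ('hadoop' and 'hduser') and assumes that 'hduser' does not exist yet.

```shell
# Create the 'hadoop' group
sudo addgroup hadoop

# Create the user 'hduser' with 'hadoop' as its primary group
# (adduser prompts for a password and user details)
sudo adduser --ingroup hadoop hduser
```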
Step 5 - Configure the sudo permissions for 'hduser'.
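The sudoers file is best edited through visudo, which checks the syntax before saving. Assuming that is the tool intended here, the command is:

```shell
# Open /etc/sudoers safely in the default editor
sudo visudo
```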
The sudoers file opens in nano, the default text editor on Ubuntu. In nano, CTRL + O writes (saves) the file.
Add the permissions to sudoers.
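A typical entry that gives 'hduser' full sudo rights looks like the line below. This is an assumption about the intended permissions, so narrow it if the user needs less access.

```
hduser ALL=(ALL:ALL) ALL
```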
Press CTRL + X to exit. If prompted, enter Y to save the file.
Step 6 - Creating hadoop directory.
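Assuming the directory meant here is /usr/local/hadoop, the one used in the following steps, it can be created with:

```shell
# Create the Hadoop installation directory (needs sudo because /usr/local is owned by root)
sudo mkdir -p /usr/local/hadoop
```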
Step 7 - Change the ownership and permissions of the directory /usr/local/hadoop. Here 'hduser' is an Ubuntu username.
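A sketch of this step, assuming the directory should belong to 'hduser' and the 'hadoop' group. The mode 755 is an illustrative choice, not taken from the text.

```shell
# Give hduser (group hadoop) ownership of the installation directory
sudo chown -R hduser:hadoop /usr/local/hadoop

# Owner gets full access; group and others may read and execute (assumed mode)
sudo chmod -R 755 /usr/local/hadoop
```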
Step 8 - Switch user. The su (switch user) command is used to execute commands with the privileges of another user account.
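Assuming the target account is 'hduser' from Step 4, the switch would look like this:

```shell
# Start a login shell as hduser (prompts for hduser's password)
su - hduser
```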
Step 9 - Change the directory to /home/hduser/Desktop. In my case the downloaded hadoop-2.6.4.tar.gz file is in the /home/hduser/Desktop folder. For you it might be in the Downloads folder instead, so check where it was saved.
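With the archive in the location described above, the command is simply the following. Substitute your own download folder if it differs.

```shell
# Go to the folder that holds hadoop-2.6.4.tar.gz
cd /home/hduser/Desktop
```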
Step 10 - Untar the hadoop-2.6.4.tar.gz file.
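The archive is gzip-compressed, so it can be unpacked with tar as sketched here. This assumes the current directory is the one that contains the file.

```shell
# Extract the archive; this creates a hadoop-2.6.4 folder in the current directory
tar xzf hadoop-2.6.4.tar.gz
```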
Step 11 - Move the contents of hadoop-2.6.4 folder to /usr/local/hadoop
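Assuming the archive was extracted in the current directory and 'hduser' owns /usr/local/hadoop, the move could be done like this:

```shell
# Move everything inside hadoop-2.6.4 into the installation directory
mv hadoop-2.6.4/* /usr/local/hadoop
```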
Step 12 - Edit the $HOME/.bashrc file to add the Java and Hadoop paths.
Add the following lines to the $HOME/.bashrc file.
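The exact lines depend on where Java is installed. The sketch below assumes OpenJDK 7 on 64-bit Ubuntu, so the JAVA_HOME path is an assumption, and Hadoop in /usr/local/hadoop.

```shell
# Assumed Java location; check /usr/lib/jvm for the directory on your machine
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64

# Hadoop installation directory used in the earlier steps
export HADOOP_HOME=/usr/local/hadoop

# Make the java and hadoop commands available in every shell
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
```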
Step 13 - Reload your changed $HOME/.bashrc settings
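In bash, the usual way to apply the changes to the current terminal without logging out is:

```shell
# Re-read .bashrc in the current shell
source $HOME/.bashrc
```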
Step 14 - Verify the Hadoop installation. This command simply displays the Hadoop version in the terminal.
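Assuming the PATH changes from Step 12 are active, the check is:

```shell
# Print the installed Hadoop version; the first line should mention 2.6.4
hadoop version
```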
Execution of WordCount Example
The following example copies the .txt files of the /usr/local/hadoop/ directory to use as input and then counts how many times each word occurs in them. Output is written to the given output directory.
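As a rough picture of the result a word count produces, the same tally can be sketched on a tiny sample with standard shell tools. This is only an illustration: the sample text and the wordcount_demo function are made up, and Hadoop does not run the job this way.

```shell
# Illustration only: tally words in a made-up sample, one "word<TAB>count" per line
wordcount_demo() {
  printf 'hadoop runs jobs\nhadoop counts words\n' \
    | tr -s '[:space:]' '\n' \
    | sort \
    | uniq -c \
    | awk '{ print $2 "\t" $1 }'
}

wordcount_demo
```

Each output line pairs a word with the number of times it appears, which is the same shape as the output of the WordCount job.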
Step 1 - Creating input directory.
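Assuming the input directory is /home/hduser/Desktop/input, the path used in the next step, it can be created with:

```shell
# Create the local input directory for the job
mkdir -p /home/hduser/Desktop/input
```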
Step 2 - Copy all text files from $HADOOP_HOME to /home/hduser/Desktop/input.
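A sketch of the copy, assuming HADOOP_HOME is set as in Step 12 and that the text files are the ones at the top level of the Hadoop directory:

```shell
# Copy the .txt files shipped with Hadoop into the input directory
cp $HADOOP_HOME/*.txt /home/hduser/Desktop/input
```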
Step 3 - Verify copy.
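Listing the input directory is enough to confirm that the files arrived:

```shell
# The copied .txt files should be listed here
ls -l /home/hduser/Desktop/input
```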
Step 4 - Submit the jar file to run. The sample WordCount jar is in the $HADOOP_HOME/share/hadoop/mapreduce/ folder.
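Assuming the examples jar shipped with Hadoop 2.6.4 and an output directory of /home/hduser/Desktop/output (an illustrative path), the job could be submitted as sketched here. Hadoop refuses to run if the output directory already exists, so remove it before rerunning.

```shell
# Run the bundled wordcount program: <input dir> <output dir>
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.4.jar \
  wordcount /home/hduser/Desktop/input /home/hduser/Desktop/output
```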
Step 5 - Verify output.
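In standalone mode the result is written to the local file system. Assuming the output path /home/hduser/Desktop/output, the counts can be inspected like this:

```shell
# List the output files; a _SUCCESS marker indicates the job finished
ls /home/hduser/Desktop/output

# Show the word counts produced by the reducer
cat /home/hduser/Desktop/output/part-r-00000
```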