Hadoop Mapper/Reducer Java Example
Hadoop is an open-source Apache framework written in Java that allows distributed processing of large datasets across clusters of computers using simple programming models.
A Hadoop application runs in an environment that provides distributed storage and computation across clusters of computers. Hadoop is designed to scale from a single server to thousands of machines, each offering local computation and storage.
1) A machine with the Ubuntu 14.04 LTS operating system installed.
2) Apache Hadoop 2.6.4 preinstalled (see How to install Hadoop on Ubuntu 14.04).
Hadoop Mapper/Reducer Example
MapReduce is a processing technique and programming model for distributed computing; in Hadoop it is implemented in Java. A MapReduce job consists of two tasks, Map and Reduce. The map task takes a set of data and converts it into another set of data in which individual elements are broken down into tuples (key/value pairs). The reduce task takes the output of a map as its input and combines those tuples into a smaller set of tuples. As the order of the name MapReduce implies, the reduce task is always performed after the map task.
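To make the map and reduce tasks described above more concrete, here is a minimal sketch of the same data flow for word counting. It deliberately uses only the plain JDK, not the Hadoop API, so it runs without a cluster. The class name WordCountSketch, its method names, and the sample lines are assumptions made for illustration; they are not the WordCount program or the input.txt used in the steps below.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-JDK sketch of the MapReduce data flow for word counting.
// It uses no Hadoop classes; every name here is hypothetical.
class WordCountSketch {

    // Map phase: break one line of text into (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group the values of all pairs by key, sorted by key.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: combine all values of one key into a single sum.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Runs map, shuffle and reduce in sequence over the given lines.
    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> mapped = new ArrayList<>();
        for (String line : lines) {
            mapped.addAll(map(line));
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : shuffle(mapped).entrySet()) {
            counts.put(entry.getKey(), reduce(entry.getKey(), entry.getValue()));
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample lines, not the contents of the post's input.txt.
        List<String> lines = Arrays.asList("hello hadoop", "hello mapreduce");
        wordCount(lines).forEach((word, count) -> System.out.println(word + "\t" + count));
    }
}
```

In a real Hadoop job the framework performs the shuffle step itself and runs the map and reduce code on different machines; this sketch only mirrors the order of the phases on a single machine.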
Step 1 - Add all Hadoop jar files to your Java project. Add the following jars.
Step 2 - Change the directory to /usr/local/hadoop/sbin
Step 3 - Start all Hadoop daemons.
Step 4 - Create the input.txt file. In my case, I have stored input.txt in the /home/hduser/Desktop/hadoop/ directory.
Step 5 - Add the following lines to the input.txt file.
Step 6 - Make a new input directory in HDFS
Step 7 - Copy input.txt from the local file system to HDFS.
Step 8 - Run your WordCount program by submitting the Java project's jar file to Hadoop. Creating the jar file is left to you.
Step 9 - Now you can see the output files.
Step 10 - Don't forget to stop the Hadoop daemons.