
Steps to create jar file and execute word count problem in mapper reducer

1. First open Eclipse -> File -> New -> Java Project -> name it WordCount -> Finish.

2. Create three Java classes in the project: WCDriver (containing the main method), WCMapper, and WCReducer.

3. You have to include two sets of reference libraries for that:

Right-click on the project -> Build Path -> Configure Build Path. You can see the Add External JARs option on the right-hand side.
3.1 Go to C:\hadoop-3.3.6\share\hadoop\common and select all the JAR files listed in this folder.
3.2 Go to C:\hadoop-3.3.6\share\hadoop\mapreduce and select all the JAR files listed in this folder.
3.3 Click Apply.

4. Create a class named WCMapper in the WordCount project.

Mapper code: copy this program into the WCMapper class file.

// Importing libraries
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class WCMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, IntWritable> {

    // Map function: emits (word, 1) for every word in the input line
    public void map(LongWritable key, Text value,
                    OutputCollector<Text, IntWritable> output,
                    Reporter rep) throws IOException {
        String line = value.toString();

        // Splitting the line on spaces
        for (String word : line.split(" ")) {
            if (word.length() > 0) {
                output.collect(new Text(word), new IntWritable(1));
            }
        }
    }
}
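To see what the map step emits, the same tokenize-and-emit loop can be run as plain Java without Hadoop (the class name and sample line below are made up for illustration):

```java
public class MapDemo {
    public static void main(String[] args) {
        // Same splitting logic as WCMapper, applied to one sample line
        String line = "deer bear river deer";
        for (String word : line.split(" ")) {
            if (word.length() > 0) {
                // Where WCMapper would call output.collect(...), we just print
                System.out.println(word + "\t1");
            }
        }
    }
}
```

Each occurrence of a word produces its own (word, 1) pair; the framework then groups the pairs by key before handing them to the reducer.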

5. Reducer code: copy this program into the WCReducer class file.

// Importing libraries
import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class WCReducer extends MapReduceBase
        implements Reducer<Text, IntWritable, Text, IntWritable> {

    // Reduce function: sums the counts collected for each word
    public void reduce(Text key, Iterator<IntWritable> value,
                       OutputCollector<Text, IntWritable> output,
                       Reporter rep) throws IOException {
        int count = 0;

        // Counting the frequency of each word
        while (value.hasNext()) {
            IntWritable i = value.next();
            count += i.get();
        }

        output.collect(key, new IntWritable(count));
    }
}
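The reduce step can likewise be checked standalone; this sketch replaces Hadoop's IntWritable iterator with plain Integers (the key "deer" and its values are invented for the example):

```java
import java.util.Arrays;
import java.util.Iterator;

public class ReduceDemo {
    public static void main(String[] args) {
        // Values the framework would pass for key "deer" if the
        // mapper had emitted (deer, 1) twice
        Iterator<Integer> values = Arrays.asList(1, 1).iterator();

        // Same summing loop as WCReducer
        int count = 0;
        while (values.hasNext()) {
            count += values.next();
        }
        System.out.println("deer\t" + count); // prints "deer	2"
    }
}
```

The reducer never sees individual (word, 1) pairs; the shuffle phase has already grouped all values for a key into one iterator.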

6. Driver code: copy this program into the WCDriver class file. Note that ToolRunner passes along only the arguments remaining after the generic options, so the input and output paths arrive as args[0] and args[1].

// Importing libraries
import java.io.IOException;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class WCDriver extends Configured implements Tool {

    public int run(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("Please give valid inputs");
            return -1;
        }

        JobConf conf = new JobConf(WCDriver.class);
        // args[0] is the input path, args[1] is the output path
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        conf.setMapperClass(WCMapper.class);
        conf.setReducerClass(WCReducer.class);
        conf.setMapOutputKeyClass(Text.class);
        conf.setMapOutputValueClass(IntWritable.class);
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);
        JobClient.runJob(conf);
        return 0;
    }

    // Main method
    public static void main(String[] args) throws Exception {
        int exitCode = ToolRunner.run(new WCDriver(), args);
        System.out.println(exitCode);
    }
}
7. Now make a JAR file:
Right-click on the project -> Export -> select JAR file as the export destination -> name the JAR file (WordCount.jar) -> click Next -> finally click Finish. Then copy this file into
C:/hadoop-3.3.6/share/hadoop/mapreduce/

8. Create a text file named test.txt containing some repeated words.
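For example, a minimal test.txt could be created from the command line (shown with Unix-style commands; on Windows cmd, `type` replaces `cat`, and the word list is arbitrary):

```shell
# Create a small input file whose words repeat (content is arbitrary)
echo "deer bear river deer car river deer" > test.txt
cat test.txt
```

Any text file works; repeated words just make the counts in the final output easy to verify by eye.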

9. Copy the data file into an HDFS input directory (create the directory first if it does not already exist):

C:\hadoop-3.3.6\sbin>hadoop fs -mkdir /input3
C:\hadoop-3.3.6\sbin>hadoop fs -put C:/Users/IIITK/Documents/files/test.txt /input3

10. List the contents of the HDFS input directory:

C:\hadoop-3.3.6\sbin>hadoop fs -ls /input3/

11. Display the contents of the test.txt file (`hadoop dfs` is deprecated; use `hadoop fs`):

C:\hadoop-3.3.6\sbin>hadoop fs -cat /input3/test.txt

12. Run the WordCount.jar file saved in the shared directory of Hadoop:
C:\hadoop-3.3.6\sbin>hadoop jar C:/hadoop-3.3.6/share/hadoop/mapreduce/WordCount.jar WCDriver /input3 /output3

13. Display the output stored in the /output3 directory (the old mapred API writes the reducer's results to part-00000; `-cat` on the bare directory fails):

C:\hadoop-3.3.6\sbin>hadoop fs -cat /output3/part-00000

14. We can also see the output in the browser:

Open localhost:9870, go to Utilities -> Browse the file system, and navigate to /output3.
