DA Lab Program-2
Steps to be followed:
• Step-4: Mapper code. Copy and paste the following into the WCMapper
Java class file.
// Importing libraries
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;
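// Map function
public class WCMapper extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {
    public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter rep) throws IOException {
        // Assumption: words are separated by spaces; emit (word, 1) for every non-empty token
        for (String word : value.toString().split(" ")) {
            if (word.length() > 0) {
                output.collect(new Text(word), new IntWritable(1));
            }
        }
    }
}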
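The map step can be tried out without a Hadoop installation. The following is a minimal sketch in plain Java, assuming that words are separated by spaces; the class name MapStepSketch and its map method are hypothetical stand-ins and are not part of the lab's WCMapper class.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-alone sketch of the map step (no Hadoop classes involved).
class MapStepSketch {

    // Splits a line on spaces and emits a (word, 1) pair for each non-empty token.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.split(" ")) {
            // Consecutive spaces produce empty tokens, which are skipped here.
            if (word.length() > 0) {
                pairs.add(new SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        // prints [to=1, be=1, or=1, not=1, to=1, be=1]
        System.out.println(map("to be  or not to be"));
    }
}
```

Note that the length check matters: the double space in the sample line yields an empty token, which would otherwise be counted as a word.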
• Step-5: Reducer code. Copy and paste the following into the WCReducer
Java class file.
// Importing libraries
import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
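// Reduce function
public class WCReducer extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> {
    public void reduce(Text key, Iterator<IntWritable> value, OutputCollector<Text, IntWritable> output, Reporter rep) throws IOException {
        int count = 0;
        // Add up every count received for this word
        while (value.hasNext()) {
            IntWritable i = value.next();
            count += i.get();
        }
        // Emit the word together with its total count
        output.collect(key, new IntWritable(count));
    }
}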
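The summing loop of the reduce step can be checked in isolation. The sketch below uses plain Java integers in place of IntWritable; the class name ReduceStepSketch and its reduce method are hypothetical and only illustrate the loop.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical stand-alone sketch of the reduce step (no Hadoop classes involved).
class ReduceStepSketch {

    // Adds up all the counts that were emitted for a single word.
    static int reduce(Iterator<Integer> values) {
        int count = 0;
        while (values.hasNext()) {
            count += values.next();
        }
        return count;
    }

    public static void main(String[] args) {
        // Three (word, 1) pairs for the same word give a total of 3.
        System.out.println(reduce(Arrays.asList(1, 1, 1).iterator())); // prints 3
    }
}
```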
• Step-6: Driver code. Copy and paste the following into the WCDriver
Java class file.
// Importing libraries
import java.io.IOException;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
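public class WCDriver extends Configured implements Tool {
    public int run(String[] args) throws IOException {
        // Two arguments are expected: the input path and the output path
        if (args.length < 2) {
            return -1;
        }
        JobConf conf = new JobConf(WCDriver.class);
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        conf.setMapperClass(WCMapper.class);
        conf.setReducerClass(WCReducer.class);
        conf.setMapOutputKeyClass(Text.class);
        conf.setMapOutputValueClass(IntWritable.class);
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);
        JobClient.runJob(conf);
        return 0;
    }

    // Main Method
    public static void main(String[] args) throws Exception {
        int exitCode = ToolRunner.run(new WCDriver(), args);
        System.out.println(exitCode);
    }
}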
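To see how the mapper and reducer fit together once the driver runs them, the whole flow can be imitated locally. This is only a rough sketch under stated assumptions: it runs in a single process, splits on spaces, and uses a sorted map to stand in for the grouping that Hadoop performs between the two phases. The class name LocalWordCountSketch and its wordCount method are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical single-process imitation of map -> group by key -> reduce.
class LocalWordCountSketch {

    static Map<String, Integer> wordCount(List<String> lines) {
        // Map phase plus grouping: collect a 1 under each word (TreeMap keeps keys sorted).
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (String word : line.split(" ")) {
                if (word.length() > 0) {
                    grouped.computeIfAbsent(word, k -> new ArrayList<>()).add(1);
                }
            }
        }
        // Reduce phase: sum the grouped values for each word.
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            int count = 0;
            for (int one : entry.getValue()) {
                count += one;
            }
            counts.put(entry.getKey(), count);
        }
        return counts;
    }

    public static void main(String[] args) {
        // prints {hadoop=1, hello=2, world=1}
        System.out.println(wordCount(Arrays.asList("hello world", "hello hadoop")));
    }
}
```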
• Step-8: Open a terminal and change to the workspace directory. You must
be in the same directory as the jar file you have just created. Then run
the command below:
cat WCFile.text
• Step-9: Now run the command below to copy the input file into
HDFS:
• Step-10: To run the jar file, execute the command below:
• Step-11: After the job has finished, you can see the result in WCOutput
or by running the following command in the terminal:
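If the result is to be processed further, the output lines can be read back into a map. The sketch below assumes that the job uses Hadoop's default text output format, which writes one key and value per line separated by a tab; the class name OutputLineSketch and its parse method are hypothetical, and the sample lines are made up for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper that parses "word<TAB>count" lines, the assumed output layout.
class OutputLineSketch {

    static Map<String, Integer> parse(List<String> lines) {
        // LinkedHashMap preserves the order in which the lines were read.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            int tab = line.indexOf('\t');
            if (tab < 0) {
                continue; // skip lines that do not match the assumed layout
            }
            String word = line.substring(0, tab);
            int count = Integer.parseInt(line.substring(tab + 1).trim());
            counts.put(word, count);
        }
        return counts;
    }

    public static void main(String[] args) {
        // prints {hadoop=1, hello=2}
        System.out.println(parse(Arrays.asList("hadoop\t1", "hello\t2")));
    }
}
```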