Apache Hadoop MapReduce Commands
Overview
User Commands
archive
archive-logs
classpath
distcp
job
pipes
queue
version
envvars
Administration Commands
historyserver
hsadmin
frameworkuploader
Overview
All mapreduce commands are invoked by the bin/mapred script. Running the mapred script without any
arguments prints the description for all commands.

Usage: mapred [SHELL_OPTIONS] COMMAND [GENERIC_OPTIONS] [COMMAND_OPTIONS]

Hadoop has an option parsing framework that handles generic options as well as running classes.
COMMAND_OPTIONS    Description
SHELL_OPTIONS    The common set of shell options. These are documented on the Hadoop Commands Reference page.
GENERIC_OPTIONS    The common set of options supported by multiple commands. See the Hadoop Commands Reference for more information.
COMMAND COMMAND_OPTIONS    Various commands with their options are described in the following sections. The commands have been grouped into User Commands and Administration Commands.
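For example, a complete invocation combines these pieces (the job id below is a hypothetical placeholder in the standard YARN id format):

  mapred --loglevel DEBUG job -status job_1566513456789_0001

Here --loglevel DEBUG is a SHELL_OPTION, job is the COMMAND, and -status job_1566513456789_0001 are its COMMAND_OPTIONS.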
User Commands
Commands useful for users of a Hadoop cluster.
archive
Creates a hadoop archive. More information can be found at Hadoop Archives Guide.
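As an illustration (the paths are hypothetical), an archive named foo.har can be created from two directories under /user/hadoop:

  mapred archive -archiveName foo.har -p /user/hadoop dir1 dir2 /user/hadoop/archived

The -archiveName and -p (parent path) options are described in the Hadoop Archives Guide.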
archive-logs
A tool to combine YARN aggregated logs into Hadoop archives to reduce the number of files in HDFS. More
information can be found at Hadoop Archive Logs Guide.
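In its simplest form the tool runs with defaults:

  mapred archive-logs

Options for tuning eligibility, sizing, and memory are covered in the Hadoop Archive Logs Guide.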
classpath
Usage: mapred classpath [--glob |--jar <path> |-h |--help]

COMMAND_OPTION    Description
--glob    Expand wildcards.
--jar path    Write the classpath as a manifest in a jar named path.
-h, --help    Print help.
Prints the class path needed to get the Hadoop jar and the required libraries. If called without arguments, it
prints the classpath set up by the command scripts, which is likely to contain wildcards in the classpath entries.
Additional options print the classpath after wildcard expansion or write the classpath into the manifest of a jar file.
The latter is useful in environments where wildcards cannot be used and the expanded classpath exceeds the
maximum supported command line length.
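For instance (the jar path below is just an illustration):

  mapred classpath --glob
  mapred classpath --jar /tmp/hadoop-classpath.jar

The first prints the fully expanded classpath; the second writes it into the manifest of /tmp/hadoop-classpath.jar.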
distcp
Copies files or directories recursively. More information can be found at Hadoop DistCp Guide.
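A minimal illustrative copy between two clusters (the NameNode hostnames are hypothetical):

  mapred distcp hdfs://nn1:8020/source hdfs://nn2:8020/destination

See the DistCp Guide for update, overwrite, and tuning options.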
job
Command to interact with MapReduce jobs.

COMMAND_OPTION    Description
-submit job-file    Submits the job.
-status job-id    Prints the map and reduce completion percentage and all job counters.
-counter job-id group-name counter-name    Prints the counter value.
-kill job-id    Kills the job.
-events job-id from-event-# #-of-events    Prints the events' details received by jobtracker for the given range.
-history [all] jobHistoryFile|jobId [-outfile file] [-format human|json]    Prints job details, failed and killed task details. More details about the job, such as successful tasks, task attempts made for each task, task counters, etc., can be viewed by specifying the [all] option. An optional file output path (instead of stdout) can be specified. The format defaults to human-readable but can be changed to JSON with the [-format] option.
-list [all]    Displays jobs which are yet to complete. -list all displays all jobs.
-kill-task task-id    Kills the task. Killed tasks are NOT counted against failed attempts.
-fail-task task-id    Fails the task. Failed tasks are counted against failed attempts.
-set-priority job-id priority    Changes the priority of the job. Allowed priority values are VERY_HIGH, HIGH, NORMAL, LOW, VERY_LOW.
-list-active-trackers    Lists all the active NodeManagers in the cluster.
-list-blacklisted-trackers    Lists the blacklisted task trackers in the cluster. This command is not supported in MRv2-based clusters.
-list-attempt-ids job-id task-type task-state    Lists the attempt-ids based on the task type and the status given. Valid values for task-type are REDUCE, MAP. Valid values for task-state are running, pending, completed, failed, killed.
-logs job-id task-attempt-id    Dumps the container log for a job if task-attempt-id is not specified; otherwise dumps the log for the task with the specified task-attempt-id. The logs are written to standard output.
-config job-id file    Downloads the job configuration file.
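Some illustrative invocations (the job and task ids are hypothetical placeholders in the standard YARN id format):

  mapred job -list all
  mapred job -status job_1566513456789_0001
  mapred job -kill-task task_1566513456789_0001_m_000000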
pipes
Runs a pipes job.
Usage: mapred pipes [-conf <path>] [-jobconf <key=value>, <key=value>, ...] [-input <path>]
[-output <path>] [-jar <jar file>] [-inputformat <class>] [-map <class>] [-partitioner <class>]
[-reduce <class>] [-writer <class>] [-program <executable>] [-reduces <num>]
COMMAND_OPTION Description
-conf path Configuration for job
-jobconf key=value, key=value, … Add/override configuration for job
-input path Input directory
-output path Output directory
-jar jar file Jar filename
-inputformat class InputFormat class
-map class Java Map class
-partitioner class Java Partitioner
-reduce class Java Reduce class
-writer class Java RecordWriter
-program executable Executable URI
-reduces num Number of reduces
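An illustrative run (the input/output directories and the pipes executable URI are hypothetical):

  mapred pipes -input /user/hadoop/in -output /user/hadoop/out -program hdfs:///apps/bin/wordcount-pipes

-program points at the compiled C++ pipes executable; -input and -output name HDFS directories.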
queue
Command to interact with and view job queue information.
Usage: mapred queue [-list] | [-info <job-queue-name> [-showJobs]] | [-showacls]

COMMAND_OPTION    Description
-list    Gets the list of job queues configured in the system, along with the scheduling information associated with them.
-info job-queue-name [-showJobs]    Displays the job queue information and associated scheduling information of the particular job queue. If the -showJobs option is present, a list of jobs submitted to the particular job queue is displayed.
-showacls    Displays the queue name and associated queue operations allowed for the current user. The list consists of only those queues to which the user has access.
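For example (default is an illustrative queue name; substitute one configured on your cluster):

  mapred queue -list
  mapred queue -info default -showJobs
  mapred queue -showacls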
version
Prints the version.
Usage: mapred version

envvars
Display computed Hadoop environment variables.
Usage: mapred envvars
Administration Commands
Commands useful for administrators of a Hadoop cluster.
historyserver
Starts the JobHistoryServer.
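In Hadoop 3 the history server is usually run as a daemon:

  mapred --daemon start historyserver

Running mapred historyserver directly starts it in the foreground instead.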
hsadmin
Runs a MapReduce hsadmin client to execute JobHistoryServer administrative commands.
COMMAND_OPTION Description
-refreshUserToGroupsMappings Refresh user-to-groups mappings
-refreshSuperUserGroupsConfiguration Refresh superuser proxy groups mappings
-refreshAdminAcls Refresh acls for administration of Job history server
-refreshLoadedJobCache Refresh loaded job cache of Job history server
-refreshJobRetentionSettings Refresh job history period, job cleaner settings
-refreshLogRetentionSettings Refresh log retention period and log retention check interval
-getGroups [username] Get the groups which given user belongs to
-help [cmd] Displays help for the given command or all commands if none is specified.
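For example (alice is a placeholder username):

  mapred hsadmin -refreshJobRetentionSettings
  mapred hsadmin -getGroups alice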
frameworkuploader
Collects framework jars and uploads them to HDFS as a tarball.
COMMAND_OPTION    Description
-input classpath    This is the input classpath that is searched for jar files to be included in the tarball.
-fs filesystem    The target file system. Defaults to the default filesystem set by fs.defaultFS.
-target target    This is the target location of the framework tarball, optionally followed by a # with the localized alias. An example would be /usr/lib/framework.tar#framework. Make sure the target directory is readable by all users but not writable by anyone other than administrators, to protect cluster security.
-blacklist list    This is a comma-separated regex array used to filter the jar file names to exclude from the classpath. It can be used, for example, to exclude test jars or Hadoop services that are not necessary to localize.
-whitelist list    This is a comma-separated regex array used to include certain jar files. This can be used to provide additional security, so that no external source can include malicious code in the classpath when the tool runs.
-nosymlink    This flag can be used to exclude symlinks that point to the same directory. It is not widely used. For example, /a/foo.jar and a symlink /a/bar.jar that points to /a/foo.jar would normally add foo.jar and bar.jar to the tarball as separate files despite them actually being the same file. This flag would make the tool exclude /a/bar.jar so only one copy of the file is added.
-initialReplication num    This is the replication count that the framework tarball is created with. It is safe to leave this value at the default of 3. This is the tested scenario.
-finalReplication num    The uploader tool sets the replication once all blocks are collected and uploaded. If quick initial startup is required, then it is advised to set this to the commissioned node count divided by two, but not more than 512.
-acceptableReplication num    The tool will wait until the tarball has been replicated this number of times before exiting. This should be a replication count less than or equal to the value in finalReplication, typically 90% of that value, to accommodate failing nodes.
-timeout seconds    A timeout in seconds to wait to reach acceptableReplication before the tool exits. If acceptableReplication has not been reached in time, the tool logs an error and returns.
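An illustrative upload (the target path is hypothetical):

  mapred frameworkuploader -target hdfs:///mapred/framework/framework.tar#framework -blacklist .*test.*

The resulting path, including the #framework alias, is the value a cluster would then reference from mapreduce.application.framework.path.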