
Big Data & Hadoop Course Overview

This document outlines an elective course on big data that covers topics such as MapReduce, Hadoop, NoSQL databases like MongoDB and HBase, and Spark. The instructor is Jnaneshwar Bohara, who has an M.Sc. in computer science and is a certified scrum master and senior Java programmer. The course will provide hands-on experience with tools used for big data and teach students how to write MapReduce programs, use Hadoop, query and analyze data in NoSQL databases, and more. Key frameworks and technologies that will be covered include HDFS, MapReduce, Hadoop, MongoDB, HBase, and Spark.

Uploaded by

Prajwal khanal

Elective Course on Big Data

Jnaneshwar Bohara
Big Data - Hadoop | BoharaG 1
Know Your Instructor

 Jnaneshwar Bohara
 M.Sc. Computer System and Knowledge Engineering, IOE, TU (Gold Medal)
 Certified Scrum Master
 Senior Java Programmer
 Big Data Analyst

Know Your Instructor

 Jnaneshwar Bohara
 Researcher on Big Data and Bioinformatics

https://www.amazon.com/MapReduce-Approach-Longest-Subsequence-BioSequences/dp/3659680508

How Huge Big Data is!



What is Big Data?

A collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications.

Topics
 Introduction
 MapReduce
 Hadoop
 Hands on Hadoop
 NoSQL
 MongoDB
 HBase
 Spark



Introduction

 What is Big Data
 Characteristics of Big Data
 Current Trends in Big Data
 Real-Life Applications of Big Data
 Scope and Challenges of Big Data
 Orientation of Practical (Tools and Techniques)



MapReduce
 Functional Programming
 What is MapReduce?
 How Does MapReduce Work?
 Distributed Execution Overview
 Data Distribution
 Use cases of MapReduce
 Anatomy of MapReduce Program
 MapReduce programs in Java
 Basic MapReduce API Concepts
 Writing MapReduce Driver, Mappers, and Reducers in Java
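
The map, shuffle, and reduce steps named in the topics above can be sketched in plain Python as a word count. This is only an in-memory illustration of the programming model, not Hadoop's API: the function names `map_phase`, `shuffle_phase`, `reduce_phase`, and `word_count` are hypothetical, and a real job would distribute each phase across a cluster.

```python
from collections import defaultdict
from functools import reduce


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: fold the list of counts for one key into a single total."""
    return key, reduce(lambda a, b: a + b, values, 0)


def word_count(lines):
    """Run the three phases in order on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

Calling `word_count(["big data", "Big Hadoop"])` groups both spellings of "big" under one key because the mapper lowercases each word. The mapper and reducer are pure functions with no shared state, which is the functional-programming property that lets a framework run them in parallel.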
Hadoop
 What is Hadoop?
 History of Hadoop
 Motivations for Hadoop
 The Hadoop Ecosystem
 Hadoop Master/Slave Architecture
 Hadoop Daemons
 Hadoop Configuration Modes
 Uses for Hadoop
 Hadoop Cluster Setup
 Troubleshooting of installation and running programs in Hadoop cluster
Hands on Hadoop
 Basic Concepts of Java Programming for Hadoop Developers
 Basic Concepts of Linux for Working with Hadoop
 Basic HDFS Commands
 Compile and Run Hadoop Programs using Command Line
 Use Eclipse IDE for Hadoop Programming
 Use Python in Hadoop
 Write your own MapReduce Programs to solve real-life problems
 Use different Data Types and Formats in Hadoop
 Analyze Big Data (CSV and JSON) in your MapReduce Program
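
As a rough sketch of how Python can be used in Hadoop to analyze CSV data, the mapper and reducer below follow the Hadoop Streaming convention of tab-separated key/value lines, with the reducer receiving its input sorted by key. The two-column `city,amount` layout and the function names are assumptions made for illustration; they do not come from the course material.

```python
import csv
from itertools import groupby


def mapper(lines):
    """Parse CSV rows of the assumed form 'city,amount' and emit 'city<TAB>amount'."""
    for row in csv.reader(lines):
        if len(row) != 2:
            continue  # skip malformed rows instead of failing the whole job
        city, amount = row
        try:
            float(amount)
        except ValueError:
            continue  # skip the header row and any non-numeric amounts
        yield f"{city}\t{amount}"


def reducer(sorted_lines):
    """Sum the amounts per city; input must already be sorted by key."""
    parsed = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for city, group in groupby(parsed, key=lambda pair: pair[0]):
        total = sum(float(value) for _, value in group)
        yield f"{city}\t{total}"
```

In an actual streaming job, each function would live in its own script that reads `sys.stdin` and prints each result line, and Hadoop would perform the sort between the two. Locally, the same flow can be imitated with `reducer(sorted(mapper(rows)))`.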



NoSQL

 Types of Data
 What is NoSQL?
 Why NoSQL?
 Types of NoSQL Databases



MongoDB
 Document vs. Relational Databases
 Installing MongoDB
 MongoDB – Collections
 MongoDB – Documents
 Object Ids
 Queries on MongoDB
 Aggregation Pipeline
 Nested Documents
 Twitter data analysis using MongoDB



HBase
 HBase: Overview
 HBase vs. RDBMS
 HBase vs. HDFS
 HBase Architecture
 HBase Data Model
 HBase: Keys and Column Families
 HBase Regions
 Creating a Table
 Writing Queries to insert data into and retrieve data from HBase



Spark

 What is Spark?
 Spark Core
 Spark SQL
 Spark SQL – Handling JSON
 Spark SQL – Handling CSV
 Spark Streaming



Thank You!
