MISRIMAL NAVAJEE MUNOTH JAIN
ENGINEERING COLLEGE
OWNED AND MANAGED BY TAMILNADU EDUCATIONAL AND MEDICAL FOUNDATION
A Jain Minority Institution
Approved by AICTE & Programmes Accredited by NBA, New Delhi
All Programmes Recognized by the Government of Tamil Nadu and Affiliated to Anna University, Chennai
Guru Marudhar Kesari Building, Jyothi Nagar, Rajiv Gandhi Salai, OMR, Thoraipakkam, Chennai - 600 097
DEPARTMENT OF ARTIFICIAL INTELLIGENCE AND
DATA SCIENCE
CCS334 - BIG DATA ANALYTICS LABORATORY
REGULATION - 2021
NAME:
REGISTER NUMBER:
YEAR/SEMESTER: III / V
DEPARTMENT OF ARTIFICIAL INTELLIGENCE AND
DATA SCIENCE
VISION
To produce high quality, creative and ethical engineers and technologists
contributing effectively to the ever-advancing Artificial Intelligence and Data
Science field.
MISSION
To educate future software engineers with strong fundamentals by
continuously improving the teaching-learning methodologies using
contemporary aids.
To produce ethical engineers/researchers by instilling the values of
humility, humaneness, honesty and courage to serve the society.
To create a knowledge hub of Artificial Intelligence and Data Science
with everlasting urge to learn by developing, maintaining and continuously
improving the resources.
Register No:
BONAFIDE CERTIFICATE
This is to certify that this is a bonafide record of the work done by
Mr./Ms. _______________ of III YEAR / V SEM
B.TECH - ARTIFICIAL INTELLIGENCE AND DATA SCIENCE in
CCS334- BIG DATA ANALYTICS LABORATORY during the Academic year
2023 — 2024.
Faculty-in-charge Head of the Department
Submitted for the University Practical Examination held on :_/_/
Internal Examiner External Examiner
DATE: DATE:
CCS334 BIG DATA ANALYTICS LABORATORY
COURSE OUTCOMES
Describe big data and use cases from selected business domains.
Explain NoSQL big data management.
Install, configure, and run Hadoop and HDFS.
Perform map-reduce analytics using Hadoop.
Use Hadoop-related tools such as HBase, Cassandra, Pig, and Hive for big data
analytics.

CCS334 BIG DATA ANALYTICS LABORATORY
CONTENT

EX. NO.   EXPERIMENTS                                                        PAGE NO.   SIGNATURE

1. Downloading and installing Hadoop; Understanding different Hadoop modes. Startup scripts, Configuration files.
2. Hadoop Implementation of file management tasks, such as Adding files and directories, retrieving files and Deleting files.
3. Implementation of Matrix Multiplication with Hadoop Map Reduce.
4. Run a basic Word Count Map Reduce program to understand the Map Reduce Paradigm.
5. Installation of Hive along with practice examples.
6. Installation of HBase along with practice examples. Installing Thrift.
7. Practice importing and exporting data from various databases.

SYLLABUS

CCS334 BIG DATA ANALYTICS LABORATORY
COURSE OBJECTIVES:
To understand big data.
To learn and use NoSQL big data management.
To learn mapreduce analytics using Hadoop and related tools.
To work with map reduce applications.
To understand the usage of Hadoop related tools for Big Data Analytics.
Tools: Cassandra, Hadoop, Java, Pig, Hive and HBase.
Suggested Exercises:
1. Downloading and installing Hadoop; Understanding different Hadoop modes.
Startup scripts, Configuration files.
2. Hadoop Implementation of file management tasks, such as Adding files and
directories, retrieving files and Deleting files
3. Implementation of Matrix Multiplication with Hadoop Map Reduce.
4. Run a basic Word Count Map Reduce program to understand Map Reduce
Paradigm.
5. Installation of Hive along with practice examples.
6. Installation of HBase, Installing Thrift along with practice examples.
7. Practice importing and exporting data from various databases.

EX. NO: 1   DOWNLOADING AND INSTALLING HADOOP; UNDERSTANDING
DATE:       DIFFERENT HADOOP MODES. STARTUP SCRIPTS,
            CONFIGURATION FILES.
AIM:
To download and install Hadoop, and to understand the different Hadoop modes, startup
scripts, and configuration files.
PREREQUISITES TO INSTALL HADOOP ON WINDOWS
VIRTUAL BOX (for Linux): It is used for installing the operating system on it.
OPERATING SYSTEM: You can install Hadoop on Windows or Linux based
operating systems. Ubuntu and CentOS are very commonly used.
JAVA: You need to install the Java 8 package on your system.
HADOOP: You require the latest version of Hadoop.
1. Install Java
Java JDK link to download:
[Link]
Extract and install Java in C:\Java
Open cmd and type -> javac -version
(Command Prompt: "javac -version" displays the installed Java compiler version, confirming the installation.)
2. Download Hadoop
[Link]
Extract to C:\Hadoop
3. Set the path JAVA_HOME environment variable.
4. Set the path HADOOP_HOME environment variable.
(Screens: This PC -> Properties -> Advanced system settings -> Environment Variables. JAVA_HOME is set to the JDK bin path, e.g. C:\Java\jdk1.8.0_241\bin, and HADOOP_HOME is set to C:\hadoop-3.3.0\bin; both entries are also added to the Path variable.)
Downe Fe5. Configurations
Edit file C:/Hadoop-3,3.0/etc/hadoop/[Link],paste the xml code in folder and save
[Link]
hdfs://localhost:9000
Rename “[Link]” to “[Link]” and edit this file C:/Hadoop-
3.3.0/ete/hadoop/[Link], paste xml code and save this file.
[Link]
yam
Create folder “data” under “C:\Hadoop-3.3.0"
Create folder “datanode” under “C:\Hadoop-3.3.0\data”
Create folder “namenode” under “C:\Hadoop-3.3.0\data”
Edit file C:\Hadoop-3.3.0/ete/hadoop/[Link],
paste xml code and save this file,
[Link]
1
[Link]
/hadoop-3.3.0/data/namenode
[Link]/hadoop-3.3.0/data/datanode
Edit file C:/Hadoop-3.3.0/etc/hadoop/[Link],,
paste xml code and save this file.
[Link]-services
mapreduce_shufile
[Link]
[Link] ShufileHandler
Edit file C:/Hadoop-3.3.0/ete/hadoop/[Link]
by closing the command line
“JAVA_HOME=%JAVA_HOME%” instead of set “JAVA_HOME=C:\Java”
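The property values set above can optionally be sanity-checked from Java before moving on. The following is a small illustrative sketch and not part of the prescribed procedure; the class name ConfigCheck and the file paths are assumptions, and the Hadoop client libraries must be on the classpath.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;

public class ConfigCheck {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Assumed locations of the files edited in step 5
        conf.addResource(new Path("C:/hadoop-3.3.0/etc/hadoop/core-site.xml"));
        conf.addResource(new Path("C:/hadoop-3.3.0/etc/hadoop/hdfs-site.xml"));
        // Print the values that were configured above
        System.out.println("fs.defaultFS          = " + conf.get("fs.defaultFS"));
        System.out.println("dfs.replication       = " + conf.get("dfs.replication"));
        System.out.println("dfs.namenode.name.dir = " + conf.get("dfs.namenode.name.dir"));
    }
}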
6. Hadoop Configurations
Download the patched bin folder:
[Link]
or (for Hadoop 3)
[Link]
Copy the bin folder and replace the existing bin folder in
C:\Hadoop-3.3.0\bin
Format the NameNode:
Open cmd and type the command "hdfs namenode -format"
7. Testing
Open cmd and change directory to C:\Hadoop-3.3.0\sbin
Type start-all.cmd
(Or you can start the daemons separately.)
Start the namenode and datanode with this command:
Type start-dfs.cmd
Start YARN through this command:
Type start-yarn.cmd
Make sure these apps are running:
* Hadoop Namenode
* Hadoop Datanode
* YARN Resource Manager
* YARN Node Manager
Open the Resource Manager web UI ("All Applications" page): [Link]
Open the NameNode web UI (Overview page for 'localhost:9000'): [Link]
Hadoop installed Successfully...
RESULT:
Hadoop was downloaded and installed, the different Hadoop modes were understood, and the
startup scripts and configuration files were successfully implemented.

EX. NO: 2   HADOOP IMPLEMENTATION OF FILE MANAGEMENT
DATE:       TASKS, SUCH AS ADDING FILES AND DIRECTORIES,
            RETRIEVING FILES AND DELETING FILES.
AIM:
To implement the following file management tasks in Hadoop:
1. Adding files and directories
2. Retrieving files
3. Deleting files
1. Create a directory in HDFS at the given path(s).
Usage:
hadoop fs -mkdir <paths>
Example:
hadoop fs -mkdir /user/saurzcode/dir1 /user/saurzcode/dir2
2. List the contents of a directory.
Usage:
hadoop fs -ls <args>
Example:
hadoop fs -ls /user/saurzcode
3. Upload and download a file in HDFS.
Upload: hadoop fs -put
Copy single src file, or multiple src files, from the local file system to the Hadoop file system.
Usage:
hadoop fs -put <localsrc> ... <dst>
Example:
hadoop fs -put /home/saurzcode/[Link] /user/saurzcode/dir3/
Download: hadoop fs -get
Copies/downloads files to the local file system.
Usage:
hadoop fs -get <src> <localdst>
Example:
hadoop fs -get /user/saurzcode/dir3/[Link] /home/
4. See the contents of a file.
Same as the Unix cat command.
Usage:
hadoop fs -cat <path[filename]>
Example:
hadoop fs -cat /user/saurzcode/dir1/[Link]
1. Copy a file from source to destination.
This command allows multiple sources as well, in which case the destination must be a directory.
Usage:
hadoop fs -cp <source> <dest>
Example:
hadoop fs -cp /user/saurzcode/dir1/[Link] /user/saurzcode/dir2
2. Copy a file from/to the local file system to HDFS.
copyFromLocal
Usage:
hadoop fs -copyFromLocal <localsrc> URI
Example:
hadoop fs -copyFromLocal /home/saurzcode/[Link] /user/saurzcode/[Link]
Similar to the put command, except that the source is restricted to a local file reference.
copyToLocal
Usage:
hadoop fs -copyToLocal [-ignorecrc] [-crc] URI <localdst>
Similar to the get command, except that the destination is restricted to a local file reference.
3. Move a file from source to destination.
Note: Moving files across file systems is not permitted.
Usage:
hadoop fs -mv <src> <dest>
Example:
hadoop fs -mv /user/saurzcode/dir1/[Link] /user/saurzcode/dir2
4. Remove a file or directory in HDFS.
Removes the files specified as argument. Deletes a directory only when it is empty.
Usage:
hadoop fs -rm <arg>
Example:
hadoop fs -rm /user/saurzcode/dir1/[Link]
Recursive version of delete.
Usage:
hadoop fs -rmr <arg>
Example:
hadoop fs -rmr /user/saurzcode/
5. Display the last few lines of a file.
Similar to the tail command in Unix.
Usage:
hadoop fs -tail <path[filename]>
Example:
hadoop fs -tail /user/saurzcode/dir1/[Link]
6. Display the aggregate length of a file.
Usage:
hadoop fs -du <path>
Example:
hadoop fs -du /user/saurzcode/dir1/[Link]
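Besides the hadoop fs commands above, the same file-management tasks can be performed programmatically through the HDFS Java API. The following is a small illustrative sketch and not part of the prescribed commands; the local file names, the HDFS paths and the NameNode address hdfs://localhost:9000 are assumptions.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsFileTasks {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://localhost:9000"); // assumed NameNode address
        FileSystem fs = FileSystem.get(conf);

        // 1. Adding a directory and a file
        fs.mkdirs(new Path("/user/saurzcode/dir1"));
        fs.copyFromLocalFile(new Path("/home/saurzcode/sample.txt"),
                             new Path("/user/saurzcode/dir1/sample.txt"));

        // 2. Retrieving a file back to the local file system
        fs.copyToLocalFile(new Path("/user/saurzcode/dir1/sample.txt"),
                           new Path("/home/saurzcode/sample_copy.txt"));

        // 3. Deleting a file (the second argument enables recursive delete for directories)
        fs.delete(new Path("/user/saurzcode/dir1/sample.txt"), false);

        fs.close();
    }
}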
RESULT:
Thus, the Hadoop implementation of file management tasks, such as adding
files and directories, retrieving files and deleting files, was executed successfully.

EX. NO: 3   IMPLEMENTATION OF MATRIX MULTIPLICATION WITH
DATE:       HADOOP MAP REDUCE

AIM:
To write a Map Reduce Program that implements Matrix Multiplication.

ALGORITHM:
We assume that the input matrices are already stored in the Hadoop Distributed File System
(HDFS) in a suitable format (e.g., CSV, TSV) where each row represents a matrix element. The
matrices are compatible for multiplication (the number of columns in the first matrix is equal
to the number of rows in the second matrix).
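For reference, each element of the result matrix C = A x B is a sum of partial products; this sum is exactly what the reducer accumulates for each (row, column) key. The formula below is added here only for clarity and is not part of the original algorithm text:

C_{ik} = \sum_{j=1}^{n} A_{ij} B_{jk}

For example, for 2 x 2 matrices, C_{11} = A_{11}B_{11} + A_{12}B_{21}; with A = [[1,2],[3,4]] and B = [[5,6],[7,8]], C_{11} = 1*5 + 2*7 = 19.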
STEP 1: MAPPER
The mapper will take the input matrices and emit key-value pairs for each element in
the result matrix. The key will be the (row, column) index of the result element, and the value
will be the corresponding element value.
STEP 2: REDUCER
The reducer will take the key-value pairs emitted by the mapper and calculate the partial
sum for each element in the result matrix.
STEP 3: MAIN DRIVER
The main driver class sets up the Hadoop job configuration and specifies the input and
output paths for the matrices.
STEP 4: RUNNING THE JOB
To run the MapReduce job, you need to package your classes into a JAR file and then submit
it to Hadoop using the hadoop jar command. Make sure to replace input_path and output_path
with the actual HDFS paths to your input matrices and desired output directory.
PROGRAM:
// (Each public class below is saved in its own .java file.)
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

// Mapper: each input line holds "row,column,value" for one element of the result matrix
public class MatrixMultiplicationMapper extends Mapper<LongWritable, Text, Text, Text> {
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split(",");
        // Emit the (row, column) index of the result element as the key
        // and the element value (a partial product to be summed) as the value
        context.write(new Text(fields[0] + "," + fields[1]), new Text(fields[2]));
    }
}

// Reducer: accumulates the partial sums for each element of the result matrix
public class MatrixMultiplicationReducer extends Reducer<Text, Text, Text, IntWritable> {
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        int result = 0;
        for (Text value : values) {
            // Accumulate the partial sum for the result element
            result += Integer.parseInt(value.toString());
        }
        // Emit the final result for the result element
        context.write(key, new IntWritable(result));
    }
}

// Driver: sets up the job configuration and the input/output paths
public class MatrixMultiplicationDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "Matrix Multiplication");
        job.setJarByClass(MatrixMultiplicationDriver.class);
        job.setMapperClass(MatrixMultiplicationMapper.class);
        job.setReducerClass(MatrixMultiplicationReducer.class);
        // Map output value (Text) differs from the final output value (IntWritable)
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
Run the program:
hadoop jar matrixmultiplication.jar MatrixMultiplicationDriver input_path output_path

OUTPUT:
hadoop jar MatrixMultiplication.jar /matrix_data/ /matrix_output_new
part-00000:
0,0  240.0
0,1  250.0
0,2  260.0
1,0  880.0
1,1  930.0
1,2  980.0

RESULT:
Thus the Map Reduce Program that implements Matrix Multiplication was executed
and verified successfully.

EX. NO: 4   RUN A BASIC WORD COUNT MAP REDUCE PROGRAM
DATE:       TO UNDERSTAND THE MAP REDUCE PARADIGM
AIM:
To write a Basic Word Count program to understand Map Reduce Paradigm.
ALGORITHM:
The entire MapReduce program can be fundamentally divided into three parts:
* Mapper Phase Code
* Reducer Phase Code
* Driver Code
STEP 1: MAPPER CODE:
We have created a class Map that extends the class Mapper which is already defined in
the MapReduce Framework.
We define the data types of input and output key/value pair after the class declaration
using angle brackets.
* Both the input and output of the Mapper is a key/value pair.
Input:
* The key is nothing but the offset of each line in the text file: LongWritable
+ The value is each individual line : Text
Output:
+ The key is the tokenized words: Text
* The value is the hardcoded value, which in our case is 1: IntWritable
* Example - Dear 1, Bear 1, etc.
We have written a Java code where we have tokenized each word and assigned it a
hardcoded value equal to 1.
STEP 2: REDUCER CODE:
* We have created a class Reduce which extends class Reducer like that of Mapper.
* We define the data types of input and output key/value pair after the class declaration
using angle brackets as done for Mapper.
* Both the input and the output of the Reducer is a key/value pair.
Input:
* The key is nothing but those unique words which have been generated after the sorting
and shuffling phase: Text
* The value is a list of integers corresponding to each key: IntWritable
* Example - Bear, [1, 1], etc.
Output:
* The key is all the unique words present in the input text file: Text
* The value is the number of occurrences of each of the unique words: IntWritable
* Example - Bear, 2; Car, 3, etc.
* We have aggregated the values present in each of the lists corresponding to each key and
produced the final answer.
* In general, a single reducer is created for each of the unique words, but you can specify the
number of reducers in mapred-site.xml.
STEP 3: DRIVER CODE:
* In the driver class, we set the configuration of our MapReduce job to run in Hadoop.
* We specify the name of the job and the data types of the input/output of the mapper and reducer.
* We also specify the names of the mapper and reducer classes.
* The path of the input and output folder is also specified.
* The method setInputFormatClass() is used for specifying how a Mapper will read
the input data or what will be the unit of work. Here, we have chosen TextInputFormat
so that a single line is read by the mapper at a time from the input text file. The main()
method is the entry point for the driver. In this method, we instantiate a new
Configuration object for the job.
PROGRAM:
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class WordCount {
    public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                value.set(tokenizer.nextToken());
                context.write(value, new IntWritable(1));
            }
        }
    }

    public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable x : values) {
                sum += x.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "My Word Count Program");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        Path outputPath = new Path(args[1]);
        // Configuring the input/output path from the filesystem into the job
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        // Deleting the output path automatically from HDFS so that we don't have to delete it explicitly
        outputPath.getFileSystem(conf).delete(outputPath, true);
        // Exiting the job only if the flag value becomes false
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
Run the MapReduce code:
The command for running a MapReduce code is:
hadoop jar hadoop-mapreduce-example.jar WordCount /sample/input /sample/output
OUTPUT:
(The console shows the job progress - map 0% reduce 0%, then map 100% reduce 100% - and reports that the job completed successfully. The output file part-r-00000 lists each word with its count, e.g. ADRIAN 2, ANGELO 2, ...)
RESULT:
Thus the Map Reduce Program that implements word count was executed and verified
successfully.

EX. NO: 5   INSTALLATION OF HIVE ALONG WITH PRACTICE EXAMPLES.
DATE:
AIM:
To install HIVE along with practice examples.
PREREQUISITES:
* Java Development Kit (JDK) installed and the JAVA_HOME environment variable
set
+ Hadoop installed and configured on your Windows system.
STEP-BY-STEP INSTALLATION:
1. Download HIVE:
Visit the Apache Hive website and download the latest stable version of Hive.
Official Apache Hive website: [Link]
2. Extract the downloaded Hive archive to a directory on your Windows machine,
e.g., C:\hive.
3. Configure Hive:
* Open the Hive configuration file (hive-site.xml) located in the conf folder of the
extracted Hive directory.
* Set the necessary configurations, such as Hive Metastore connection settings and
Hadoop configurations. Make sure to adjust paths accordingly for Windows. Here's an
example of some configurations:
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby:;databaseName=/path/to/metastore_db;create=true</value>
    <description>JDBC connect string for a JDBC metastore.</description>
  </property>
4. Environment Variables Setup:
Add the Hive binary directory (C:\hive\bin in this example) to your PATH environment
variable.
Set the HIVE_HOME environment variable to point to the Hive installation directory
(C:\hive in this example).
5. Start the Hive Metastore service:
To start the Hive Metastore service, you can use the schematool script:
schematool -dbType derby -initSchema
6. Start Hive:
* Open a command prompt or terminal and navigate to the Hive installation directory.
* Execute the hive command to start the Hive shell.
EXAMPLES:
1. Create a Database:
To create a new database in HIVE, use the following syntax:
CREATE DATABASE database_name;
Example:
CREATE DATABASE mydatabase;
2. Use a Database:
To use a specific database in HIVE, use the following syntax:
USE database_name;
Example:
USE mydatabase;
3. Show Databases:
To display a list of available databases in HIVE, use the following syntax:
SHOW DATABASES;
4. Create a Table:
To create a table in HIVE, use the following syntax:
CREATE TABLE table_name (
  column1 datatype,
  column2 datatype,
  ...
);
Example:
CREATE TABLE mytable (
  id INT,
  name STRING,
  age INT
);
5. Show Tables:
To display a list of tables in the current database, use the following syntax:
SHOW TABLES;
6. Describe a Table:
To view the schema and details of a specific table, use the following syntax:
DESCRIBE table_name;
Example:
DESCRIBE mytable;
7. Insert Data into a Table:
To insert data into a table in HIVE, use the following syntax:
INSERT INTO table_name (column1, column2, ...) VALUES (value1, value2, ...);
Example:
INSERT INTO mytable (id, name, age) VALUES (1, 'John Doe', 25);
8. Select Data from a Table:
To select data from a table in HIVE, use the following syntax:
SELECT column1, column2, ... FROM table_name WHERE condition;
Example:
SELECT * FROM mytable WHERE age > 20;
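Besides the interactive Hive shell, the same statements can be issued from a Java program over JDBC. The sketch below is optional and illustrative, not part of the prescribed procedure; it assumes HiveServer2 is running on localhost:10000, that the Hive JDBC driver jar is on the classpath, and it reuses the mydatabase/mytable names from the examples above.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveJdbcExample {
    public static void main(String[] args) throws Exception {
        // Load the HiveServer2 JDBC driver (requires the Hive JDBC jar on the classpath)
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        // Assumes HiveServer2 is running locally on its default port 10000
        Connection con = DriverManager.getConnection(
                "jdbc:hive2://localhost:10000/mydatabase", "", "");
        Statement stmt = con.createStatement();
        // Same query as the shell example above
        ResultSet rs = stmt.executeQuery("SELECT id, name, age FROM mytable WHERE age > 20");
        while (rs.next()) {
            System.out.println(rs.getInt(1) + "\t" + rs.getString(2) + "\t" + rs.getInt(3));
        }
        rs.close();
        stmt.close();
        con.close();
    }
}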
RESULT:
Thus the installation of HIVE was done successfully.

EX. NO: 6   INSTALLATION OF HBASE ALONG WITH PRACTICE EXAMPLES
DATE:
AIM:
To install HBASE using Virtual Machine and perform some operations in HBASE.
ALGORITHM:
Step 1: Install a Virtual Machine
* Download and install virtual machine software such as VirtualBox ([Link]) or VMware ([Link]).
* Create a new virtual machine and install a Unix-based operating system like Ubuntu or
CentOS. You can download the ISO image of your desired Linux distribution from their
official websites.
Step 2: Set up the Virtual Machine
* Launch the virtual machine and install the Unix-based operating system following the
installation wizard.
* Make sure the virtual machine has network connectivity to download software
packages.
Step 3: Install Java
Open the terminal or command line in the virtual machine.
Update the package list
sudo apt update
Install OpenJDK (Java Development Kit)
sudo apt install default-jdk
Verify the Java installation:
java -version
Step 4: Download and Install HBase
* In the virtual machine, navigate to the directory where you want to install HBase.
* Download the HBase binary distribution from the Apache HBase website
([Link]). Look for the latest stable version.
* Extract the downloaded archive:
tar -xvf <hbase-archive-file>
* Replace <hbase-archive-file> with the actual name of the HBase archive file.
* Move the extracted HBase directory to a desired location:
sudo mv <extracted-hbase-directory> /opt/hbase
* Replace <extracted-hbase-directory> with the actual name of the extracted HBase
directory.
Step 5: Configure HBase
* Open the HBase configuration file for editing:
sudo nano /opt/hbase/conf/hbase-site.xml
* Add the following properties to the configuration file:
  <property>
    <name>hbase.rootdir</name>
    <value>file:///opt/hbase/data</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/var/lib/zookeeper</value>
  </property>
* Save the file and exit the text editor.
Step 6: Start HBase
* Start the HBase server:
sudo /opt/hbase/bin/start-hbase.sh
HBASE PRACTICE EXAMPLES:
Step 1: Start HBase
* Make sure HBase is installed and running on your system.
Step 2: Open HBase Shell
* Open a command prompt or terminal window and navigate to the directory where the
HBase installation is located. Run the following command to start the HBase shell:
>> hbase shell
Step 3: Create a Table
* In the HBase shell, you can create a table with column families.
* For example, let's create a table named "my_table" with a column family called "cf":
>> create 'my_table', 'cf'
Step 4: Insert Data
* To insert data into the table, you can use the put command.
* Here's an example of inserting a row with a specific row key and values:
>> put 'my_table', 'row1', 'cf:column1', 'value1'
>> put 'my_table', 'row1', 'cf:column2', 'value2'
Step 5: Get Data
* You can retrieve data from the table using the get command.
+ For example, to get the values of a specific row:
>> get 'my_table', 'row1'
* This will display all the column family values for the specified row.
Step 6: Scan Data
* To scan and retrieve multiple rows or the entire table, use the scan command.
* For instance, to scan all rows in the table:
>> scan 'my_table'
* This will display all rows and their corresponding column family values.
Step 7: Delete Data
* To delete a particular cell value you can use the delete command; to delete a whole row, use deleteall.
* Here's an example of deleting a specific row:
>> deleteall 'my_table', 'row1'
Step 8: Disable and Drop Table
* If you want to remove the table entirely, you need to disable and drop it.
* Use the following commands:
>> disable 'my_table'
>> drop 'my_table'
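The same put/get operations can also be performed from a Java client through the HBase client API. This is an optional, illustrative sketch and not part of the prescribed shell steps; it assumes the HBase client library is on the classpath, that HBase is running locally, and that the table 'my_table' with column family 'cf' has already been created as above.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseClientExample {
    public static void main(String[] args) throws Exception {
        // Assumes HBase (and its embedded ZooKeeper) is running on localhost
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("my_table"))) {

            // Equivalent of: put 'my_table', 'row1', 'cf:column1', 'value1'
            Put put = new Put(Bytes.toBytes("row1"));
            put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("column1"), Bytes.toBytes("value1"));
            table.put(put);

            // Equivalent of: get 'my_table', 'row1'
            Result result = table.get(new Get(Bytes.toBytes("row1")));
            byte[] value = result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("column1"));
            System.out.println("cf:column1 = " + Bytes.toString(value));
        }
    }
}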
RESULT:
Thus the installation of HBase using a Virtual Machine was done successfully.

EX. NO: 7   INSTALLATION OF THRIFT
DATE:

AIM:
To install Apache Thrift on Windows OS.
ALGORITHM:
Step 1: Download Apache Thrift:
* Visit the Apache Thrift website: [Link]
* Go to the "Downloads" section and find the latest version of Thrift.
* Download the Windows binary distribution (ZIP file) for the desired version.
Step 2: Extract the ZIP file:
* Locate the downloaded ZIP file and extract its contents to a directory of your choice.
* This directory will be referred to as <thrift-dir> in the following steps.
Step 3: Set up environment variables:
Open the Start menu and search for "Environment Variables" and select "Edit the
system environment variables."
Click the "Environment Variables" button at the bottom right of the "System Properti¢
window.
Under the "System variables” section, find the "Path" variable and click "Edit.”
Add the following entries to the "Variable value" field (replace with
the actual directory path):
\bin
\lib
Click "OK" to save the changes.
Step 4: Verify the installation:
* Open a new Command Prompt window.
* Run the following command to verify that Thrift is installed and accessible:
thrift -version
* If everything is set up correctly, you should see the version number of Thrift printed
on the screen.

RESULT:
Thus the installation of Thrift on Windows OS was done successfully.

EX. NO: 8   PRACTICE IMPORTING AND EXPORTING DATA FROM
DATE:       VARIOUS DATABASES.

AIM:
To import and export data from various databases using SQOOP.
ALGORITHM:
Step 1: Install SQOOP.
* First, you need to install Sqoop on your Hadoop cluster or machine.
* Download the latest version of Sqoop from the Apache Sqoop website
([Link]) and follow the installation instructions provided in the
documentation.
Step 2: Importing data from a database:
* To import data from a database into Hadoop, use the following Sqoop command:
sqoop import --connect jdbc:<database-type>://<hostname>:<port>/<database-name> \
  --username <username> \
  --password <password> \
  --table <table-name> \
  --target-dir <hdfs-target-directory> \
  --m <number-of-mappers>
Replace the placeholders
(<database-type>, <hostname>, <port>, <database-name>, <username>, <password>,
<table-name>, <hdfs-target-directory>, and <number-of-mappers>) with the
appropriate values for your database and Hadoop environment.
Step 3: Exporting data to a database:
To export data from Hadoop to a database, use the following Sqoop command:
sqoop export --connect jdbc:<database-type>://<hostname>:<port>/<database-name> \
  --username <username> \
  --password <password> \
  --table <table-name> \
  --export-dir <hdfs-export-directory> \
  --input-fields-terminated-by '<field-delimiter>'
Replace the placeholders
(<database-type>, <hostname>, <port>, <database-name>, <username>, <password>,
<table-name>, <hdfs-export-directory>, and <field-delimiter>) with the appropriate
values for your database and Hadoop environment.
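Before running the Sqoop commands, it can be useful to confirm that the JDBC connection details are correct. The following is an optional Java sketch using plain JDBC, not part of the prescribed procedure; the placeholders mirror the Sqoop commands above, and a MySQL source with its JDBC driver on the classpath is an assumption.

import java.sql.Connection;
import java.sql.DriverManager;

public class JdbcConnectionCheck {
    public static void main(String[] args) throws Exception {
        // Fill in the same placeholders used in the Sqoop commands above before running.
        // A MySQL database is assumed here; adjust the JDBC URL for your database type.
        String url = "jdbc:mysql://<hostname>:<port>/<database-name>";
        try (Connection con = DriverManager.getConnection(url, "<username>", "<password>")) {
            System.out.println("Connected successfully: " + !con.isClosed());
        }
    }
}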
RESULT:
Thus the import and export of data from various databases using SQOOP was done
successfully.