ADHI COLLEGE OF ENGINEERING AND TECHNOLOGY
Sankarapuram, Near Walajabad, Kanchipuram Dist., Pin: 631605
DEPARTMENT OF ARTIFICIAL INTELLIGENCE AND DATA SCIENCE
STUDENT NAME :
REGISTER NO :
SUBJECT CODE : AD3301
SUBJECT NAME : Data Exploration and Visualization
YEAR / SEMESTER : II/III
(2022-2023)
LABORATORY RECORD NOTE BOOK
2022-2023
REGISTER NO:
This is to certify that this is a bonafide record of the work done by
Mr./Ms. ______________________________ of the ________ year B.E./B.Tech.,
Department of ______________________________ in the
______________________________ Laboratory in the ________ Semester.
Staff In-Charge                                   Head of the Department
Submitted to the University Examination held on _________________________.
Internal Examiner                                 External Examiner
TABLE OF CONTENTS
S. No.   Date   List of Experiments   Page No.   Sign
1   Install the data analysis and visualization tool: R / Python / Tableau Public / Power BI.
2   Perform exploratory data analysis (EDA) on datasets such as an email data set. Export all your emails as a dataset, import them into a pandas data frame, visualize them, and get different insights from the data.
3   Working with NumPy arrays, pandas data frames, and basic plots using Matplotlib.
4   Explore various variable and row filters in R for cleaning data. Apply various plot features in R on sample data sets and visualize.
5   Perform Time Series Analysis and apply the various visualization techniques.
6   Perform Data Analysis and representation on a Map using various Map data sets with Mouse Rollover effect, user interaction, etc.
7   Build cartographic visualization for multiple datasets involving various countries of the world; states and districts in India, etc.
8   Perform EDA on Wine Quality Data Set.
9   Use a case study on a data set and apply the various EDA and visualization techniques and present an analysis report.
Ex no: 1   Install the data analysis and visualization tool: R / Python / Tableau Public / Power BI.
Date:
Aim:
To install the data analysis and visualization tool: R/Python/Tableau
Public/Power BI.
Program:
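import pandas as pd
# Each record is a [name, age] pair
hafeez = ['Hafeez', 19]
aslan = ['Aslan', 21]
kareem = ['Kareem', 18]
# Build a data frame from the records and label its columns
data_frame = pd.DataFrame([hafeez, aslan, kareem], columns=['name', 'age'])
print(data_frame)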
Output:
     name  age
0  Hafeez   19
1   Aslan   21
2  Kareem   18
Result:
Thus the above software was installed and verified successfully.
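The installation can also be checked from Python by querying the installed package versions. The following is a minimal sketch, assuming the Python route was chosen and that pandas, NumPy, and Matplotlib are the packages of interest. The helper name installed_versions is illustrative and is not part of any library.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return a dict mapping each package name to its version, or None if it is missing."""
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in this environment
            result[name] = None
    return result


# Assumed list of packages used in this record; adjust as needed
report = installed_versions(["pandas", "numpy", "matplotlib"])
for name, ver in report.items():
    print(f"{name}: {ver if ver else 'not installed'}")
```

A package reported as "not installed" can usually be added with pip or conda before continuing with the later experiments.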
Ex no: 2   Perform exploratory data analysis (EDA) on datasets such as an email data set. Export all your emails as a dataset, import them into a pandas data frame, visualize them, and get different insights from the data.
Date:
Aim:
To perform exploratory data analysis (EDA) on datasets such as an email data
set: export all emails as a dataset, import them into a pandas data frame,
visualize them, and get different insights from the data.
Program:
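import pandas as pd
import matplotlib.pyplot as plt
# Load the exported emails (the CSV is expected to have 'sender' and 'message' columns)
emails = pd.read_csv("emails.csv")
# Inspect the first rows, the size, the column types, and summary statistics
print(emails.head())
print(emails.shape)
print(emails.dtypes)
print(emails.describe())
# Count the missing values in each column
print(emails.isnull().sum())
# Length of each message in characters
emails['length'] = emails['message'].str.len()
# Histogram of the message lengths
emails['length'].plot(kind='hist', bins=50)
plt.xlabel('Email Length')
plt.show()
# Bar chart of the number of emails per sender
emails['sender'].value_counts().plot(kind='bar')
plt.xlabel('Email Sender')
plt.ylabel('Number of Emails')
plt.show()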
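Further insights usually come from the date of each email. The following is a minimal sketch, assuming the exported data also has a 'date' column. The four rows are made-up sample values that stand in for a real mailbox export, and the column names are assumptions.

```python
import pandas as pd

# Made-up sample rows standing in for an exported mailbox (not real data)
emails = pd.DataFrame({
    "date": ["2023-01-02 09:15", "2023-01-02 14:40",
             "2023-01-03 09:05", "2023-01-07 21:30"],
    "sender": ["a@example.com", "b@example.com",
               "a@example.com", "c@example.com"],
    "message": ["Hello", "Meeting at 3 pm", "Report attached", None],
})

# Convert the text dates to timestamps so date parts can be extracted
emails["date"] = pd.to_datetime(emails["date"])
emails["hour"] = emails["date"].dt.hour
emails["weekday"] = emails["date"].dt.day_name()

# Treat a missing message body as an empty string before measuring its length
emails["length"] = emails["message"].fillna("").str.len()

# Emails per sender and emails per hour of the day
per_sender = emails["sender"].value_counts()
per_hour = emails.groupby("hour").size()

print(per_sender)
print(per_hour)
print(emails[["weekday", "length"]])
```

The per_sender and per_hour results can be passed to .plot(kind='bar') in the same way as in the program above.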
Output:
Result:
Thus the above program was entered and executed successfully.
Ex no: 3   Working with NumPy arrays, pandas data frames, and basic plots using Matplotlib.
Date:
Aim:
To write programs that work with NumPy arrays and pandas data frames and
draw basic plots using Matplotlib.
Program:
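For NumPy
import numpy as np
# Create a one-dimensional array
a = np.array([1, 2, 3, 4, 5])
print(a)
# Element-wise multiplication by a scalar
b = a * 2
print(b)
# Element-wise addition of two arrays
c = a + b
print(c)
# Mean and standard deviation of the array
mean = np.mean(a)
print(mean)
std_dev = np.std(a)
print(std_dev)
# Reshape the array into 5 rows and 1 column
a = a.reshape(5, 1)
print(a)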
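For Pandas
import pandas as pd
# Build a data frame from a dictionary of columns
data = {'name': ['John', 'Mike', 'Sara'],
        'age': [28, 35, 42],
        'city': ['New York', 'Los Angeles', 'Chicago']}
df = pd.DataFrame(data)
print(df)
# Select columns
print(df[['name', 'age']])
# Filter rows
print(df[df['age'] > 30])
# Mean age per city (only the numeric column is averaged; averaging 'name' raises an error)
print(df.groupby('city')['age'].mean())
# Add a new column
df['income'] = [50000, 60000, 70000]
print(df)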
For Basic plots
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0, 10, 100)
y = np.sin(x)
plt.plot(x, y)
plt.xlabel('x')
plt.ylabel('y')
plt.title('Line Plot Example')
plt.show()
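The three libraries are often used together. The following is a minimal sketch that builds a data frame from a NumPy array and plots one of its columns with Matplotlib. The subject names, the random marks, and the output file name are assumptions chosen only for illustration.

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script also runs without a display
import matplotlib.pyplot as plt

# Random marks for 5 students in 3 subjects (seeded so the run is repeatable)
rng = np.random.default_rng(seed=0)
scores = rng.integers(40, 100, size=(5, 3))

# Wrap the array in a data frame with named columns
df = pd.DataFrame(scores, columns=["maths", "physics", "chemistry"])

# Row-wise total across the three subject columns
df["total"] = df[["maths", "physics", "chemistry"]].sum(axis=1)
print(df)

# Bar plot of the total marks per student
fig, ax = plt.subplots()
ax.bar(df.index, df["total"])
ax.set_xlabel("Student index")
ax.set_ylabel("Total marks")
ax.set_title("Bar Plot Example")
fig.savefig("subject_totals.png")
```

Replacing fig.savefig(...) with plt.show() (and removing the matplotlib.use line) displays the plot in a window instead of writing it to a file.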
Output:
Result:
Thus the above program was entered and executed successfully.
Ex no: 4   Explore various variable and row filters in R for cleaning data. Apply various plot features in R on sample data sets and visualize.
Date:
Aim:
To explore various variable and row filters in R for cleaning data, and to apply
various plot features in R on sample data sets and visualize.
Program:
library(ggplot2)
# Create a sample data frame
df <- data.frame(x = rnorm(100), y = rnorm(100))
# Create a scatter plot
ggplot(data = df, aes(x = x, y = y)) +
  geom_point()
# Create a line plot
plot(df$x, type = "l")
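For comparison, the same kind of variable (column) and row filters can be written in Python with pandas. This is a minimal sketch of a Python counterpart, not the R code of this experiment. The column names and values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Made-up sample data with some missing values
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "x": [0.5, np.nan, -1.2, 2.3, 0.1],
    "y": [1.0, 0.4, 0.7, np.nan, -0.3],
    "note": ["a", "b", "c", "d", "e"],
})

# Variable filter: keep only the columns needed for the analysis
selected = df[["x", "y"]]

# Cleaning step: drop rows that have a missing value in any kept column
complete = selected.dropna()

# Row filter: keep only the rows where x is positive
positive = complete[complete["x"] > 0]

print(positive)
```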
Output:
Result:
Thus the above program was entered and executed successfully.
Ex no: 5   Perform Time Series Analysis and apply the various visualization techniques.
Date:
Aim:
To perform time series analysis and to apply the various visualization
techniques.
Program:
Here is an example of how to use the forecast package to perform basic time
series analysis on a sample dataset:
library(forecast)
# Create a sample monthly time series (named 'series' so that it does not mask the ts() function)
series <- ts(rnorm(100), start = c(2010, 1), frequency = 12)
# Decompose the time series into its trend, seasonal, and residual components
decomposed_series <- decompose(series)
# Plot the decomposition
plot(decomposed_series)
# Fit an exponential smoothing model to the time series
fit <- ets(series)
# Forecast the next 10 periods
forecast(fit, h = 10)
Here is an example of how to create a line plot of a time series using the ggplot2
package:
library(ggplot2)
# ggplot() works on data frames, so convert the time series first
series_df <- data.frame(time = as.numeric(time(series)),
                        value = as.numeric(series))
# Create a line plot of the time series
ggplot(data = series_df, aes(x = time, y = value)) +
  geom_line()
Here is an example of how to create a heat map of the same series by year and month:
# Split the position of each observation into a year and a month
heat_df <- data.frame(year = 2010 + (seq_along(series) - 1) %/% 12,
                      month = as.integer(cycle(series)),
                      value = as.numeric(series))
# Plot the heatmap
ggplot(data = heat_df, aes(x = month, y = year, fill = value)) +
  geom_tile() +
  scale_fill_gradient()
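A comparable analysis can be sketched in Python with pandas and Matplotlib. This is a minimal sketch under stated assumptions: the series is random sample data, a 12-month rolling mean stands in as a simple trend estimate (it is not the decomposition or the exponential smoothing model used above), and the output file name is illustrative.

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script also runs without a display
import matplotlib.pyplot as plt

# Random monthly sample series starting in January 2010 (seeded for repeatability)
rng = np.random.default_rng(seed=42)
index = pd.date_range("2010-01-01", periods=100, freq="MS")
series = pd.Series(rng.normal(size=100), index=index, name="value")

# 12-month rolling mean as a simple trend estimate
rolling_mean = series.rolling(window=12).mean()

# Year-by-month table for the heat map (months without data stay NaN)
frame = pd.DataFrame({"year": index.year, "month": index.month,
                      "value": series.to_numpy()})
heat = frame.pivot(index="year", columns="month", values="value")

# Line plot on top, heat map below
fig, (ax_line, ax_heat) = plt.subplots(2, 1, figsize=(8, 8))
ax_line.plot(series.index, series, label="monthly value")
ax_line.plot(rolling_mean.index, rolling_mean, label="12-month rolling mean")
ax_line.legend()
image = ax_heat.imshow(heat.to_numpy(), aspect="auto")
ax_heat.set_xlabel("month")
ax_heat.set_ylabel("year")
ax_heat.set_yticks(range(len(heat.index)))
ax_heat.set_yticklabels([str(year) for year in heat.index])
fig.colorbar(image, ax=ax_heat)
fig.savefig("time_series.png")
```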
Output:
Result:
Thus the above program was entered and executed successfully.
Ex no: 6   Perform Data Analysis and representation on a Map using various Map data sets with Mouse Rollover effect, user interaction, etc.
Date:
Aim:
To perform data analysis and represent the results on a map using various
map data sets with mouse rollover effect, user interaction, etc.
Program:
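import folium
import pandas as pd
# Sample points (placeholder values for illustration); replace with your own data set
data = pd.DataFrame({'name': ['Point A', 'Point B'],
                     'coordinates': [[40.693943, -73.985880], [40.70, -73.99]]})
# Create a map centered on a specific location
location = [40.693943, -73.985880]
m = folium.Map(location=location, zoom_start=13)
# Add data points to the map: the tooltip appears on mouse rollover, the popup on click
for i in range(len(data)):
    folium.Marker(data.loc[i, 'coordinates'], popup=data.loc[i, 'name'],
                  tooltip=data.loc[i, 'name']).add_to(m)
# Save the interactive map as an HTML file that opens in a browser
m.save('map.html')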
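Map data sets are often exchanged as GeoJSON. The following is a minimal sketch that converts point records with 'name' and 'coordinates' fields into a GeoJSON FeatureCollection using only the standard library. The helper name to_feature_collection and the two sample points are assumptions for illustration. Note that GeoJSON stores positions as [longitude, latitude], the reverse of the [latitude, longitude] order used for the markers.

```python
import json


def to_feature_collection(records):
    """Convert records with 'name' and [lat, lon] 'coordinates' to a GeoJSON dict."""
    features = []
    for record in records:
        lat, lon = record["coordinates"]
        features.append({
            "type": "Feature",
            # GeoJSON positions are ordered [longitude, latitude]
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"name": record["name"]},
        })
    return {"type": "FeatureCollection", "features": features}


# Placeholder sample points for illustration
records = [
    {"name": "Point A", "coordinates": [40.693943, -73.985880]},
    {"name": "Point B", "coordinates": [40.700000, -73.990000]},
]

collection = to_feature_collection(records)
geojson_text = json.dumps(collection, indent=2)
print(geojson_text)
```

A dictionary of this shape can be added to a folium map as a layer with folium.GeoJson(collection).add_to(...).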
Output:
Result:
Thus the above program was entered and executed successfully.
Ex no: 7   Build cartographic visualization for multiple datasets involving various countries of the world; states and districts in India, etc.
Date:
Aim:
To build cartographic visualizations for multiple datasets involving various
countries of the world, and states and districts in India, etc.
Program:
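import geopandas as gpd
import pandas as pd
import matplotlib.pyplot as plt
# Load the state shapefile
states = gpd.read_file('path/to/states.shp')
# Load the values to map; the table must contain 'state_name' and 'data_column'
# (the CSV path is a placeholder, like the shapefile path)
data = pd.read_csv('path/to/data.csv')
# Join the data to the shapefile on the state name
states = states.merge(data, on='state_name')
# Create the choropleth map
states.plot(column='data_column', cmap='YlGn', legend=True)
plt.show()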
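When several datasets are joined to one shapefile, region names often differ in case or spacing, and unmatched regions are left without a value on the map. The following is a minimal sketch of checking the join with plain pandas before plotting. The table shapes stands in for the attribute table of a shapefile, the numbers are placeholders rather than real statistics, and the helper name normalise is illustrative.

```python
import pandas as pd

# Stand-in for the attribute table of a shapefile (names only)
shapes = pd.DataFrame({"state_name": ["Tamil Nadu", "Kerala", "Karnataka"]})

# Stand-in for a data set to map; the values are placeholders, not real statistics
data = pd.DataFrame({
    "state_name": [" tamil nadu", "KERALA", "Goa"],
    "data_column": [10.0, 20.0, 30.0],
})


def normalise(names):
    """Trim surrounding spaces and lower-case the names so they compare equal."""
    return names.str.strip().str.lower()


shapes["key"] = normalise(shapes["state_name"])
data["key"] = normalise(data["state_name"])

# Left join keeps every shape; the indicator column records whether a match was found
merged = shapes.merge(data[["key", "data_column"]], on="key",
                      how="left", indicator=True)
unmatched = merged.loc[merged["_merge"] == "left_only", "state_name"].tolist()

print(merged)
print("States without data:", unmatched)
```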
Output:
Result:
Thus the above program was entered and executed successfully.
Ex no: 8   Perform EDA on Wine Quality Data Set.
Date:
Aim:
To perform EDA on the Wine Quality Data Set.
Program:
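import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Load the Wine Quality Data Set
# (the UCI copies of this file are semicolon-separated; pass sep=';' if needed)
data = pd.read_csv('path/to/winequality.csv')
# Display the first few rows of the data
print(data.head())
# Display basic statistics about the data
print(data.describe())
# Check for missing values
print(data.isnull().sum())
# Distribution of the quality scores
sns.countplot(x='quality', data=data)
plt.show()
# Correlation between the numeric features
sns.heatmap(data.corr(numeric_only=True), annot=True, cmap='coolwarm')
plt.show()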
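A common next step is to relate each feature to the quality score. The following is a minimal sketch on a small made-up sample that uses column names of the Wine Quality Data Set. The six rows are illustrative values, not rows taken from the real file.

```python
import pandas as pd

# Made-up sample rows with Wine Quality column names (not real measurements)
wine = pd.DataFrame({
    "alcohol": [9.4, 9.8, 10.5, 11.2, 12.0, 12.8],
    "volatile acidity": [0.70, 0.88, 0.60, 0.40, 0.35, 0.30],
    "quality": [5, 5, 6, 6, 7, 7],
})

# Correlation of every feature with the quality score, strongest positive first
corr_with_quality = (
    wine.corr()["quality"].drop("quality").sort_values(ascending=False)
)

# Average value of each feature for every quality score
mean_by_quality = wine.groupby("quality").mean()

print(corr_with_quality)
print(mean_by_quality)
```

On the real file, the same two statements show which features rise or fall with quality and are a useful companion to the correlation heat map.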
Output:
Result:
Thus the above program was entered and executed successfully.
Ex no: 9   Use a case study on a data set and apply the various EDA and visualization techniques and present an analysis report.
Date:
Aim:
To use a case study on a data set, apply the various EDA and visualization
techniques, and present an analysis report.
Study:
The analysis report would typically include the following sections:
A title page – including the main topic or purpose and the type of report
Table of contents – in a logical or chronological order
A clause – specifying and presenting the methods used for the activity
The main discussion – broken down into organized sections, including the heading, the
sub-heading, and the discussion's body.
The conclusions – according to the results and information gathered in the business
report
The recommendations – given by the employee who created the report
Sections for bibliography or appendices – when necessary
To write a successful analytical report, make sure you follow these instructions:
Identify the Problem
The first step to creating an analytical report is identifying the problem and the
demographics affected by it. Make sure you describe the problem by including
information on where it began, what techniques were used to solve it so far, and their
effectiveness.
Explain Your Methods
Secondly, you should list the KPIs and methods you’ve used in the analysis report to
determine your actions’ success. You should also add one or two new methods to try
instead. For example, a report done on a failed ad campaign may reveal that the
success factor was determined by surveys conducted on a sample population.
Analyze Data
Analytical reports display a detailed analysis of the information collected through the
research methods employed. The report is built to sort out a specific issue and decide
on alternative methods to try, so analyze the successes and failures of the solutions
that were tried in the first place.
Make Recommendations
Lastly, your analytical report should include solution recommendations, placed at the
end of the report. A few well-founded recommendations support data-based
decision-making instead of guessing.
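The data analysis part of such a report can be partly generated from the data set itself. The following is a minimal sketch, assuming the case-study data is available as a pandas data frame. The helper name build_report, the report title, and the sample values are made up for illustration.

```python
import pandas as pd


def build_report(df, title):
    """Return a plain-text summary of a data frame: size, missing values, statistics."""
    lines = [title, "=" * len(title), ""]
    lines.append(f"Rows: {df.shape[0]}, Columns: {df.shape[1]}")
    lines.append("")
    lines.append("Missing values per column:")
    for column, count in df.isnull().sum().items():
        lines.append(f"  {column}: {count}")
    lines.append("")
    lines.append("Summary statistics (numeric columns):")
    lines.append(df.describe().round(2).to_string())
    return "\n".join(lines)


# Made-up sample data standing in for the case-study data set
sample = pd.DataFrame({"score": [70, 85, None, 90], "hours": [2, 4, 3, 5]})
report = build_report(sample, "Sample Analysis Report")
print(report)
```

The generated text covers only the descriptive part; the problem statement, methods, and recommendations still have to be written by the analyst.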
Result:
The case study on a data set and the various EDA and visualization techniques
were studied successfully.