ML Lab 04 Manual - Pandas and Matplotlib
Machine Learning
Introduction
Objectives
Lab Conduct
Pandas (the name derives from “panel data”) is a library that can load tabular
data from .csv files and store it in a NumPy-compatible table known as a pandas
DataFrame. Each column of a DataFrame is a pandas Series. Aside from loading
datasets, pandas also lets us compute basic statistics such as the mean, median,
and mode, and clean up incomplete or duplicate data.
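The operations described above can be sketched as follows. This is a minimal illustration, not the lab solution: the CSV text and the column names (height, weight, label) are made up for the example, and the data is read from an in-memory string so the snippet runs without any file on disk. With a real file you would pass its path to pd.read_csv instead.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real dataset file.
# Row 2 has a missing weight; row 3 is an exact duplicate of row 1.
csv_text = """height,weight,label
1.70,65.0,0
1.80,,1
1.70,65.0,0
1.60,55.0,1
"""

# Load the tabular data into a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

# Selecting a single column yields a Series.
heights = df["height"]

# Basic statistics on one column.
mean_height = heights.mean()
median_height = heights.median()
mode_height = heights.mode().iloc[0]  # mode() returns a Series; take the first value

# Clean up: drop rows with missing values, then drop exact duplicate rows.
cleaned = df.dropna().drop_duplicates().reset_index(drop=True)

print(mean_height, median_height, mode_height)
print(cleaned)
```

Whether dropping incomplete rows is appropriate depends on the dataset; filling missing values (for example with fillna) is an alternative when too many rows would otherwise be lost.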
For this lab, you will be provided with two dataset files in .csv format, which you
will need for the tasks. Additionally, for the final task, you will need to obtain your
own dataset by downloading it from the internet. You will need to import pandas and
matplotlib.pyplot for the given tasks.
### TASK 2 OUTPUT SCREENSHOTS START HERE ###
x1 and x2, choose any 2 columns from the dataset and also mention the
columns that you are using.
d) Load the cleaned dataset 2 and make a 3-D scatter plot of any three
features in the dataset (axes x1, x2, x3).
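A 3-D scatter plot of three features can be sketched as below. The data here is randomly generated as a stand-in, and the column names x1, x2, x3 are placeholders; in the lab you would load the cleaned dataset 2 with pd.read_csv and pick three of its columns. The Agg backend is selected only so the sketch runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Stand-in data (assumption): 50 random rows with three feature columns.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(50, 3)), columns=["x1", "x2", "x3"])

# Create a figure with a single 3-D axes.
fig = plt.figure()
ax = fig.add_subplot(projection="3d")

# One point per row, positioned by the three chosen features.
ax.scatter(df["x1"], df["x2"], df["x3"])

# Label each axis with the feature it shows.
ax.set_xlabel("x1")
ax.set_ylabel("x2")
ax.set_zlabel("x3")
```

To produce a screenshot for the report, either call fig.savefig with a file name of your choice or remove the backend line and call plt.show().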
Lab Task 6 – Your Own Dataset
Download your own CSV dataset from the internet (e.g. Kaggle). Your dataset
must have at least 500 rows and at least 2 feature columns. It must also have
a labels column with binary classification data (0/1). Make a scatter plot
with the two features on the axes and show the two label values with different
markers. Provide all of your code and screenshots of the plots. You will also
need to submit the downloaded dataset with your report (renamed as
lab4_task6.csv). Note that no two submitted datasets may be identical.
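One way to check the requirements and draw the labelled scatter plot is sketched below. The DataFrame is randomly generated as a stand-in, and the column names x1, x2 and label are assumptions; with your own file you would load it with pd.read_csv("lab4_task6.csv") and substitute its actual column names. The Agg backend is selected only so the sketch runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Stand-in data (assumption): 500 rows, two features, one 0/1 label column.
rng = np.random.default_rng(0)
n_rows = 500
df = pd.DataFrame({
    "x1": rng.normal(size=n_rows),
    "x2": rng.normal(size=n_rows),
    "label": rng.integers(0, 2, size=n_rows),
})

# Check the task requirements before plotting.
feature_cols = [col for col in df.columns if col != "label"]
meets_requirements = (
    len(df) >= 500
    and len(feature_cols) >= 2
    and set(df["label"].unique()) <= {0, 1}
)

# Plot each label value separately so it gets its own marker and legend entry.
fig, ax = plt.subplots()
markers = {0: "o", 1: "x"}
for value, marker in markers.items():
    subset = df[df["label"] == value]
    ax.scatter(subset["x1"], subset["x2"], marker=marker, label=f"label = {value}")

ax.set_xlabel("x1")
ax.set_ylabel("x2")
ax.legend()
```

Plotting one scatter call per label value keeps the marker choice explicit and makes the legend entries line up with the classes.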