
Report On Petroleum Consumption Data Analytics: - Submitted by

The document is a report on analyzing petroleum consumption data. It includes sections on reading in client data using Pandas, checking for null values and renaming columns, performing statistics calculations, creating boxplots and correlation matrices, fitting multiple linear regression and classification models, and generating a confusion matrix to evaluate model performance. The report was created by a group of students for a college project under the supervision of their institution.

Uploaded by

Ayush Sharma

Report On Petroleum Consumption Data Analytics
• Submitted By:
Ujjawal Tyagi
Ayush Sharma
Shubham Singh
Anshul Gupta
Shivam Prajapati

• Under the Supervision of:

Institute Of Technology And Science


Certificate

This is to certify that our group has carried out the project work
presented in the report titled “Petroleum Consumption Data Analytics” from
Institute Of Technology And Science, Mohan Nagar, under my supervision.
The report embodies the results of original work and studies carried out
by the students themselves.

Date : 09:02:2022

Institute of Technology And Science


Acknowledgement

The project was created by Ujjwal Tyagi, Ayush Sharma, Shubham Singh,
Anshul Gupta, and Shivam Prajapati. Our topic is “Petroleum
Consumption Data Analytics”. The project was done under the
supervision of our college, Institute of Technology and Science.
Abstract

The topic “Petroleum Consumption Data Analytics” is about analysing
data on petroleum industry consumers. The data is divided into five
columns and is imported from an Excel sheet saved as a .csv file.

In this project we import the pandas library, which is used to read the
client data and records. After that we perform operations such as
computing the mean, median, and mode. Then we apply multiple linear
regression and classification, and finally we compute the confusion matrix.
Technology Used

Hardware Used:
4 GB RAM
Pentium 4 or higher processor
512 GB or above

Software Used:
Windows 10

IDE Used:
Jupyter

Language used:
Python
CONTENTS

1. Read client data and records.
2. Checking null values.
3. Perform statistics and calculations.
4. Boxplot.
5. Boxplot for grouped data.
6. Correlation matrix.
7. Histogram.
8. Multiple Linear Regression.
9. Classification.
10. Confusion Matrix.
READ CLIENT DATA AND RECORDS

Here, the Pandas read_csv() function imports a CSV file into DataFrame format.
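
A minimal sketch of this step might look as follows. The report does not give the file name or the column names, so the inline CSV text and the five column names used here are hypothetical stand-ins for the real file.

```python
import io

import pandas as pd

# Hypothetical stand-in for the project's CSV file: the real file name and
# column names are not stated in the report.
csv_text = """tax,income,highways,licence_share,consumption
9.0,3571,1976,0.525,541
9.0,4092,1250,0.572,524
7.5,4870,2351,0.529,414
8.0,4399,431,0.544,410
"""

# read_csv accepts a file path or any file-like object and returns a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

print(df.head())   # first rows of the imported data
print(df.shape)    # (number of rows, number of columns)
```

With a real file, the call would take the path instead, for example `pd.read_csv("data.csv")`, where the file name is again only an assumption.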


CHECKING NULL VALUES AND RENAME DATA FRAME

Here, the function isnull() checks each column of the dataset for missing values.

Using Pandas, we then rename the columns of the DataFrame.
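
The two steps can be sketched as below. The small DataFrame and both the old and new column names are hypothetical, chosen only to illustrate isnull() and rename().

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value; not taken from the report.
df = pd.DataFrame({
    "Tax": [9.0, 7.5, np.nan],
    "Avg Income": [3571, 4870, 4399],
})

# isnull() marks each cell as True/False; summing counts missing values per column.
null_counts = df.isnull().sum()
print(null_counts)

# rename() returns a new DataFrame with the mapped column names.
renamed = df.rename(columns={"Tax": "tax", "Avg Income": "avg_income"})
print(renamed.columns.tolist())
```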


PERFORM STATISTICS AND CALCULATIONS

Here, we perform some operations on the dataset, such as computing the
mean, median, and mode with the mean(), median(), and mode() functions.
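
As an illustration, the three functions can be applied to a single column. The values and the column name "consumption" are made up for this sketch.

```python
import pandas as pd

# Hypothetical column of values; not taken from the report's dataset.
consumption = pd.Series([541, 524, 414, 410, 410], name="consumption")

mean_value = consumption.mean()      # arithmetic average
median_value = consumption.median()  # middle value of the sorted data
mode_values = consumption.mode()     # most frequent value(s), returned as a Series

print(mean_value, median_value, mode_values.tolist())
```

Note that mode() returns a Series rather than a single number, because a column can have several equally frequent values.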


CREATE BOXPLOT

Here, with the help of the Matplotlib library that we imported,
we plot the data as a boxplot.
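
A boxplot of one column could be produced roughly like this. The data and the column name are hypothetical, and the non-interactive "Agg" backend is chosen here only so the sketch runs without a display; in Jupyter the plot would normally appear inline.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, an assumption for this sketch
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical values; not taken from the report's dataset.
df = pd.DataFrame({"consumption": [541, 524, 414, 410, 457, 344, 467, 968]})

fig, ax = plt.subplots()
# boxplot() draws the median, quartiles, whiskers, and outliers of the column.
box = ax.boxplot(df["consumption"].to_numpy())
ax.set_ylabel("consumption")
ax.set_title("Boxplot of consumption")

# Render the figure to an in-memory PNG instead of a file on disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```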
BOXPLOT FOR GROUPED DATA
CORRELATION MATRIX

Here, we correlate the data values of all columns of the DataFrame with the
corr() function, which computes the pairwise correlation of columns.
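
A small sketch of corr() follows. The columns are hypothetical and deliberately constructed so that the correlations are exactly +1 and -1, which makes the output easy to read.

```python
import pandas as pd

# Hypothetical, constructed data: "income" rises with "tax",
# while "consumption" falls as "tax" rises.
df = pd.DataFrame({
    "tax": [1.0, 2.0, 3.0, 4.0],
    "income": [2.0, 4.0, 6.0, 8.0],
    "consumption": [8.0, 6.0, 4.0, 2.0],
})

# corr() returns a square DataFrame of pairwise Pearson correlations.
corr_matrix = df.corr()
print(corr_matrix)
```

The diagonal is always 1, since every column is perfectly correlated with itself, and the matrix is symmetric.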
HISTOGRAM

Similarly, with the help of Matplotlib, we plot the histogram of the
dataset.
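
A histogram of one column might be drawn as in this sketch. The values, the column label, and the choice of four bins are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, an assumption for this sketch
import matplotlib.pyplot as plt

# Hypothetical values; not taken from the report's dataset.
values = [541, 524, 414, 410, 457, 344, 467, 968]

fig, ax = plt.subplots()
# hist() splits the value range into equal-width bins and counts values per bin.
counts, bin_edges, _ = ax.hist(values, bins=4)
ax.set_xlabel("consumption")
ax.set_ylabel("frequency")

print(counts)     # number of values in each bin
print(bin_edges)  # the bin boundaries
```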
MULTIPLE LINEAR REGRESSION
As shown above, multiple linear regression is a statistical
technique that uses several explanatory variables to predict the outcome
of a response variable.
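
The idea can be sketched with an ordinary least-squares fit. The report does not show which library was used for this step, so NumPy's lstsq is used here as a stand-in, and the data is constructed to follow y = 2 + 3*x1 - 1*x2 exactly so the recovered coefficients can be checked.

```python
import numpy as np

# Hypothetical data with two explanatory variables (columns x1 and x2).
X = np.array([
    [1.0, 2.0],
    [2.0, 1.0],
    [3.0, 4.0],
    [4.0, 3.0],
    [5.0, 5.0],
])
# Response constructed as y = 2 + 3*x1 - 1*x2, with no noise.
y = 2.0 + 3.0 * X[:, 0] - 1.0 * X[:, 1]

# Prepend a column of ones so the model can learn an intercept.
design = np.column_stack([np.ones(len(X)), X])

# Least-squares solution of design @ coef = y.
coef, _, _, _ = np.linalg.lstsq(design, y, rcond=None)
intercept, slopes = coef[0], coef[1:]

# Predict the response for a new observation (x1 = 6, x2 = 2).
prediction = intercept + slopes @ np.array([6.0, 2.0])
print(intercept, slopes, prediction)
```

With real data the fit would not be exact, and the coefficients would describe how much the response changes per unit change in each variable while the others are held fixed.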
CLASSIFICATION
CONFUSION MATRIX

Here, a table is used to describe the performance of the classification
model on the dataset. This table, shown above, is known as a
‘CONFUSION TABLE’ or confusion matrix.
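
As a rough illustration, a confusion matrix can be built by cross-tabulating actual against predicted labels. The labels "high" and "low" and both label lists are hypothetical, and pandas crosstab is used here as a simple stand-in for whatever function the project actually called.

```python
import pandas as pd

# Hypothetical true labels and classifier predictions.
actual = pd.Series(["high", "low", "high", "low", "high", "low"], name="actual")
predicted = pd.Series(["high", "low", "low", "low", "high", "high"], name="predicted")

# Rows are actual classes, columns are predicted classes;
# each cell counts how often that combination occurred.
confusion = pd.crosstab(actual, predicted)
print(confusion)

# Accuracy is the share of predictions that match the actual label.
accuracy = (actual == predicted).mean()
print(accuracy)
```

The diagonal cells count correct predictions, and the off-diagonal cells count the two kinds of misclassification.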
THANK YOU
