Pandas python

Uploaded by

shrawantiyarzal09

Pandas is a Python library.

Pandas is used to analyze data.

What is Pandas?
Pandas is a Python library used for working with data sets.

It has functions for analyzing, cleaning, exploring, and manipulating data.

The name "Pandas" refers to both "Panel Data" and "Python Data
Analysis". The library was created by Wes McKinney in 2008.

Why Use Pandas?

Pandas allows us to analyze big data and draw conclusions based on statistical
theories.

Pandas can clean messy data sets, and make them readable and relevant.

Relevant data is very important in data science.

What Can Pandas Do?

Pandas gives you answers about the data, such as:
• Is there a correlation between two or more columns?
• What is the average value?
• What is the maximum value?
• What is the minimum value?
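As a minimal sketch of these questions in code, the example below assumes a small made-up table; the column names `hours` and `score` and their values are invented for illustration.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only.
df = pd.DataFrame({
    "hours": [1, 2, 3, 4, 5],
    "score": [52, 58, 65, 71, 80],
})

# Correlation between the two columns (Pearson by default).
corr = df["hours"].corr(df["score"])

# Average, maximum, and minimum of one column.
avg_score = df["score"].mean()
max_score = df["score"].max()
min_score = df["score"].min()

print(corr, avg_score, max_score, min_score)
```

Calling `df.corr()` on the whole DataFrame would return the correlation between every pair of numeric columns instead of a single number.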

Pandas is also able to delete rows that are not relevant or that contain wrong
values, such as empty or NULL values. This is called cleaning the data.
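One common way to drop rows with empty values is `dropna()`. The sketch below assumes a tiny made-up table with one missing age; the names and numbers are not from the original text.

```python
import pandas as pd

# Hypothetical data with one missing value (None is stored as NaN).
df = pd.DataFrame({
    "name": ["Ann", "Ben", "Cid"],
    "age": [31, None, 45],
})

# dropna() returns a new DataFrame without the rows that contain missing values.
cleaned = df.dropna()
print(cleaned)
```

The original `df` is left unchanged, so the cleaned result has to be assigned to a variable to be kept.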

Key Features of Pandas

• Fast and efficient DataFrame object with default and customized indexing.
• Tools for loading data into in-memory data objects from different file formats.
• Data alignment and integrated handling of missing data.
• Reshaping and pivoting of data sets.
• Label-based slicing, indexing and subsetting of large data sets.
• Columns from a data structure can be deleted or inserted.
• Group by data for aggregation and transformations.
• High performance merging and joining of data.
• Time Series functionality.
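Two of the features listed above, group-by aggregation and merging, can be sketched as follows. Both tables and all their values are hypothetical and exist only to make the example runnable.

```python
import pandas as pd

# Hypothetical tables, invented for illustration.
sales = pd.DataFrame({
    "rep": ["Steve", "Lia", "Steve", "Lia"],
    "amount": [100, 150, 200, 50],
})
regions = pd.DataFrame({
    "rep": ["Steve", "Lia"],
    "region": ["North", "South"],
})

# Group by a label and aggregate each group.
totals = sales.groupby("rep", as_index=False)["amount"].sum()

# Join the aggregated result with a second table on the shared column.
report = totals.merge(regions, on="rep")
print(report)
```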

Installation of Pandas
If you have Python and PIP already installed on a system, then installation of
Pandas is very easy.

Install it using this command:

C:\Users\Your Name>pip install pandas

If this command fails, use a Python distribution that already has Pandas
installed, such as Anaconda or Spyder.

Import Pandas
Once Pandas is installed, import it in your applications by adding the import keyword:

import pandas
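The examples later in this document import Pandas under the alias `pd`, which is a widely used convention rather than a requirement. A quick check that the installation works might look like this:

```python
import pandas as pd  # "pd" is the alias used in the later examples

# Printing the version is a simple way to confirm the import succeeded.
print(pd.__version__)
```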

Pandas deals with the following three data structures −

• Series
• DataFrame
• Panel (deprecated in pandas 0.20 and absent from current releases; described here for completeness)

These data structures are built on top of NumPy arrays, which means
they are fast.

Dimension & Description

The best way to think of these data structures is that the higher
dimensional data structure is a container of its lower dimensional
data structure. For example, DataFrame is a container of Series,
Panel is a container of DataFrame.

Data Structure   Dimensions   Description

Series           1            1D labeled homogeneous array, size-immutable.

DataFrame        2            General 2D labeled, size-mutable tabular structure
                              with potentially heterogeneously typed columns.

Panel            3            General 3D labeled, size-mutable array.

Building and handling arrays of two or more dimensions is a tedious
task, because the burden is placed on the user to consider the orientation of the
data set when writing functions. Using Pandas data structures
reduces that mental effort.

For example, with tabular data (DataFrame) it is more semantically
helpful to think of the index (the rows) and the columns rather
than axis 0 and axis 1.
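To illustrate the point about names versus axis numbers, the sketch below sums a small made-up table both ways. The row labels, column names, and values are assumptions for the example.

```python
import pandas as pd

# Hypothetical scores; the row labels and column names are made up.
df = pd.DataFrame(
    {"math": [90, 80], "art": [70, 60]},
    index=["Steve", "Lia"],
)

# axis="index" (axis 0) collapses the rows: one result per column.
per_column = df.sum(axis="index")

# axis="columns" (axis 1) collapses the columns: one result per row.
per_row = df.sum(axis="columns")

print(per_column)
print(per_row)
```

The string names and the numbers 0 and 1 are interchangeable here; the names tend to be easier to read back later.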

Mutability

All Pandas data structures are value mutable (their contents can be changed), and
all except Series are size mutable. Series is size immutable.

Note − DataFrame is widely used and one of the most important
data structures. Panel is used much less.

Series
Series is a one-dimensional array-like structure with homogeneous
data. For example, the following series is a collection of integers 10,
23, 56, …

10 23 56 17 52 61 73 90 26 72

Key Points

• Homogeneous data
• Size Immutable
• Values of Data Mutable
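A short sketch of the first and last key points, using the integers from the example row above. The choice to overwrite position 0 with 11 is arbitrary.

```python
import pandas as pd

# The integers from the example above, with the default 0..n-1 index.
s = pd.Series([10, 23, 56, 17, 52, 61, 73, 90, 26, 72])

# Homogeneous data: a single dtype covers every element.
print(s.dtype)

# Values are mutable: assigning to an existing label changes it in place.
s[0] = 11
print(s[0], len(s))
```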

DataFrame
DataFrame is a two-dimensional array with heterogeneous data. For
example,

Name    Age   Gender   Rating
Steve   32    Male     3.45
Lia     28    Female   4.6
Vin     45    Male     3.9
Katie   38    Female   2.78

The table represents the data of a sales team of an organization
with their overall performance rating. The data is represented in
rows and columns. Each column represents an attribute and each
row represents a person.

Data Type of Columns

The data types of the four columns are as follows −

Column   Type
Name     String
Age      Integer
Gender   String
Rating   Float

Key Points

• Heterogeneous data
• Size Mutable
• Data Mutable
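The sales-team table above can be built as a DataFrame to see these points in practice. This is a sketch: the exact dtype names printed depend on the pandas version (string columns usually show as `object`, or as a dedicated string dtype in newer releases), and the `Senior` column is a hypothetical addition.

```python
import pandas as pd

# The sales-team table from above, one dict key per column.
df = pd.DataFrame({
    "Name": ["Steve", "Lia", "Vin", "Katie"],
    "Age": [32, 28, 45, 38],
    "Gender": ["Male", "Female", "Male", "Female"],
    "Rating": [3.45, 4.6, 3.9, 2.78],
})

# Heterogeneous data: each column carries its own dtype.
print(df.dtypes)

# Size mutable: a column can be added after construction.
df["Senior"] = df["Age"] >= 38  # hypothetical derived column
print(df.shape)
```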

Panel
Panel is a three-dimensional data structure with heterogeneous
data. A panel is hard to depict graphically, but it can be
pictured as a container of DataFrames.

Key Points

• Heterogeneous data
• Size Mutable
• Data Mutable

pandas.Series

A pandas Series can be created using the following constructor −

pandas.Series( data, index, dtype, copy)

The parameters of the constructor are as follows −

Sr.No   Parameter & Description

1       data
        data takes various forms like ndarray, list, constants.

2       index
        Index values must be hashable and the same length as data; they need not be unique.
        Default np.arange(n) if no index is passed.

3       dtype
        dtype is for data type. If None, data type will be inferred.

4       copy
        Copy data. Default False.
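The constructor parameters above can be sketched as follows. The values and labels are arbitrary, and `copy` is left at its default.

```python
import pandas as pd

# data: a plain list; index: custom labels; dtype: force floating point.
s = pd.Series(data=[1, 2, 3], index=["a", "b", "c"], dtype="float64")
print(s)

# Omitting index falls back to the default integer labels 0..n-1.
t = pd.Series([1, 2, 3])
print(list(t.index))
```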

A series can be created using various inputs like −

• Array
• Dict
• Scalar value or constant

Create an Empty Series

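The most basic series that can be created is an empty Series.

Example
#import the pandas library and aliasing as pd
import pandas as pd
#pass dtype explicitly; the default dtype of an empty Series varies across pandas versions
s = pd.Series(dtype='float64')
print(s)
Its output is as follows −
Series([], dtype: float64)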

Create a Series from ndarray

If data is an ndarray, then index passed must be of the same length.
If no index is passed, then by default index will
be range(n) where n is array length, i.e.,
[0, 1, 2, ..., len(array) - 1].
Example 1
#import the pandas library and aliasing as pd
import pandas as pd
import numpy as np
data = np.array(['a','b','c','d'])
s = pd.Series(data)
print(s)
Its output is as follows −
0    a
1    b
2    c
3    d
dtype: object

Example 2
#import the pandas library and aliasing as pd
import pandas as pd
import numpy as np
data = np.array(['a','b','c','d'])
s = pd.Series(data,index=[100,101,102,103])
print(s)
Its output is as follows −
100    a
101    b
102    c
103    d
dtype: object

We passed the index values here. Now we can see the customized
indexed values in the output.

Create a Series from dict

A dict can be passed as input and if no index is specified, then the
dictionary keys are used, in insertion order, to construct the index
(older pandas versions sorted them).
If index is passed, the values in data corresponding to the labels in
the index will be pulled out.
Example 1
#import the pandas library and aliasing as pd
import pandas as pd
data = {'a' : 0., 'b' : 1., 'c' : 2.}
s = pd.Series(data)
print(s)
Its output is as follows −
a    0.0
b    1.0
c    2.0
dtype: float64
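The text above also mentions passing an index with a dict, and lists a scalar as a possible input. Both cases can be sketched as below; the label `'d'` and the scalar `5` are arbitrary choices for illustration.

```python
import pandas as pd

data = {'a': 0., 'b': 1., 'c': 2.}

# Labels are looked up in the dict; 'd' has no entry, so it becomes NaN.
s = pd.Series(data, index=['b', 'c', 'd', 'a'])
print(s)

# A scalar is repeated to match the length of the index.
t = pd.Series(5, index=[0, 1, 2, 3])
print(t)
```

The order of the resulting Series follows the index that was passed, not the order of the dict keys.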
Python Pandas - DataFrame

A DataFrame is a two-dimensional data structure, i.e., data is
aligned in a tabular fashion in rows and columns.

Features of DataFrame
• Columns can be of different types
• Size-mutable
• Labeled axes (rows and columns)
• Arithmetic operations can be performed on rows and columns
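The last feature, arithmetic on rows and columns, might look like this in practice. The student names and marks are hypothetical.

```python
import pandas as pd

# Hypothetical student marks, invented for illustration.
df = pd.DataFrame(
    {"maths": [80, 65], "science": [70, 75]},
    index=["Asha", "Ravi"],
)

# Column arithmetic: operate on whole columns at once.
df["total"] = df["maths"] + df["science"]

# Row arithmetic: select rows by label with .loc and subtract them.
difference = df.loc["Asha"] - df.loc["Ravi"]

print(df)
print(difference)
```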
Structure

Let us assume that we are creating a data frame with student's
data.

You can think of it as an SQL table or a spreadsheet data
representation.

pandas.DataFrame
A pandas DataFrame can be created using the following constructor −

pandas.DataFrame(data, index, columns, dtype, copy)

The parameters of the constructor are as follows −

Sr.No   Parameter & Description

1       data
        data takes various forms like ndarray, series, map, lists, dict, constants and also
        another DataFrame.

2       index
        Row labels to use for the resulting frame. Optional;
        default np.arange(n) if no index is passed.

3       columns
        Column labels to use for the resulting frame. Optional;
        default np.arange(n) if no column labels are passed.

4       dtype
        Data type to force on all columns; only a single dtype is allowed.
        If None, it is inferred.

5       copy
        Whether to copy the input data. Default False in the pandas
        versions this tutorial was written for.
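A minimal sketch of the constructor parameters described above, assuming a small NumPy array as input. The labels `r1`, `r2`, `a`, `b`, and `c` are made up for the example.

```python
import numpy as np
import pandas as pd

# data: a 2x3 ndarray; index and columns supply the row and column labels.
df = pd.DataFrame(
    data=np.array([[1, 2, 3], [4, 5, 6]]),
    index=["r1", "r2"],
    columns=["a", "b", "c"],
    dtype="float64",
)
print(df)

# Without index/columns, both default to integer labels starting at 0.
plain = pd.DataFrame(np.array([[1, 2, 3], [4, 5, 6]]))
print(list(plain.index), list(plain.columns))
```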
Create DataFrame

A pandas DataFrame can be created using various inputs like −

• Lists
• dict
• Series
• Numpy ndarrays
• Another DataFrame

In the subsequent sections of this chapter, we will see how to create
a DataFrame using these inputs.

Create an Empty DataFrame

The most basic DataFrame that can be created is an empty DataFrame.

Example
#import the pandas library and aliasing as pd
import pandas as pd
df = pd.DataFrame()
print (df)
Its output is as follows −
Empty DataFrame
Columns: []
Index: []
Create a DataFrame from Lists
The DataFrame can be created using a single list or a list of lists.

Example 1
import pandas as pd
data = [1,2,3,4,5]
df = pd.DataFrame(data)
print(df)
Its output is as follows −
   0
0  1
1  2
2  3
3  4
4  5

Example 2
import pandas as pd
data = [['Alex',10],['Bob',12],['Clarke',13]]
df = pd.DataFrame(data,columns=['Name','Age'])
print(df)
Its output is as follows −
     Name  Age
0    Alex   10
1     Bob   12
2  Clarke   13
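Example 3
import pandas as pd
data = [['Alex',10],['Bob',12],['Clarke',13]]
df = pd.DataFrame(data,columns=['Name','Age'])
#convert only the Age column; dtype=float in the constructor would also try to convert Name
df = df.astype({'Age': float})
print(df)
Its output is as follows −
     Name   Age
0    Alex  10.0
1     Bob  12.0
2  Clarke  13.0
Note − Observe, the astype call changes the type of Age
column to floating point. The original example passed dtype=float to the
constructor, which recent pandas versions may reject because the Name
column cannot be converted to float.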
Create a DataFrame from Dict of ndarrays / Lists
All the ndarrays must be of the same length. If index is passed, then
the length of the index should equal the length of the arrays.
If no index is passed, then by default, index will be range(n),
where n is the array length.
Example 1
import pandas as pd
data = {'Name':['Tom', 'Jack', 'Steve', 'Ricky'],'Age':[28,34,29,42]}
df = pd.DataFrame(data)
print(df)
Its output is as follows −
    Name  Age
0    Tom   28
1   Jack   34
2  Steve   29
3  Ricky   42
Note − Observe the values 0,1,2,3. They are the default index
assigned to each using the function range(n).
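The case where an index is passed can be sketched as follows. The labels `rank1` to `rank4` are hypothetical; any list of four hashable labels would do.

```python
import pandas as pd

data = {'Name': ['Tom', 'Jack', 'Steve', 'Ricky'], 'Age': [28, 34, 29, 42]}

# The index list must have the same length as the column lists (4 here).
df = pd.DataFrame(data, index=['rank1', 'rank2', 'rank3', 'rank4'])
print(df)

# Rows can now be selected by label.
print(df.loc['rank2', 'Name'])
```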

Pandas DataFrame to Excel

You can save or write a DataFrame to an Excel file, or to a specific
sheet in an Excel file, using the pandas.DataFrame.to_excel() method.
The reverse operation, reading an Excel file into a DataFrame, uses
pandas.read_excel().
Prerequisite
To work with .xlsx files in pandas, you have to install the openpyxl
module. To install openpyxl using pip, run the following pip command.
pip install openpyxl

Example

#read an existing workbook into a DataFrame
import pandas as pd
df= pd.read_excel('D:\\rr.xlsx')
print(df)
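Writing with `to_excel()` can be sketched as a round trip: write a small DataFrame, then read it back. The file name `people.xlsx`, the sheet name `People`, and the data are assumptions for illustration, and the write step is skipped when openpyxl is not installed.

```python
import importlib.util
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"Name": ["Tom", "Jack"], "Age": [28, 34]})

# Hypothetical file name inside a temporary directory, so nothing is overwritten.
path = os.path.join(tempfile.mkdtemp(), "people.xlsx")

round_trip = None
if importlib.util.find_spec("openpyxl") is not None:
    # index=False leaves the 0..n-1 row labels out of the sheet.
    df.to_excel(path, sheet_name="People", index=False)
    # Read the same sheet back to confirm what was written.
    round_trip = pd.read_excel(path, sheet_name="People")
    print(round_trip)
else:
    print("openpyxl is not installed; skipping the Excel round trip")
```

Leaving out `index=False` would write the row labels as an extra first column, which then comes back as an unnamed column on reading.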
