1 Data Handling Using Pandas 1

INFORMATICS PRACTICES
Code No-065
CLASS-XII
2020-2021
By
B.Naresh, PGT Computer Science

JNV Khammam
Blue Print:
Unit No  Unit Name                                          Marks
1        Data Handling using Pandas and Data Visualization  30
2        Database Query using SQL                           25
3        Introduction to Computer Networks                  7
4        Societal Impacts                                   8
         Practical                                          30
         Total                                              100
Unit 1
Data Handling using Pandas and Data
Visualization
(Data Handling using Pandas –I)

Module: A module is a .py file that contains Python functions and
other executable code or statements.
Package: A package is a namespace that contains multiple
packages or modules. It is a directory that contains a special
file, __init__.py.
The __init__.py file tells Python to treat the directory that
contains it as a package.
Library: A library is a collection of various packages. Conceptually,
there is no difference between a package and a Python library.
Framework: A framework is a collection of various libraries that
also dictates the flow of the code.
Pandas:
Pandas is one of the most popular open-source Python libraries
used for data analysis.
Pandas provides two main data structures for analysing data:
● Series
● Data Frames
Installation of pandas:
pip install pandas

Series:
A Series is a one-dimensional array defined in pandas that can
store data of any type.
Syntax:
<Series name> = pd.Series(<list name>, ...)
Here pd is the usual alias for pandas (import pandas as pd).

Example (the values of a Series):
5 15 16 4 34
Properties of Series:
• A Series holds values of a single (homogeneous) data type.
• The size of a Series is immutable.
• The values in a Series are mutable.
Creation of Series:
We can create a pandas Series in the following ways:
● From arrays
● From Lists
● From Dictionaries
● From scalar value
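The four ways of creating a Series can be sketched as follows. This is a minimal illustration, not the original slide code; the variable names and the sample values (other than 5, 15, 16, 4, 34) are assumptions.

```python
import numpy as np
import pandas as pd

# From a list: the default index is 0, 1, 2, ...
s_list = pd.Series([5, 15, 16, 4, 34])

# From a NumPy array
s_array = pd.Series(np.array([10, 20, 30]))

# From a dictionary: the keys become the index labels
s_dict = pd.Series({"a": 1, "b": 2, "c": 3})

# From a scalar value: the value is repeated for every index label
s_scalar = pd.Series(7, index=["x", "y", "z"])

print(s_list, s_array, s_dict, s_scalar, sep="\n\n")
```

When a scalar is used, the index argument decides how many times the value is repeated.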
Mathematical Operations on Series:
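A small sketch of arithmetic on Series, assuming two illustrative Series named s1 and s2 (not from the original slides). It shows that a scalar is applied to every element and that two Series are combined by matching index labels.

```python
import pandas as pd

s1 = pd.Series([10, 20, 30], index=["a", "b", "c"])
s2 = pd.Series([1, 2, 3], index=["a", "b", "d"])

print(s1 + 5)    # a scalar is applied to every element
print(s1 * 2)

total = s1 + s2  # values are matched by index label, not by position
print(total)     # labels found in only one Series ("c", "d") give NaN
```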
Head and Tail functions on Series:
The head() and tail() functions return the first and last n rows
respectively.
Syntax:
<Series name>.head(n)
<Series name>.tail(n)
n - number of rows
The default value of n is 5.
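For illustration, head() and tail() on a ten-element Series (the data here is an assumption chosen to make the result easy to read):

```python
import pandas as pd

s = pd.Series(range(10, 110, 10))  # 10, 20, ..., 100

first_five = s.head()    # n defaults to 5
first_three = s.head(3)
last_two = s.tail(2)

print(first_five, first_three, last_two, sep="\n\n")
```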
Selection, Indexing and Slicing on Series:
Selection: We can select a value from a Series by using its
corresponding index.
Syntax:
<Series name>[<index>]
Indexing:
The Series.index attribute is used to get or set the index labels
of a Series.
Syntax:
<Series name>.index
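A minimal sketch of selection and of the index attribute. The Series name marks, the extra student "Asha" and the labels r1 to r3 are hypothetical.

```python
import pandas as pd

marks = pd.Series([99, 98, 85], index=["Ravi", "Kunal", "Asha"])

print(marks["Kunal"])  # select a value by its index label
print(marks.iloc[0])   # select a value by its position
print(marks.index)     # get the index labels

# Set new index labels; the new list must be as long as the Series
marks.index = ["r1", "r2", "r3"]
print(marks)
```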
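Slicing:
Slicing extracts a part of a Series based on the given start, stop
and step values.
Syntax:
<Series name>[<start>:<stop>:<step>]
Note: start, stop and step are optional.
Default values: start=0, stop=n (the number of elements; the
element at position stop is not included), step=1
Note: slicing with integers uses the default positional index
(0 to n-1), even when the Series has custom index labels.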
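The slicing rules can be sketched as follows, assuming a Series with custom string labels so that the positional behaviour is visible (the labels are illustrative):

```python
import pandas as pd

s = pd.Series([5, 15, 16, 4, 34], index=["a", "b", "c", "d", "e"])

middle = s[1:4]       # positions 1, 2, 3; position 4 is excluded
every_other = s[::2]  # start and stop omitted, step 2
reversed_s = s[::-1]  # a negative step walks backwards

print(middle, every_other, reversed_s, sep="\n\n")
```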
Data Frames:
A Data Frame is a two-dimensional (2-D) data structure defined in
pandas that consists of rows and columns.
A Data Frame stores an ordered collection of columns, and each
column can store data of a different type.
Example:
S.No.  Name   Age  Marks
1      Ravi   25   99
2      Kunal  26   98
Characteristics of Data Frames:
➢ It has two indices (two axes)
○ Row index (axis=0) ->known as index
○ Column index (axis=1) ->known as column-name
➢ Value in the Data Frame will be identifiable by the
combination of row index and column index.
➢ Indices can be of any type
➢ Different columns can hold data of different types.
➢ Values are mutable
➢ Size is mutable
Creation of Data Frames:
Syntax:
<Data Frame name> = pandas.DataFrame(<2D data structure>,
    columns=<column sequence>,
    index=<index sequence>, ...)
We can create a Data Frame in many ways, for example from:
(i) Two dimensional dictionaries
(ii) Two dimensional ndarrays(NumPy arrays)
(iii) Series type object
(iv) Another Dataframe object
(v) Text/CSV files
Creating a Data Frame from a list, an array, a Series or another
Data Frame:
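A minimal sketch of these four cases, reusing the Ravi/Kunal sample data from the example table. The variable names are assumptions, and the original slide code may have differed.

```python
import numpy as np
import pandas as pd

# From a list of lists: each inner list is one row
df_list = pd.DataFrame([["Ravi", 25, 99], ["Kunal", 26, 98]],
                       columns=["Name", "Age", "Marks"])

# From a 2-D NumPy array
df_array = pd.DataFrame(np.array([[25, 99], [26, 98]]),
                        columns=["Age", "Marks"], index=[1, 2])

# From a Series: the Series becomes a single column
df_series = pd.DataFrame(pd.Series([99, 98], name="Marks"))

# From another Data Frame: copy() gives an independent object
df_copy = pd.DataFrame(df_list).copy()

print(df_list, df_array, df_series, df_copy, sep="\n\n")
```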
(i) Two dimensional dictionaries
We can create a Data Frame from two-dimensional dictionaries in
two ways:
➢ Creating a Data Frame from a list of dictionaries
➢ Creating a Data Frame from a dictionary of Series
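Both dictionary-based approaches can be sketched like this, assuming illustrative data. One key is left out on purpose to show how pandas fills the gap.

```python
import pandas as pd

# List of dictionaries: each dictionary is one row, keys are column names
rows = [{"Name": "Ravi", "Age": 25},
        {"Name": "Kunal", "Age": 26, "Marks": 98}]
df_rows = pd.DataFrame(rows)  # the missing "Marks" for Ravi becomes NaN

# Dictionary of Series: each Series is one column, aligned on index labels
cols = {"Age": pd.Series([25, 26], index=["Ravi", "Kunal"]),
        "Marks": pd.Series([99, 98], index=["Ravi", "Kunal"])}
df_cols = pd.DataFrame(cols)

print(df_rows, df_cols, sep="\n\n")
```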
(v) Text/CSV files:
We can create a Data Frame from a text/CSV file by using the
read_csv() function.
Syntax:
<data frame name> = pandas.read_csv(filepath_or_buffer, sep=',',
    delimiter=None, header='infer', names=None,
    index_col=None, usecols=None, …)
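To try read_csv() without a file on disk, the sketch below passes an in-memory buffer as filepath_or_buffer. The CSV text and the column name SNo are assumptions made for the example.

```python
import io
import pandas as pd

csv_text = "SNo,Name,Age,Marks\n1,Ravi,25,99\n2,Kunal,26,98\n"

# io.StringIO makes the string behave like an open text file
df = pd.read_csv(io.StringIO(csv_text),
                 index_col="SNo",                  # use SNo as the row index
                 usecols=["SNo", "Name", "Marks"]) # read only these columns
print(df)
```

With a real file, the first argument would simply be the path to that file.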
Accessing values in a Data Frame:
Accessing a particular value:
<Data frame name>[<column name>][<index>]
Accessing a group of values:
<Data frame name>.loc[<index>, <column name>]
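A short sketch of both access styles on the sample table, assuming row labels 1 and 2. Note the order: column first with plain brackets, row first with loc.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Ravi", "Kunal"],
                   "Age": [25, 26],
                   "Marks": [99, 98]}, index=[1, 2])

one_value = df["Marks"][2]       # column first, then row label
same_value = df.loc[2, "Marks"]  # row label first, then column

# Lists of labels select several rows and columns at once
group = df.loc[[1, 2], ["Name", "Marks"]]

print(one_value, same_value)
print(group)
```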
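NaN in Python:
NaN, standing for "not a number", is a special floating-point value
used to represent any value that is undefined or unrepresentable.
For example, 0/0 is undefined as a real number and is, therefore,
represented by NaN in NumPy and pandas calculations (plain Python
raises ZeroDivisionError instead). Pandas also uses NaN to mark
missing values.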
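A minimal sketch of how NaN behaves inside a Series, assuming NumPy's np.nan as the source of the value (the data is illustrative):

```python
import numpy as np
import pandas as pd

s = pd.Series([99, np.nan, 85])

print(s.dtype)           # float64: NaN is a floating-point value
print(np.nan == np.nan)  # False: NaN is not equal to anything, even itself
print(s.isna())          # isna() marks the missing positions
print(s.sum())           # NaN is skipped by default when summing
```

Because NaN never compares equal to itself, isna() is the reliable way to test for it.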
Iteration on Dataframes:
In a pandas Data Frame we can iterate over the data in two ways:
● Iterating over rows
● Iterating over columns
Iterating over rows:
To iterate over a Data Frame, we can use the following functions:
● iterrows() − iterates over the rows as (index, Series) pairs
● iteritems() − iterates over the columns as (column name, Series)
  pairs; it was removed in pandas 2.0, where items() does the same job
● itertuples() − iterates over the rows as namedtuples
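The three iteration functions can be sketched as below. The sketch uses items(), which is available in both older and newer pandas versions, in place of iteritems(); the sample data is an assumption.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Ravi", "Kunal"], "Marks": [99, 98]})

# iterrows(): each row as an (index, Series) pair
for idx, row in df.iterrows():
    print(idx, row["Name"], row["Marks"])

# items(): each column as a (column name, Series) pair
for col_name, col in df.items():
    print(col_name, col.tolist())

# itertuples(): each row as a namedtuple; the index is the first field
for t in df.itertuples():
    print(t.Index, t.Name, t.Marks)
```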
Iterating over columns: To iterate over columns, we create a
list of the Data Frame's column names and then loop through that
list to pull out each column.
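As a sketch of the column loop, assuming the same small sample Data Frame:

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Ravi", "Kunal"], "Marks": [99, 98]})

column_names = list(df.columns)  # list(df) gives the same names
for name in column_names:
    print(name, df[name].tolist())
```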
Operations on rows and columns:
● Add
● Select
● Delete
● Rename
Column selection, addition, deletion and renaming:
Row selection, addition, deletion and renaming:
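The same four operations on rows can be sketched with loc, drop() and rename(). The added row ("Asha", 85) and the row labels are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Ravi", "Kunal"], "Marks": [99, 98]},
                  index=[1, 2])

row = df.loc[1]               # select a row by its index label
df.loc[3] = ["Asha", 85]      # add a row with a new label
df = df.drop(index=2)         # delete a row
df = df.rename(index={3: 2})  # rename a row label

print(df)
```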
Head and Tail functions in Data Frames:
head(n):
Returns the first n rows.
tail(n):
Returns last n rows.
Default value for n is 5
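For illustration, head() and tail() on a ten-row Data Frame (the column names Roll and Marks and their values are assumptions):

```python
import pandas as pd

df = pd.DataFrame({"Roll": range(1, 11), "Marks": range(50, 100, 5)})

top = df.head(3)     # first three rows
bottom = df.tail(2)  # last two rows
default = df.head()  # n defaults to 5

print(top, bottom, default, sep="\n\n")
```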
Indexing using Labels in Data Frames: We can make one of
the columns the row index of the Data Frame by using the
set_index() function.
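A minimal sketch of set_index(), assuming the Name column is chosen as the new row index:

```python
import pandas as pd

df = pd.DataFrame({"SNo": [1, 2],
                   "Name": ["Ravi", "Kunal"],
                   "Marks": [99, 98]})

indexed = df.set_index("Name")  # returns a new Data Frame; df is unchanged
print(indexed)
print(indexed.loc["Kunal", "Marks"])  # rows can now be looked up by name
```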
Boolean indexing in Data Frames: Boolean indexing helps us
to select rows from a Data Frame using a vector of True/False
values.
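Boolean indexing can be sketched as follows. The threshold of 90 and the third student are assumptions chosen for the example.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Ravi", "Kunal", "Asha"],
                   "Marks": [99, 98, 85]})

mask = df["Marks"] > 90  # Boolean Series: True, True, False
toppers = df[mask]       # keeps only the rows where the mask is True

# Conditions are combined with & (and) and | (or), each in parentheses
both = df[(df["Marks"] > 90) & (df["Name"] != "Ravi")]

print(toppers, both, sep="\n\n")
```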
Joining, Merging and Concatenation on Data Frames:
Merge:
The pandas.merge() function is used for merging two Data Frames.
Its main arguments are:
● The two Data Frames to merge
● how - the type of merge: left, right, outer or inner (the
default)
● on - the name of the common column to merge on
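A sketch of merge() with different how values, assuming two illustrative Data Frames that share an SNo column but do not contain exactly the same SNo values:

```python
import pandas as pd

students = pd.DataFrame({"SNo": [1, 2, 3],
                         "Name": ["Ravi", "Kunal", "Asha"]})
marks = pd.DataFrame({"SNo": [1, 2, 4], "Marks": [99, 98, 70]})

inner = pd.merge(students, marks, how="inner", on="SNo")  # SNo in both
left = pd.merge(students, marks, how="left", on="SNo")    # all of students
outer = pd.merge(students, marks, how="outer", on="SNo")  # SNo in either

print(inner, left, outer, sep="\n\n")
```

Rows without a match on the other side get NaN in the columns that come from it.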
Join: The join() method combines Data Frames using their indexes.
Use <dataframe 1>.join(<dataframe 2>) to join.
Concatenation: Use pandas.concat(<list of Data Frames>) to
concatenate Data Frames.
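The difference between join() and concat() can be sketched with three small Data Frames (the names and values are illustrative assumptions):

```python
import pandas as pd

names = pd.DataFrame({"Name": ["Ravi", "Kunal"]}, index=[1, 2])
marks = pd.DataFrame({"Marks": [99, 98]}, index=[1, 2])
more = pd.DataFrame({"Name": ["Asha"]}, index=[3])

joined = names.join(marks)          # columns side by side, matched on index
stacked = pd.concat([names, more])  # rows stacked one below the other

print(joined, stacked, sep="\n\n")
```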
Importing/Exporting Data between CSV files and Data Frames:
Import data from a CSV file to a Data Frame: We can import data
from a CSV file into a Data Frame by using the read_csv() function.
Export data from a Data Frame to a CSV file: We can export data
from a Data Frame to a CSV file by using the to_csv() function.
Syntax:
<data frame name>.to_csv(<File Path>,.....)
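A round trip through a CSV file can be sketched as below. The file name students.csv is hypothetical, and a temporary folder is used so that nothing is left on disk.

```python
import os
import tempfile
import pandas as pd

df = pd.DataFrame({"Name": ["Ravi", "Kunal"], "Marks": [99, 98]})

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "students.csv")  # hypothetical file name
    df.to_csv(path, index=False)  # index=False leaves out the row index
    restored = pd.read_csv(path)  # read the same file back

print(restored)
```

Without index=False, the row index is written as an extra first column in the file.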
Thank you
