INFORMATICS PRACTICES

Code No-065
CLASS-XII
2020-2021

By

B.Naresh, PGT Computer Science


JNV Khammam
Blueprint:
Unit No.   Unit Name                                            Marks
1          Data Handling using Pandas and Data Visualization    30
2          Database Query using SQL                             25
3          Introduction to Computer Networks                    7
4          Societal Impacts                                     8
           Practical                                            30
           Total                                                100
Unit 1
Data Handling using Pandas and Data
Visualization

(Data Handling using Pandas –I)


Module: A module is a .py file that contains Python executable code, such as
functions and statements.
Package: A package is a namespace that contains multiple packages or
modules. It is a directory which contains a special file, __init__.py.
The __init__.py file tells Python to treat the directory that contains it
as a package.
Library: A library is a collection of various packages. Conceptually, there
is no difference between a package and a Python library.

Framework: A framework is a collection of various libraries which defines
the overall structure and flow of the code.
Pandas:
Pandas is one of the most popular open source Python libraries used for
data analysis.
Pandas provides two main data structures for analysing data:

● Series
● DataFrames
Installation of pandas:

pip install pandas


Series:
A Series is a one-dimensional labelled array defined in pandas that can
store data of any type.

Syntax:

<Series Name>=<pd>.Series(<list name>, ...)


Example:
5 15 16 4 34

Properties of Series:
• A Series contains data of a homogeneous type.
• The size of a Series is immutable.
• The values in a Series are mutable.
Creation of Series:
We can create a pandas Series in the following ways:

● From arrays
● From Lists
● From Dictionaries
● From scalar value
From Lists :
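A minimal sketch of creating a Series from a list, reusing the five values from the example above. The variable names are illustrative assumptions.

```python
import pandas as pd

# Values from the example above; the names are illustrative.
values = [5, 15, 16, 4, 34]
s = pd.Series(values)  # default index labels are 0, 1, 2, ...
print(s)
```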

From arrays :
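A minimal sketch of creating a Series from a NumPy array. The values and the index labels 'a' to 'd' are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

arr = np.array([10, 20, 30, 40])
# The optional index argument supplies one label per element.
s = pd.Series(arr, index=['a', 'b', 'c', 'd'])
print(s)
```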

From Dictionary:
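A minimal sketch of creating a Series from a dictionary: the keys become the index labels and the values become the data. The sample names and marks are illustrative.

```python
import pandas as pd

marks = {'Ravi': 99, 'Kunal': 98}
s = pd.Series(marks)  # keys -> index labels, values -> data
print(s)
```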

From Scalar Value:
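A minimal sketch of creating a Series from a scalar value. The scalar is repeated once for every index label, so an index should be given. The value 7 and the labels are assumptions.

```python
import pandas as pd

# The scalar 7 is repeated for each of the three labels.
s = pd.Series(7, index=['a', 'b', 'c'])
print(s)
```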

Mathematical Operations on Series:
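A minimal sketch of arithmetic on Series, using made-up values. Operations between two Series are aligned on index labels, so a label present in only one Series gives NaN unless a fill value is supplied.

```python
import pandas as pd

s1 = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
s2 = pd.Series([10, 20, 30], index=['b', 'c', 'd'])

doubled = s1 * 2                   # scalar operation applies to every element
total = s1 + s2                    # aligned on labels; 'a' and 'd' become NaN
filled = s1.add(s2, fill_value=0)  # a missing label is treated as 0

print(doubled)
print(total)
print(filled)
```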
Head and Tail functions on Series:
The head() and tail() functions return the first and last n rows
respectively.
Syntax:
<Series name>.head(n)
<Series name>.tail(n)
n - number of rows
The default value of n is 5.
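A minimal sketch of head() and tail() on a Series of ten made-up values:

```python
import pandas as pd

s = pd.Series(range(10, 20))  # values 10, 11, ..., 19

first_five = s.head()    # n defaults to 5
last_three = s.tail(3)

print(first_five)
print(last_three)
```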
Selection, Indexing and Slicing on Series:
Selection: We can select a value from the series by using its
corresponding index.
Syntax:
<Series name>[<index number>]
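A minimal sketch of selecting single values, first with the default integer index and then with custom labels. The labels 'x', 'y', 'z' are assumptions.

```python
import pandas as pd

s = pd.Series([5, 15, 16, 4, 34])
third = s[2]  # the value stored at index 2

labelled = pd.Series([5, 15, 16], index=['x', 'y', 'z'])
middle = labelled['y']  # selection by index label

print(third, middle)
```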

Indexing:
Series.index attribute is used to get or set the index labels for the
given series.

Syntax:
<Series name>.index
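A minimal sketch of reading and then replacing the index labels through the index attribute. The new labels are illustrative.

```python
import pandas as pd

s = pd.Series([5, 15, 16])
print(s.index)            # the default index: 0, 1, 2

s.index = ['x', 'y', 'z']  # assign one new label per element
print(s)
```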
Slicing:
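Slicing extracts a part of the Series based on the given parameters.
Syntax:
<Series name>[<start>:<stop>:<step>]
Note: start, stop and step are optional.
Default values: start=0, stop=n (the number of elements; the element at
position stop itself is excluded, so the last element returned is at
position n-1), step=1
Note: Integer slices work on the default positional index (0 to n-1).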
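A minimal sketch of slicing with the values from the earlier example. The variable names are illustrative.

```python
import pandas as pd

s = pd.Series([5, 15, 16, 4, 34])

middle = s[1:4]       # positions 1, 2 and 3 (position 4 is excluded)
every_other = s[::2]  # start and stop omitted, step of 2
backwards = s[::-1]   # a negative step reverses the order

print(middle)
print(every_other)
print(backwards)
```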
Data Frames:
A DataFrame is a two-dimensional (2-D) data structure defined in pandas
which consists of rows and columns.
A DataFrame stores an ordered collection of columns that can store data of
different types.

Example:
S.No.  Name   Age  Marks
1      Ravi   25   99
2      Kunal  26   98
Characteristics of Data Frames:
➢ It has two indices (two axes)
○ Row index (axis=0) ->known as index
○ Column index (axis=1) ->known as column-name
➢ Value in the Data Frame will be identifiable by the
combination of row index and column index.
➢ Indices can be of any type
➢ Different columns can hold data of different types.
➢ Value is mutable
➢ Size is mutable
Creation of Data Frames:
Syntax:
<Data Frame Name> =
pandas.DataFrame(
<2D data structure>,
columns=<column sequence>,
index=<index sequence>, ...)
We can create a DataFrame in many ways, such as from:
(i) Two dimensional dictionaries
(ii) Two dimensional ndarrays(NumPy arrays)
(iii) Series type object
(iv) Another Dataframe object
(v) Text/CSV files
Creating Data frame from List:
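A minimal sketch of building a DataFrame from a list of rows, reusing the sample data from the example table above. Each inner list is one row, and the column names are passed separately.

```python
import pandas as pd

rows = [['Ravi', 25, 99],
        ['Kunal', 26, 98]]
df = pd.DataFrame(rows, columns=['Name', 'Age', 'Marks'])
print(df)
```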

Creating Data frame from array:
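A minimal sketch of building a DataFrame from a 2-D NumPy array. The row and column labels are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

arr = np.array([[25, 99],
                [26, 98]])
df = pd.DataFrame(arr, columns=['Age', 'Marks'], index=['Ravi', 'Kunal'])
print(df)
```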

Creating Data frame from Series:
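A minimal sketch of building a one-column DataFrame from a Series. The name given to the Series becomes the column name. The values are illustrative.

```python
import pandas as pd

ages = pd.Series([25, 26], name='Age')
df = pd.DataFrame(ages)  # the Series name 'Age' becomes the column name
print(df)
```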

Creating Data frame from another Data frame:
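A minimal sketch of building a DataFrame from an existing one. Passing a DataFrame to the constructor may share data with the original, so copy() is the safer choice when an independent copy is wanted. The sample data is illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['Ravi', 'Kunal'], 'Age': [25, 26]})

df2 = pd.DataFrame(df1)  # same contents as df1
df3 = df1.copy()         # independent copy

df3.loc[0, 'Age'] = 30   # changing the copy leaves df1 untouched
print(df1)
print(df3)
```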

(i) Two dimensional dictionaries
We can create a DataFrame from two-dimensional dictionaries in two ways:

➢ Creating a DataFrame from a list of dictionaries
➢ Creating a DataFrame from a dictionary of Series

Creating Dataframe from list of dictionaries:
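A minimal sketch with made-up records: each dictionary is one row, its keys become column names, and a key missing from a row gives NaN in that cell.

```python
import pandas as pd

records = [{'Name': 'Ravi', 'Age': 25},
           {'Name': 'Kunal', 'Marks': 98}]
df = pd.DataFrame(records)  # 'Marks' is NaN for Ravi, 'Age' is NaN for Kunal
print(df)
```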

Creating Data frame from dictionary of Series:
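A minimal sketch with illustrative data: each dictionary key becomes a column name, and the index labels of the Series become the row labels.

```python
import pandas as pd

data = {
    'Age': pd.Series([25, 26], index=['Ravi', 'Kunal']),
    'Marks': pd.Series([99, 98], index=['Ravi', 'Kunal']),
}
df = pd.DataFrame(data)
print(df)
```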

(v) Text/CSV files:
We can create a DataFrame from a text/CSV file by using the read_csv()
function.
Syntax:
<data frame name>
=pandas.read_csv(filepath_or_buffer, sep=',',
delimiter=None, header='infer', names=None,
index_col=None, usecols=None, …)
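A minimal sketch of read_csv(). To keep the example self-contained, the CSV text is held in an in-memory buffer (an assumption made here instead of a file on disk); a file path can be passed in the same way.

```python
import io
import pandas as pd

csv_text = "Name,Age,Marks\nRavi,25,99\nKunal,26,98\n"

# Read every column; the first line is used as the header.
df = pd.read_csv(io.StringIO(csv_text))

# Read only two columns and use 'Name' as the row index.
by_name = pd.read_csv(io.StringIO(csv_text),
                      usecols=['Name', 'Marks'], index_col='Name')

print(df)
print(by_name)
```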
Accessing values in dataframe:
Accessing a particular value:
<Data frame name>[<column name>][<index>]

Accessing a group of values:
<Data frame name>.loc[<index>, <column name>]
Here <index> and <column name> can each be a single label, a list of
labels or a slice of labels.
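A minimal sketch of both access styles on the sample table used earlier. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Ravi', 'Kunal'],
                   'Age': [25, 26],
                   'Marks': [99, 98]})

one_value = df['Marks'][0]      # column first, then row index
same_value = df.loc[0, 'Marks']  # row label and column label together
group = df.loc[[0, 1], ['Name', 'Marks']]  # several rows and columns

print(one_value, same_value)
print(group)
```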
NaN value in Python:
NaN, standing for "not a number", is a special floating-point value used to
represent any value that is undefined or unrepresentable. For example, 0/0
is undefined as a real number and is, therefore, represented by NaN.
In pandas, NaN is also used to mark missing data.
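A minimal sketch of how NaN behaves in pandas, using made-up values. NaN never compares equal to anything, including itself, so isna() is used to detect it.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0])

missing = s.isna()             # True where the value is NaN
not_equal = np.nan != np.nan   # NaN is not equal even to itself
total = s.sum()                # NaN values are skipped by default

print(missing)
print(not_equal, total)
```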
Iteration on Dataframes:

We can iterate over a pandas DataFrame in two ways:

● Iterating over rows
● Iterating over columns
Iterating over rows :

To iterate over a DataFrame, we can use the following functions:
● iterrows() - iterates over the rows as (index, Series) pairs
● iteritems() - iterates over the columns, not the rows, as (column name,
Series) pairs; in pandas 2.0 and later it is available only as items()
● itertuples() - iterates over the rows as namedtuples
iterrows():
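A minimal sketch of iterrows() on illustrative data. Each step yields the row's index label and the row itself as a Series.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Ravi', 'Kunal'], 'Marks': [99, 98]})

seen = []
for idx, row in df.iterrows():
    print(idx, row['Name'], row['Marks'])
    seen.append((idx, row['Name']))
```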

iteritems():
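A minimal sketch of column-wise iteration on illustrative data. It calls items(), which yields the same (column name, Series) pairs as iteritems(); iteritems() itself was removed in pandas 2.0, so items() is the portable spelling.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Ravi', 'Kunal'], 'Marks': [99, 98]})

column_names = []
for col_name, col_values in df.items():
    print(col_name, col_values.tolist())
    column_names.append(col_name)
```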

itertuples():
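A minimal sketch of itertuples() on illustrative data. Each row is a namedtuple whose fields are Index followed by the column names.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Ravi', 'Kunal'], 'Marks': [99, 98]})

names = []
for row in df.itertuples():
    print(row.Index, row.Name, row.Marks)
    names.append(row.Name)
```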

Iterating over columns: To iterate over columns, we create a list of the
DataFrame's column names and then iterate through that list to pull out
each column.
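The approach described above can be sketched as follows, with illustrative data:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Ravi', 'Kunal'], 'Marks': [99, 98]})

columns = list(df.columns)  # list of column names
collected = {}
for col in columns:
    collected[col] = df[col].tolist()  # pull out one column at a time
    print(col, collected[col])
```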
Operations on rows and columns:

● Add

● Select

● Delete

● Rename
Column selection:
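A minimal sketch of column selection on illustrative data. A single column name gives a Series, while a list of names gives a DataFrame.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Ravi', 'Kunal'],
                   'Age': [25, 26],
                   'Marks': [99, 98]})

names = df['Name']             # one column -> Series
subset = df[['Name', 'Marks']]  # list of columns -> DataFrame

print(names)
print(subset)
```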

Column addition:
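A minimal sketch of adding columns by assignment. The column names 'Grade' and 'Bonus' and their values are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Ravi', 'Kunal'], 'Marks': [99, 98]})

df['Grade'] = ['A', 'A']         # new column from a list
df['Bonus'] = df['Marks'] + 1    # new column computed from an existing one

print(df)
```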

Column Deletion:
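A minimal sketch of two ways to delete a column, on illustrative data: drop() returns a new DataFrame, while del removes the column in place.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Ravi', 'Kunal'],
                   'Age': [25, 26],
                   'Marks': [99, 98]})

without_age = df.drop(columns=['Age'])  # df itself is unchanged
del df['Marks']                         # df is modified in place

print(without_age)
print(df)
```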

Column Rename:
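A minimal sketch of renaming a column with rename(), which returns a new DataFrame. The new name 'Score' is an assumption.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Ravi', 'Kunal'], 'Marks': [99, 98]})

# Map each old column name to its new name.
renamed = df.rename(columns={'Marks': 'Score'})
print(renamed)
```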

Row selection:
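A minimal sketch of row selection: loc selects by row label and iloc by integer position. The row labels 'r1' and 'r2' are assumptions.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Ravi', 'Kunal'], 'Marks': [99, 98]},
                  index=['r1', 'r2'])

by_label = df.loc['r1']         # one row, returned as a Series
by_position = df.iloc[1]        # second row by position
label_slice = df.loc['r1':'r2']  # label slices include both ends

print(by_label)
print(by_position)
print(label_slice)
```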

Row Addition:
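A minimal sketch of two ways to add rows, with made-up records: assigning to a new label through loc, and concatenating another DataFrame. DataFrame.append() is not used here because it was removed in pandas 2.0.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Ravi', 'Kunal'], 'Marks': [99, 98]})

df.loc[2] = ['Asha', 97]  # assigning to a new row label adds a row

extra = pd.DataFrame({'Name': ['Meena'], 'Marks': [95]})
df = pd.concat([df, extra], ignore_index=True)  # append rows from another frame

print(df)
```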

Row Deletion:
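A minimal sketch of deleting a row by its index label with drop(), which returns a new DataFrame and leaves the original unchanged. The data is illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Ravi', 'Kunal'], 'Marks': [99, 98]})

remaining = df.drop(index=[0])  # remove the row labelled 0
print(remaining)
```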

Row Rename:
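A minimal sketch of renaming row labels with rename(), which returns a new DataFrame. The new labels are assumptions.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Ravi', 'Kunal'], 'Marks': [99, 98]})

# Map each old row label to its new label.
renamed = df.rename(index={0: 'first', 1: 'second'})
print(renamed)
```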

Head and Tail functions in Data Frames:

head(n):
Returns the first n rows.
tail(n):
Returns last n rows.
Default value for n is 5
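A minimal sketch of head() and tail() on a DataFrame with ten made-up rows:

```python
import pandas as pd

df = pd.DataFrame({'n': range(1, 11)})  # ten rows: 1, 2, ..., 10

top = df.head()      # first 5 rows by default
bottom = df.tail(2)  # last 2 rows

print(top)
print(bottom)
```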
Indexing using Labels in Data Frames: We can make one of
the columns as row index label for the data frame by using the
function set_index().
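A minimal sketch of set_index() on illustrative data. It returns a new DataFrame in which the chosen column has become the row index, so rows can then be selected by that label.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Ravi', 'Kunal'], 'Marks': [99, 98]})

indexed = df.set_index('Name')  # 'Name' moves from a column to the index
print(indexed)
print(indexed.loc['Ravi', 'Marks'])
```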

Boolean indexing in Data Frames: Boolean indexing helps us
to select the data from the Data Frames using a boolean vector.
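A minimal sketch of boolean indexing on illustrative data. A comparison on a column produces a boolean vector, and only the rows where it is True are kept.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Ravi', 'Kunal'], 'Marks': [99, 98]})

mask = df['Marks'] > 98  # boolean vector, one entry per row
toppers = df[mask]       # keeps only the rows where mask is True

print(mask)
print(toppers)
```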
Joining, Merging and Concatenation on Data Frames:
Merge:
The pandas.merge() function is used for merging two data frames.
Its most commonly used arguments are:
● The two data frames to merge
● how - the type of merge: 'left', 'right', 'inner' (the default) or
'outer'
● on - the name of the common column to merge on
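A minimal sketch of merge() with two made-up tables that share a 'RollNo' column. An inner merge keeps only the keys found in both tables, while a left merge keeps every row of the first table.

```python
import pandas as pd

students = pd.DataFrame({'RollNo': [1, 2, 3],
                         'Name': ['Ravi', 'Kunal', 'Asha']})
marks = pd.DataFrame({'RollNo': [2, 3, 4],
                      'Marks': [98, 97, 90]})

inner = pd.merge(students, marks, on='RollNo', how='inner')
left = pd.merge(students, marks, on='RollNo', how='left')

print(inner)
print(left)  # Marks is NaN for RollNo 1, which has no match
```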
Join: The join() method combines data frames using their index.
Use <dataframe 1>.join(<dataframe 2>) to join two data frames.
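A minimal sketch of join() with two made-up data frames that share the same index labels:

```python
import pandas as pd

ages = pd.DataFrame({'Age': [25, 26]}, index=['Ravi', 'Kunal'])
marks = pd.DataFrame({'Marks': [99, 98]}, index=['Ravi', 'Kunal'])

joined = ages.join(marks)  # rows are matched on the index labels
print(joined)
```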
Concatenation: To concatenate data frames, use pandas.concat(<list of data
frames>).
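A minimal sketch of concat() stacking two made-up data frames one below the other. Setting ignore_index=True renumbers the rows from 0.

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['Ravi'], 'Marks': [99]})
df2 = pd.DataFrame({'Name': ['Kunal'], 'Marks': [98]})

combined = pd.concat([df1, df2], ignore_index=True)
print(combined)
```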
Importing/Exporting Data between CSV files and Data
Frames:
Import data from CSV file to Data Frame: We can import data from a CSV
file into a Data Frame by using the read_csv() function.

Export data from Data Frame to CSV File: We can export data from a Data
Frame to a CSV file by using the to_csv() function.
Syntax:
<data frame name>.to_csv(<File Path>,.....)
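A minimal sketch of a round trip: writing a DataFrame with to_csv() and reading it back with read_csv(). The temporary directory and the file name students.csv are assumptions made so that the example runs anywhere.

```python
import os
import tempfile
import pandas as pd

df = pd.DataFrame({'Name': ['Ravi', 'Kunal'], 'Marks': [99, 98]})

# Hypothetical file location inside a temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'students.csv')

df.to_csv(path, index=False)  # index=False leaves the row index out of the file
loaded = pd.read_csv(path)

print(loaded)
```

Without index=False, the row index is written as an extra first column of the file.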
Thank you
