Working With Pandas Notes

The document discusses pandas, an open-source Python library for data manipulation and analysis. It provides an overview of pandas data structures including Series, DataFrames, and Panels. Series are one-dimensional arrays that can be created from lists, NumPy arrays, scalars, and dictionaries. DataFrames are two-dimensional tabular structures that can contain heterogeneous data across columns. The document then covers basic functionality and operations for accessing and manipulating data in Series and DataFrames.

Uploaded by

AISHI SHARMA

CHAPTER – 5 – DATA HANDLING USING PANDAS

Introduction - Pandas is an open-source Python library that provides high-performance data manipulation
and analysis tools built on its powerful data structures. The name Pandas is derived from the term Panel Data.
In 2008, developer Wes McKinney started developing pandas when he needed a high-performance, flexible
tool for data analysis.
Key Features of Pandas:
• The DataFrame object helps a lot in keeping track of our data.
• With a pandas DataFrame, we can have different data types (float, int, string, datetime, etc.) all in one
place.
• Pandas has built-in functionality for easy grouping and easy joins of data.
• Good IO capabilities; easily pull data from a MySQL database directly into a DataFrame.
• Tools for loading data into in-memory data objects from different file formats.
• Data alignment and integrated handling of missing data.
• Reshaping and pivoting of data sets.
• Label-based slicing, indexing and subsetting of large data sets.
Introduction to Pandas Data Structures:
Series < DataFrame < Panel
The best way to think of these data structures is that the higher dimensional data structure is a container of
its lower dimensional data structure. For example, DataFrame is a container of Series, Panel is a container
of DataFrame.

Data Structure   Dimensions   Description
Series           1            1D labelled homogeneous array, size immutable.
DataFrame        2            2D labelled, size-mutable tabular structure with potentially heterogeneously typed columns.
Panel            3            3D labelled, size-mutable array.

Mutability - All Pandas data structures are value mutable (their values can be changed).
Note − DataFrame is the most widely used and one of the most important data structures. Panel is used much less and was removed from pandas in version 0.25.
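A minimal sketch of the two ideas above, using made-up data (the column names and values are assumptions for illustration only): a DataFrame column is itself a Series, and the values of a Series can be changed in place.

```python
import pandas as pd

# Hypothetical data, chosen only for illustration.
df = pd.DataFrame({"marks": [70, 85, 90], "name": ["A", "B", "C"]})

# Each column of a DataFrame is itself a Series.
col = df["marks"]
print(type(col).__name__)

# Values are mutable: assign a new value at an existing label.
s = pd.Series([10, 23, 56])
s[1] = 99
print(s.tolist())
```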
Series - Series is a one-dimensional array like structure with homogeneous data. For example, the following
series is a collection of integers 10, 23, 56 …
10 23 56 17 52 61 73 90 26 72

Key Points:
• Homogeneous data
• Size immutable
• Values of data mutable
A pandas Series can be created using the following constructor –
pandas.Series(data, index, dtype, copy)

Parameter & Description

data
data takes various forms like ndarray, list, constants

index
Index values must be hashable and the same length as data; they need not be unique. Default np.arange(n) if no index is
passed.

dtype
dtype is for data type. If None, data type will be inferred

copy
Copy data. Default False
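
The parameters above can be sketched in a single call. This is only an illustration with assumed values; the labels "x", "y", "z" are not from the notes.

```python
import pandas as pd

# data: a list; index: custom labels; dtype: forced to float64.
# copy is left at its default.
s = pd.Series(data=[1, 2, 3], index=["x", "y", "z"], dtype="float64")

print(s.dtype)
print(list(s.index))
```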

Creating Pandas Series – A series can be created using various inputs like −
Array, Dict, Scalar value or constant, Mathematical Operations etc.
a) Create an Empty Series: An empty series can be created using Series() function of pandas library.
Example - #import the pandas library as pd
import pandas as pd
s = pd.Series(dtype="float64")  # explicit dtype; newer pandas defaults an empty Series to object
print(s)
Output - Series([], dtype: float64)
b) Creating a series from Lists: In order to create a series from list, we have to first create a list after that
we can create a series from list.
Example 1 - # import the pandas library as pd
import pandas as pd
simple_list = ['g', 'e', 'e', 'k', 's']
s = pd.Series(simple_list)
print(s)
Output -

c) Create a Series from ndarray - If data is an ndarray, then the index passed must be of the same length. If
no index is passed, then by default the index will be range(n), where n is the array length, i.e., [0, 1, 2, ...,
len(array)-1].
Example 1 - #import the NUMPY & PANDAS as np & pd
import pandas as pd
import numpy as np
data = np.array(['a','b','c','d'])
s = pd.Series(data)
print(s)
Output –

We did not pass any index, so by default, it assigned the indexes ranging from 0 to len(data)-
1, i.e., 0 to 3.
Example 2 - #import the NUMPY & PANDAS as np & pd
import pandas as pd
import numpy as np
data = np.array(['a','b','c','d'])
s = pd.Series(data,index=[100,101,102,103])
print(s)
Output -
We passed the index values here. Now we can see the customized indexed values in the
output.
d) Create a Series from dictionaries - A dict can be passed as input and, if no index is specified, the
dictionary keys are taken in insertion order to construct the index (older pandas versions sorted the keys). If an index is passed, the values in data
corresponding to the labels in the index will be pulled out.
Example 1 - #import the NUMPY & PANDAS as np & pd
import pandas as pd
import numpy as np
data = {'a' : 0., 'b' : 1., 'c' : 2.}
s = pd.Series(data)
print(s)
Output -

Observe − Dictionary keys are used to construct index.

Example 2 - #import the NUMPY & PANDAS as np & pd
import pandas as pd
import numpy as np
data = {'a' : 0., 'b' : 1., 'c' : 2.}
s = pd.Series(data,index=['b','c','d','a'])
print(s)
Output -

Observe − Index order is persisted and the missing element is filled with NaN (Not a Number).
e) Create a Series from Scalar (Single Item) - If data is a scalar value, an index must be provided. The
value will be repeated to match the length of index
Example - #import the pandas library and aliasing as pd
import pandas as pd
import numpy as np
s = pd.Series(5, index=[0, 1, 2, 3])
print(s)
Output -

f) Create a Series from Mathematical Operations – Different types of mathematical operators (+, -, *, /
etc.) can be applied on pandas series to generate another series.
Example - # import the pandas library as pd
import pandas as pd1
s = pd1.Series([1,2,3])
t = pd1.Series([1,2,4])
u = s + t #addition operation
print(u)
u = s * t # multiplication operation
print(u)
Output -
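
The example above adds two Series that share the same default index. As a hedged sketch of the data alignment feature listed earlier, the following uses two Series with different (assumed) labels; values are matched by label, and labels present in only one Series give NaN.

```python
import pandas as pd

a = pd.Series([1, 2, 3], index=["x", "y", "z"])
b = pd.Series([10, 20, 30], index=["y", "z", "w"])

# Addition matches values by label, not by position.
total = a + b
print(total)
```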
Series Basic Functionality:

Attribute or Method & Description

axes
Returns a list of the row axis labels

dtype
Returns the dtype of the object.

empty
Returns True if series is empty.

ndim
Returns the number of dimensions of the underlying data, by definition 1.

size
Returns the number of elements in the underlying data.

values
Returns the Series as ndarray.

head()
Returns the first n rows. The default value of n is 5.

tail()
Returns the last n rows. The default value of n is 5.

Example - import pandas as pd

import numpy as np
#Create a series with the values 10 to 50 in steps of 10 (the stop value 60 is excluded)
s = pd.Series(np.arange(10, 60, 10))
print(s)
# Pass the values as separate arguments; "+" cannot join a string with a list, bool, int or Series
print("The axes are:", s.axes)
print("Is the Object empty?", s.empty)
print("The dimensions of the object:", s.ndim)
print("The size of the object:", s.size)
print("The actual data series is:", s.values)
print("The first two rows of the data series:")
print(s.head(2))
print("The last two rows of the data series:")
print(s.tail(2))
Its output is as follows −

Accessing Data from Series with Position - Data in the series can be accessed similar to that in an
ndarray.
Example 1 – Retrieve the first element. As we already know, the counting starts from zero for the array,
which means the first element is stored at zeroth position and so on.
import pandas as pd
s = pd.Series([1,2,3,4,5],index = ['a','b','c','d','e'])
#retrieve the first element by position; .iloc avoids confusing position 0 with a label
print(s.iloc[0])

Output –

Example 2 – Retrieve the first three elements in the Series. If a : is placed before an index (as in [:3]), all items
before that index are extracted; if it is placed after an index (as in [3:]), all items from that index onwards are
extracted. If two indexes are given with a : between them, the items between the two indexes are extracted
(not including the stop index).
import pandas as pd
s = pd.Series([1,2,3,4,5],index = ['a','b','c','d','e'])
#retrieve the first three elements
print(s[:3])
Output –

Example 3 – Retrieve the last three elements.

import pandas as pd
s = pd.Series([1,2,3,4,5],index = ['a','b','c','d','e'])
#retrieve the last three elements
print(s[-3:])
Output –

Retrieve Data Using Label (Index) – A Series is like a fixed-size dict in that you can get and set values by
index label.
Example 1 – Retrieve a single element using index label value.
import pandas as pd
s = pd.Series([1,2,3,4,5],index = ['a','b','c','d','e'])
#retrieve a single element
print(s['a'])
Output –

Example 2 – Retrieve multiple elements using a list of index label values.

import pandas as pd
s = pd.Series([1,2,3,4,5],index = ['a','b','c','d','e'])
#retrieve multiple elements
print(s[['a','c','d']])
Output –
Example 3 – If a label is not contained, an exception is raised.
import pandas as pd
s = pd.Series([1,2,3,4,5],index = ['a','b','c','d','e'])
#try to retrieve a label that is not in the index
print(s['f'])
Output – KeyError: 'f'
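
A short sketch of two common ways to deal with a missing label, offered as an illustration rather than as part of the original notes: catching the KeyError, or using Series.get, which returns a default value instead of raising.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5], index=["a", "b", "c", "d", "e"])

# Option 1: catch the KeyError raised for a missing label.
try:
    value = s["f"]
except KeyError:
    value = None

# Option 2: Series.get returns a default instead of raising.
fallback = s.get("f", default=0)

print(value, fallback)
```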

DataFrame - A Data frame is a two-dimensional data structure, i.e., data is aligned in a tabular fashion in
rows and columns.
Structure –

You can think of it as an SQL table or a spreadsheet data representation.

Key Points:
• Heterogeneous data (columns may have different types)
• Size mutable
• Values of data mutable
pandas.DataFrame - A pandas DataFrame can be created using the following constructor –
pandas.DataFrame( data, index, columns, dtype, copy)
Parameter & Description

data
data takes various forms like ndarray, series, map, lists, dict, constants and also another
DataFrame.

index
Row labels to use for the resulting frame. Optional; defaults to np.arange(n) if
no index is passed.

columns
Column labels to use for the resulting frame. Optional; defaults to np.arange(n) if no column labels are
passed.

dtype
Data type to force on the columns. If None, the type is inferred.

copy
Controls whether data is copied from the input.
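
A minimal sketch of the constructor with data, index, columns and dtype given together. The array contents and the labels "r1".."r3", "x", "y" are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

arr = np.array([[1, 2], [3, 4], [5, 6]])

# data: a 2D ndarray; index: row labels; columns: column labels;
# dtype: force every column to float64.
df = pd.DataFrame(data=arr, index=["r1", "r2", "r3"],
                  columns=["x", "y"], dtype="float64")

print(df.shape)
print(df.dtypes)
```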

Creating DataFrames – A pandas DataFrame can be created using various inputs like –
Lists, dict, Series, Numpy ndarrays, Another DataFrame
a) Create an Empty DataFrame - The most basic DataFrame that can be created is an empty DataFrame.
#import the pandas library and aliasing as pd
import pandas as pd
df = pd.DataFrame()
print(df)
Output –

b) Create a DataFrame from Lists - The DataFrame can be created using a single list or a list of lists.
Example 1 –
import pandas as pd
data = [1,2,3,4,5]
df = pd.DataFrame(data)
print(df)
Output –
Example 2 –
import pandas as pd
data = [['Alex',10],['Bob',12],['Clarke',13]]
df = pd.DataFrame(data,columns=['Name','Age'])
print(df)
Output -

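Example 3 –
import pandas as pd
data = [['Alex',10],['Bob',12],['Clarke',13]]
df = pd.DataFrame(data,columns=['Name','Age'])
# dtype=float in the constructor applies to every column, and recent pandas
# versions reject it for the string column Name, so convert only Age
df = df.astype({'Age': float})
print(df)
Output -

Observe, the type of the Age column is changed to floating point.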
c) Create a DataFrame from Dict of ndarrays / Lists - All the ndarrays must be of the same length. If an index
is passed, then the length of the index should be equal to the length of the arrays.
If no index is passed, then by default the index will be range(n), where n is the array length.
Example 1 –
import pandas as pd
data = {'Name':['Tom', 'Jack', 'Steve',
'Ricky'],'Age':[28,34,29,42]}
df = pd.DataFrame(data)
print(df)
Output –

Observe the values 0,1,2,3. They are the default index assigned to each using the function range(n).
Example 2 – Let us now create an indexed DataFrame using arrays.
import pandas as pd
data = {'Name':['Tom', 'Jack', 'Steve',
'Ricky'],'Age':[28,34,29,42]}
df = pd.DataFrame(data, index=['rank1','rank2','rank3','rank4'])
print(df)
Output –

Observe, the index parameter assigns an index to each row.

d) Create a DataFrame from List of Dicts – List of Dictionaries can be passed as input data to create a
DataFrame. The dictionary keys are by default taken as column names.
Example 1 –
import pandas as pd
data = [{'a': 1, 'b': 2},{'a': 5, 'b': 10, 'c': 20}]
df = pd.DataFrame(data)
print(df)
Output –

Example 2 – The following example shows how to create a DataFrame by passing a list of dictionaries
and the row indices.
import pandas as pd
data = [{'a': 1, 'b': 2},{'a': 5, 'b': 10, 'c': 20}]
df = pd.DataFrame(data, index=['first', 'second'])
print(df)
Output –

Example 3 – The following example shows how to create a DataFrame with a list of dictionaries, row
indices, and column indices.
import pandas as pd
data = [{'a': 1, 'b': 2},{'a': 5, 'b': 10, 'c': 20}]

#With two column indices, values same as dictionary keys

df1 = pd.DataFrame(data, index=['first', 'second'],
columns=['a', 'b'])

#With two column indices with one index with other name
df2 = pd.DataFrame(data, index=['first', 'second'],
columns=['a', 'b1'])
print(df1)
print(df2)
Output –

Observe, df2 is created with a column label (b1) that is not a dictionary key, so that column is filled with
NaN. df1, by contrast, is created with column labels that match the dictionary keys, so no NaN values appear.
e) Create a DataFrame from Dictionary of Series – Dictionary of Series can be passed to form a
DataFrame. The resultant index is the union of all the series indexes passed.
Example –
import pandas as pd

d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),

'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}

df = pd.DataFrame(d)
print(df)
Output –

f) Create DataFrame from CSV files –

What is CSV (Comma Separated Value) file?
A CSV file is nothing more than a simple text file. It is one of the most common and simplest
ways to store tabular data. The format arranges a table in a
specific structure divided into rows and columns, and these rows and columns contain your data.
A new line terminates each row and starts the next one. Similarly, a comma, also known as the delimiter,
separates the columns within each row.
Take the following table as an example:

Now, the above table will look as follows if we represent it in CSV format:
Observe, a comma separates all the values in columns within each row. However, you can use other
symbols such as a semicolon (;) as a separator as well.
The two workhorse functions for reading text files (or flat files) are read_csv() and
read_table(). They both use the same parsing code to intelligently convert tabular data into a
DataFrame object –
read_csv – This function reads a CSV file and stores its contents in a DataFrame object. Below
are examples demonstrating the different parameters of the read_csv function.
Here is temp.csv file data we will use in all examples further –
S.No,Name,Age,City,Salary
1,Tom,28,Toronto,20000
2,Lee,32,HongKong,3000
3,Steven,43,Bay Area,8300
4,Ram,38,Hyderabad,3900
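
To try the examples without creating a file on disk, one option (an assumption of this sketch, not part of the original notes) is to wrap the same text in io.StringIO, which read_csv accepts in place of a file name.

```python
import io
import pandas as pd

# The same content as temp.csv, held in a string.
csv_text = """S.No,Name,Age,City,Salary
1,Tom,28,Toronto,20000
2,Lee,32,HongKong,3000
3,Steven,43,Bay Area,8300
4,Ram,38,Hyderabad,3900
"""

# io.StringIO makes the string behave like an open text file.
df = pd.read_csv(io.StringIO(csv_text))
print(df)
```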
Example – Reading temp.csv file using Pandas
import pandas as pd
df=pd.read_csv("temp.csv")
print(df)
Output -

• sep - If the separator between the fields of your data is not a comma, use the sep argument.
Example – Consider the data is separated in the above given temp.csv file by dollar ($) symbol
instead of comma (,) as below:
S.No$Name$Age$City$Salary
1$Tom$28$Toronto$20000
2$Lee$32$HongKong$3000
3$Steven$43$Bay Area$8300
4$Ram$38$Hyderabad$3900
Program -
import pandas as pd
data=pd.read_csv("temp.csv", sep='$')
print(data)
Output –

• delimiter - The delimiter argument of the pandas read_csv function is the same as sep.

• header - Specifies which line in your data is to be considered the header (the heading for each
column). The value is a zero-based line number.
Program -
import pandas as pd
data=pd.read_csv("temp.csv", header=2)  # must be an integer, not the string '2'
print(data)
Output –

Note - If your csv file does not have header, then you need to set header = None while
reading it.
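A small sketch of the note above, using a made-up two-line data set with no header row (held in a string so that no file is needed). With header=None, pandas numbers the columns 0, 1, 2 instead of consuming the first row as headings.

```python
import io
import pandas as pd

# Hypothetical data without a header line.
no_header = "1,Tom,28\n2,Lee,32\n"

df = pd.read_csv(io.StringIO(no_header), header=None)
print(list(df.columns))
print(len(df))
```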
• names - Use the names argument if you want to specify the column names of the
DataFrame explicitly.
Program –
import pandas as pd
col_names = ['Serial No', 'Person Name', 'Age', 'Lives in', 'Earns']
# header=0 replaces the header row already present in temp.csv with col_names
data = pd.read_csv("temp.csv", names = col_names, header = 0)
print(data)
Output –
• index_col - Use this argument to specify the column to use as row labels. Either the column name or
its number can be given as the index column.
Program –
import pandas as pd
df=pd.read_csv("temp.csv",index_col=['S.No'])
print(df)
Output –

• usecols – Loads only the specified columns into the DataFrame.

Program –
import pandas as pd
df=pd.read_csv("temp.csv", usecols = [0, 3])
print(df)
Output –

• prefix - When a data set doesn’t have any header and you read it with
header = None, pandas read_csv generates the DataFrame column names automatically as
integer values 0, 1, 2, …
To prefix each column’s name with a string, say “COLUMN”, so that the DataFrame
column names become COLUMN0, COLUMN1, COLUMN2, etc., the prefix
argument is used (available only in older pandas versions; it was deprecated in pandas 1.4).
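Recent pandas releases no longer accept prefix in read_csv (it appears to have been removed in pandas 2.0). A hedged sketch of one way to get the same column names is to read with header=None and then call DataFrame.add_prefix; the two-line data set below is made up for illustration.

```python
import io
import pandas as pd

# Hypothetical data without a header line.
no_header = "1,Tom,28\n2,Lee,32\n"

df = pd.read_csv(io.StringIO(no_header), header=None)

# add_prefix puts the given string in front of every column label.
df = df.add_prefix("COLUMN")
print(list(df.columns))
```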
• dtype - Use dtype to set the data type of the data or of individual DataFrame columns.
Program –
import pandas as pd
import numpy as np
df = pd.read_csv("temp.csv", dtype={'Salary': np.float64})
print(df.dtypes)
Output –

• true_values, false_values – Suppose your dataset contains the strings Yes and No, which you
want to interpret as True and False. Pass them as true_values = ['Yes'] and false_values = ['No'].
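A minimal sketch of these two arguments, assuming a made-up data set with a Yes/No column named Member (not part of temp.csv).

```python
import io
import pandas as pd

# Hypothetical data with a Yes/No column.
text = "Name,Member\nTom,Yes\nLee,No\n"

df = pd.read_csv(io.StringIO(text),
                 true_values=["Yes"], false_values=["No"])

# The Member column is read as Booleans rather than strings.
print(df["Member"].tolist())
```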
• skiprows - Skips the specified number of rows from the start of the file.
import pandas as pd
df=pd.read_csv("temp.csv", skiprows=2)
print(df)
Output -

• skipfooter - Indicates the number of rows to skip from the bottom of the file.
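
A short sketch of skipfooter, using made-up data whose last line is a summary rather than a record. skipfooter is supported by the Python parsing engine, so engine="python" is passed explicitly here.

```python
import io
import pandas as pd

# Hypothetical data with a trailing summary line.
text = "Name,Age\nTom,28\nLee,32\nTotal rows: 2\n"

# Skip the last line of the file.
df = pd.read_csv(io.StringIO(text), skipfooter=1, engine="python")
print(df)
```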

• nrows - If you want to read a limited number of rows instead of all the rows in a dataset, use
nrows.
Program –
import pandas as pd
df=pd.read_csv("temp.csv", nrows=2)
print(df)

• skip_blank_lines - If the skip_blank_lines option is set to False, then wherever blank lines are
present, NaN values will be inserted into the DataFrame. If set to True (the default), the entire blank
line will be skipped.
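A hedged sketch comparing the two settings on made-up data that contains one blank line between two records.

```python
import io
import pandas as pd

# Hypothetical data with a blank line after the first record.
text = "Name,Age\nTom,28\n\nLee,32\n"

kept = pd.read_csv(io.StringIO(text), skip_blank_lines=False)
skipped = pd.read_csv(io.StringIO(text))  # skip_blank_lines defaults to True

print(len(kept), len(skipped))
```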
• parse_dates - We can use the parse_dates argument to parse columns as datetime. You can
either use parse_dates = True (which tries to parse the index) or parse_dates = ['column name'].
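A minimal sketch of parse_dates with a list of column names. The Joined column and its dates are assumptions made for illustration; temp.csv has no date column.

```python
import io
import pandas as pd

# Hypothetical data with a date column.
text = "Name,Joined\nTom,2020-01-15\nLee,2021-06-30\n"

df = pd.read_csv(io.StringIO(text), parse_dates=["Joined"])

# The column now holds datetimes, so the .dt accessor is available.
print(df["Joined"].dt.year.tolist())
```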
• iterator - Setting iterator to True returns a reader object instead of a DataFrame; its get_chunk(n) method
reads the next n rows, so you can define how much data to read in each iteration.
Program –
import pandas as pd
reader=pd.read_csv("temp.csv", iterator = True)
print("first", reader.get_chunk(2))
print("second", reader.get_chunk(2))
Output –

g) Create DataFrame from Text files – The pandas built-in function pandas.read_table() is
used to read text files and store their contents in a DataFrame object.
Difference between CSV file and Text file:
TEXT FILE DATA – Separated by Tab (\t); use read_table() to read the data
S.No    Name    Age    City    Salary
1    Tom    28    Toronto    20000
2    Lee    32    HongKong    3000
3    Steven    43    Bay Area    8300
4    Ram    38    Hyderabad    3900

CSV FILE DATA – Separated by Comma (,); use read_csv() to read the data
S.No,Name,Age,City,Salary
1,Tom,28,Toronto,20000
2,Lee,32,HongKong,3000
3,Steven,43,Bay Area,8300
4,Ram,38,Hyderabad,3900

Important Note – read_table() function supports all parameters which are supported
by read_csv() function.
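A brief sketch of read_table on tab-separated text, held in a string for convenience (an assumption of this sketch; a file name works the same way). The tab is read_table's default separator, so no sep argument is needed.

```python
import io
import pandas as pd

# Tab-separated version of the first rows of the sample data.
text = "S.No\tName\tAge\n1\tTom\t28\n2\tLee\t32\n"

df = pd.read_table(io.StringIO(text))
print(df)
```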
Operations on DataFrame columns –
a) Column Selection – To select a column from DataFrame, we can simply pass column label into the
DataFrame object as given below:
Example –
import pandas as pd
d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
print(df['one'])

Output –
b) Column Addition – To add a column to an existing DataFrame, we can simply pass the column label and
its elements to the DataFrame object as given below:
Example -
import pandas as pd
d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
# Adding a new column to an existing DataFrame object with
# column label by passing new series
print("Adding a new column by passing as Series:")
df['three']=pd.Series([10,20,30],index=['a','b','c'])
print(df)
print("Adding a new column using the existing columns in DataFrame:")
df['four']=df['one']+df['three']
print(df)

Output –

c) Column Deletion – Columns can be deleted using the del statement or the pop() method.
Example –
# Using the previous DataFrame, we will delete columns
# using the del statement and the pop() method
import pandas as pd
d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd']),
'three' : pd.Series([10,20,30], index=['a','b','c'])}

df = pd.DataFrame(d)
print("Our dataframe is:")
print(df)

# using the del statement
print("Deleting the first column using DEL:")
del df['one']
print(df)

# using the pop() method
print("Deleting another column using POP:")
df.pop('two')
print(df)

Output –

Operations on DataFrame rows –

a) Row Selection – Rows can be selected by passing a row label to the .loc indexer.
Example –
import pandas as pd
d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}

df = pd.DataFrame(d)
print(df.loc['b'])

Output –

Selection by integer location – Rows can be selected by passing an integer location to the .iloc indexer.
Example –
import pandas as pd
d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
print(df.iloc[2])

Output –

Selection by slicing – Multiple rows can be selected using the ':' operator.

Example –
import pandas as pd
d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
print(df[2:4])

Output –

b) Row Addition – New rows can be added to an existing DataFrame using the pd.concat function, which
appends the rows at the end. (The older DataFrame.append method was removed in pandas 2.0.)
Example –
import pandas as pd
df = pd.DataFrame([[1, 2], [3, 4]], columns = ['a','b'])
df2 = pd.DataFrame([[5, 6], [7, 8]], columns = ['a','b'])
# concatenate the two frames row-wise; the original row labels are kept
df = pd.concat([df, df2])
print(df)

Output –

c) Row Deletion – We can use an index label to delete or drop rows from a DataFrame. If a label is duplicated,
then multiple rows will be dropped.
Observe that, in the example above, the labels 0 and 1 are duplicated.
Example –
import pandas as pd
df = pd.DataFrame([[1, 2], [3, 4]], columns = ['a','b'])
df2 = pd.DataFrame([[5, 6], [7, 8]], columns = ['a','b'])
# pd.concat replaces DataFrame.append, which was removed in pandas 2.0
df = pd.concat([df, df2])

# Drop rows with label 0
df = df.drop(0)

print(df)

Output –

Observe, two rows were dropped because those two contain the same label 0.
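If only one row should be dropped, the duplicate labels can be avoided in the first place. The following sketch (an illustration, not part of the original notes) passes ignore_index=True to pd.concat so that the combined rows are relabelled 0 to 3.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], columns=["a", "b"])
df2 = pd.DataFrame([[5, 6], [7, 8]], columns=["a", "b"])

# ignore_index=True discards the old labels and numbers the rows afresh.
combined = pd.concat([df, df2], ignore_index=True)

# Now label 0 refers to a single row.
remaining = combined.drop(0)
print(list(remaining.index))
```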
Renaming DataFrame Rows/Columns index labels – The pandas rename() method is used to rename
index (row) labels or column labels.
Syntax - DataFrame.rename(mapper=None, index=None, columns=None,
axis=None, inplace=False)
Parameters:
mapper, index and columns: Dictionary (or function) in which each key refers to the old name and its value to the
new name. Use either mapper together with axis, or index and/or columns.
axis: int or string value, 0/'index' for rows and 1/'columns' for columns. Used with mapper. Default is 0.
inplace: Makes changes in the original DataFrame if True. Default is False.
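
A small sketch of the mapper and axis parameters, using a made-up two-row DataFrame built in memory. The mapper can be a dictionary or a function; here str.lower is applied to every column label.

```python
import pandas as pd

# Hypothetical data, loosely modelled on the student records used below.
df = pd.DataFrame({"AGE": [15, 16], "CLASS": [10, 10]},
                  index=[101024, 101025])

# mapper as a function, applied to the column labels.
lowered = df.rename(str.lower, axis="columns")

# mapper as a dictionary, applied to the row labels.
relabelled = df.rename({101024: "STUDENT1"}, axis="index")

print(list(lowered.columns))
print(list(relabelled.index))
```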

-----CSV_FILE_DATA USED FOR EXAMPLES BELOW-------

ADM_NO, STUDENT_NAME, AGE, CLASS, SECTION
101024, RAHUL SINGH, 15, 10, C
101025, SHITAL KUMARI, 16, 10, B
101026, UJJWAL SHARMA, 15, 10, A
-----------------------------------------------------------------------
DataFrame made using above CSV file is:

Example 1 – Changing multiple row labels/names

# importing pandas module
import pandas as pd
# making data frame from csv file
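# skipinitialspace=True strips the space that follows each comma in the sample file
data = pd.read_csv("DATA.CSV", index_col ="ADM_NO", skipinitialspace=True)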
# changing rows indexes/names/labels with rename()
data.rename(index = {101024: "STUDENT1", 101025:"STUDENT2"},
inplace = True)
print(data)
Example 2 - Changing multiple column labels/names
# importing pandas module
import pandas as pd
# making data frame from csv file
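# skipinitialspace=True strips the space that follows each comma in the sample file
data = pd.read_csv("DATA.CSV", index_col ="ADM_NO", skipinitialspace=True)
# changing columns indexes/names/labels with rename()
# ADM_NO is the index here, not a column, so only AGE is renamed by this call
data.rename(columns = {'AGE':'YEARS', 'ADM_NO':'ID'},
inplace = True)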
print(data)
Output –

g) Boolean Indexing – In Boolean indexing, we select subsets of data based on the actual values of
the data in the DataFrame and not on their row/column labels or integer locations. We use a Boolean
vector to filter the data. We can filter data in four ways –
• Accessing a DataFrame with a Boolean index - To access a DataFrame with a Boolean
index, we first create a DataFrame whose index contains the Boolean values
True or False.
For example –
# importing pandas as pd
import pandas as pd
# dictionary of lists (named data so that the built-in dict is not shadowed)
data = { 'name' : ["aparna", "pankaj", "sudhir", "Geeku"],
'degree' : ["MBA", "BCA", "M.Tech", "MBA"],
'score' : [90, 40, 80, 98]}
df = pd.DataFrame(data, index = [True, False, True, False])
print(df)
Output –
Now that we have created a DataFrame with a Boolean index, we can access its rows with
the help of that index.
Accessing a DataFrame with a Boolean index using .loc[ ]
To access a DataFrame with a Boolean index using .loc[ ], we simply pass a Boolean value
(True or False) to .loc[ ].
Example – print(df.loc[True])
Output -

Accessing a DataFrame with a Boolean index using .iloc[ ]

.iloc[ ] accepts only integer positions, so passing a Boolean value (True or False) to it
raises an error. With .iloc[ ], a row of the DataFrame can therefore be accessed only by
passing an integer position.
Example – print(df.iloc[2])
Output –

• Applying a Boolean mask to a DataFrame - We can apply a Boolean mask by giving a list of True and
False values of the same length as the DataFrame. When we apply a Boolean mask, only
the rows for which the mask is True are returned.
Example –
# importing pandas as pd
import pandas as pd
# dictionary of lists (named data so that the built-in dict is not shadowed)
data = { 'name' : ["aparna", "pankaj", "sudhir", "Geeku"],
'degree' : ["MBA", "BCA", "M.Tech", "MBA"],
'score' : [90, 40, 80, 98]}
df = pd.DataFrame(data)
print(df[[True, False, True, False]])
Output –

• Masking data based on column value - We can filter the data in a DataFrame based on a column value.
To do so, we apply a condition to a column using operators such as ==,
>, <, <=, >=. Applying such an operator to a column produces a Series of True and False values.
Example 1 – print(df[df['score']>90])
Output –

Example 2 –
masking_condition =(df['score']>80) & (df['degree']=='MBA')
print(df[masking_condition])
Output –

• Masking data based on index value - We can also filter the data in a DataFrame based on its index values.
To do so, we create a mask from the index values using operators such as
==, >, <, etc.
Example – print (df[df.index>1])
Output –
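
As a hedged sketch that goes slightly beyond the examples above, conditions can also be combined with | (or), negated with ~ (not), or built with Series.isin. The DataFrame is the same made-up one used in this section.

```python
import pandas as pd

data = {"name": ["aparna", "pankaj", "sudhir", "Geeku"],
        "degree": ["MBA", "BCA", "M.Tech", "MBA"],
        "score": [90, 40, 80, 98]}
df = pd.DataFrame(data)

# Rows where either condition holds; each condition needs its own parentheses.
either = df[(df["score"] > 90) | (df["degree"] == "BCA")]

# Rows whose degree is NOT in the given list.
not_mba = df[~df["degree"].isin(["MBA"])]

print(either["name"].tolist())
print(not_mba["name"].tolist())
```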
