Introduction To Pandas - Ipynb - Colaboratory

This document introduces the Pandas library in Python. Pandas allows users to store and manipulate data in data frames, which are similar to spreadsheets. Data frames allow columns to be labeled and indexed. Users can access data frames by selecting columns using their names or by choosing rows using indexes. Pandas also makes it easy to create, modify, and restructure data frames.

Uploaded by

Vincent Giang


8/5/2020 Introduction to Pandas.ipynb - Colaboratory

Overview
In this module we will look at a Python library called Pandas.

Pandas is a library built on top of NumPy.

Pandas offers a data structure called pandas.DataFrame, which is similar to a NumPy array but adds labeled rows and columns and extra functionality.

The added functionality includes operations that we often use in data science, such as omitting missing values and replacing values.
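As a rough illustration of the operations mentioned above, the following sketch omits and replaces values in a small dataframe. The readings dataframe and its column names are made up for this example; dropna, fillna and replace are the pandas methods assumed to be the ones meant here.

```python
import numpy as np
import pandas as pd

# Hypothetical readings; np.nan marks a missing value.
readings = pd.DataFrame({'sensor': ['a', 'b', 'c'],
                         'value': [1.0, np.nan, 3.0]})

without_missing = readings.dropna()        # omit rows that contain a missing value
filled = readings.fillna({'value': 0.0})   # replace missing values in 'value' with 0.0
renamed = readings.replace('a', 'alpha')   # replace every occurrence of 'a' with 'alpha'

print(without_missing)
print(filled)
print(renamed)
```

None of these calls changes readings itself; each returns a new dataframe.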

Pandas Basics
Similar to NumPy, we can import Pandas by calling import pandas as pd .

Because of as pd , we can call library functions as pd.foo() . If you omit as pd and import the library by calling import pandas , you will have to call functions as pandas.foo() .

import pandas as pd
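A quick check, offered only as a sketch, that the alias and the full name refer to the same library, so pd.DataFrame and pandas.DataFrame are interchangeable:

```python
import pandas
import pandas as pd

# Both names are bound to the same module object.
print(pd is pandas)
print(pd.DataFrame is pandas.DataFrame)
```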

Pandas DataFrame
pandas.DataFrame is a widely used tabular data structure, similar to a spreadsheet, which we can use to manage data within our Python code.

Unlike in plain NumPy arrays, we can give names to columns, which makes it easy to find and manipulate our data within a huge dataset.

data = {'state': ['OH', 'OH', 'OH', 'NV', 'NV'],
        'year': [2000, 2001, 2002, 2000, 2002],
        'pop': [1.5, 1.4, 3.6, 2.4, 2.0]}

populationData = pd.DataFrame(data, index=['A', 'B', 'C', 'D', 'E'])

print(populationData)

  state  year  pop
A    OH  2000  1.5
B    OH  2001  1.4
C    OH  2002  3.6
D    NV  2000  2.4
E    NV  2002  2.0


We can see that dataframes allow us to give names to columns, which is helpful for finding data across multiple columns.

Also note that we can have a custom index for the rows (other than the regular 0, 1, 2, ... index used in arrays). Here in the population dataset the custom indices are 'A', 'B', 'C', 'D', 'E'.

Accessing Columns
We can access columns by using column names. For example, we can access the state names in our population data by calling populationData['state'] .

populationData['state']

When we need to access multiple columns, we can use nested [] and pass a list of columns we
need to view.

populationData[['state', 'pop']]

Also, we can view the columns that are in the data frame by calling populationData.columns

populationData.columns
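The sketch below rebuilds the population dataframe so it can run on its own and shows what each kind of column access returns. The variable names single and several are only for illustration.

```python
import pandas as pd

data = {'state': ['OH', 'OH', 'OH', 'NV', 'NV'],
        'year': [2000, 2001, 2002, 2000, 2002],
        'pop': [1.5, 1.4, 3.6, 2.4, 2.0]}
populationData = pd.DataFrame(data, index=['A', 'B', 'C', 'D', 'E'])

single = populationData['state']             # one name -> a pandas Series
several = populationData[['state', 'pop']]   # list of names -> a smaller DataFrame

print(type(single).__name__)           # Series
print(type(several).__name__)          # DataFrame
print(list(populationData.columns))    # ['state', 'year', 'pop']
```

A single column name gives back a one-dimensional Series, while a list of names (even a list with one name) gives back a DataFrame.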

Activity

You have been given a sales dataframe which includes data on the effect of multiple factors on the sales price of a commodity.

Your task is to first determine which columns are in the dataset, and then extract 3 columns from the dataset: Date, price, and a factor of your choice.

salesDataframe = pd.read_csv('https://www.cs.odu.edu/~sampath/courses/f19/cs620/files/data/va

salesDataframe.index = [1000, 1001, 1002, 1003, 1004, 1005, 1006]

# Continue with your code

salesDataframe

Accessing Rows
There are multiple ways to retrieve rows from a dataframe.

Since pandas allows us to set a custom index, we can use either the custom index or the regular index (i.e., 0, 1, 2, ...) to access data.

To view the index used by a dataframe, we can call dataframe.index

populationData.index

When we haven't specified a custom index for our dataframe, its index is simply the regular index (0, 1, 2, ...).
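To make the difference concrete, this small sketch builds the same made-up column twice, once without and once with a custom index, and prints the index of each. The variable names are assumptions for the example.

```python
import pandas as pd

default_indexed = pd.DataFrame({'pop': [1.5, 1.4, 3.6]})
custom_indexed = pd.DataFrame({'pop': [1.5, 1.4, 3.6]}, index=['A', 'B', 'C'])

print(list(default_indexed.index))   # [0, 1, 2]
print(list(custom_indexed.index))    # ['A', 'B', 'C']
```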


head
Through head(n) we can get the first n rows of the dataframe.

populationData.head(3)
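A self-contained sketch of head on the population data follows. When n is left out, pandas uses a default of 5 rows.

```python
import pandas as pd

populationData = pd.DataFrame(
    {'state': ['OH', 'OH', 'OH', 'NV', 'NV'],
     'year': [2000, 2001, 2002, 2000, 2002],
     'pop': [1.5, 1.4, 3.6, 2.4, 2.0]},
    index=['A', 'B', 'C', 'D', 'E'])

first_three = populationData.head(3)   # rows A, B and C
default_head = populationData.head()   # n defaults to 5, so all five rows here

print(list(first_three.index))   # ['A', 'B', 'C']
print(len(default_head))         # 5
```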

loc operation
We can use the custom index we have on the dataframe (i.e., 'A', 'B', 'C', 'D', 'E' in our population dataframe) to retrieve rows. We can do that by using loc on our dataframe.

populationData.loc['A']

To select multiple rows, we can pass multiple indices similar to the way we accessed multiple
columns.

populationData.loc[['A', 'C']]
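The following sketch shows what loc returns for a single label, a list of labels, and a label slice. The slice case is an addition to the text above: with loc, a slice such as 'A':'C' includes the end label.

```python
import pandas as pd

populationData = pd.DataFrame(
    {'state': ['OH', 'OH', 'OH', 'NV', 'NV'],
     'year': [2000, 2001, 2002, 2000, 2002],
     'pop': [1.5, 1.4, 3.6, 2.4, 2.0]},
    index=['A', 'B', 'C', 'D', 'E'])

row_a = populationData.loc['A']             # single label -> Series keyed by column name
rows_ac = populationData.loc[['A', 'C']]    # list of labels -> DataFrame
rows_a_to_c = populationData.loc['A':'C']   # label slice, end label 'C' is included

print(row_a['state'])            # OH
print(list(rows_ac.index))       # ['A', 'C']
print(list(rows_a_to_c.index))   # ['A', 'B', 'C']
```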

iloc operation
We can also use the regular index (i.e., 0, 1, 2, ...) to retrieve data. For that we use the iloc operation, similar to the loc operation we used previously.

populationData.iloc[1]


populationData.iloc[[1, 3]]
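As a sketch of how positions map onto the custom labels, the example below applies iloc to the population data. The slice case is an addition: unlike loc, an iloc slice excludes its end position, as with ordinary Python lists.

```python
import pandas as pd

populationData = pd.DataFrame(
    {'state': ['OH', 'OH', 'OH', 'NV', 'NV'],
     'year': [2000, 2001, 2002, 2000, 2002],
     'pop': [1.5, 1.4, 3.6, 2.4, 2.0]},
    index=['A', 'B', 'C', 'D', 'E'])

second_row = populationData.iloc[1]          # position 1 is the row labeled 'B'
rows_1_and_3 = populationData.iloc[[1, 3]]   # positions 1 and 3 -> labels 'B' and 'D'
rows_1_to_2 = populationData.iloc[1:3]       # positions 1 and 2; position 3 is excluded

print(second_row.name)            # B
print(list(rows_1_and_3.index))   # ['B', 'D']
print(list(rows_1_to_2.index))    # ['B', 'C']
```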

Activity
Consider the sales dataset we have. Retrieve a set of rows of your choice using the head() , iloc and loc operations.

Accessing Rows and Columns

We can combine column access methods and row access methods to get the required set of data from the dataframe. For that, we first select the rows and then chain the column selection onto the result.

populationData.loc[["A", "C"]][["state", "pop"]]
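As an alternative to chaining, loc and iloc also accept a row selector and a column selector in a single call. The sketch below, which is not part of the original notebook, checks that all three forms return the same data here.

```python
import pandas as pd

populationData = pd.DataFrame(
    {'state': ['OH', 'OH', 'OH', 'NV', 'NV'],
     'year': [2000, 2001, 2002, 2000, 2002],
     'pop': [1.5, 1.4, 3.6, 2.4, 2.0]},
    index=['A', 'B', 'C', 'D', 'E'])

chained = populationData.loc[['A', 'C']][['state', 'pop']]   # rows first, then columns
direct = populationData.loc[['A', 'C'], ['state', 'pop']]    # rows and columns by label
by_position = populationData.iloc[[0, 2], [0, 2]]            # the same cells by position

print(chained.equals(direct))       # True
print(direct.equals(by_position))   # True
```

The single-call form is generally the safer habit when you later assign values to a selection, because chained selection may operate on a temporary copy.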

Creating Dataframes
To create dataframes we use the pd.DataFrame() function along with the data for the dataframe.

We can pass a dictionary to pd.DataFrame that maps each column name to the list of values for that column.

Let's create the temperature dataset from the NumPy exercise. Here we have 2 arrays for inside and outside temperature readings.

We will be using inside and outside as column names.

data = {'inside' : [166, 108, 229, 194, 266, 102, 235, 188, 183, 129],
'outside' : [251, 238, 236, 161, 108, 291, 121, 183, 137, 133]}

temperatureDataframe = pd.DataFrame(data)

temperatureDataframe
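The sketch below uses the first three temperature readings to show two equivalent ways of passing data to pd.DataFrame: a dictionary of columns, as above, and a list of per-row dictionaries, which is added here only for comparison. It also shows that columns of unequal length are rejected. The names by_column and by_row are illustrative.

```python
import pandas as pd

# Column-oriented: each key is a column name, each value is that column's data.
by_column = pd.DataFrame({'inside': [166, 108, 229],
                          'outside': [251, 238, 236]})

# Row-oriented: one dictionary per row.
by_row = pd.DataFrame([{'inside': 166, 'outside': 251},
                       {'inside': 108, 'outside': 238},
                       {'inside': 229, 'outside': 236}])

print(by_column.equals(by_row))   # True
print(by_column.shape)            # (3, 2)

# Lists of different lengths cannot form a rectangular table.
try:
    pd.DataFrame({'inside': [166, 108], 'outside': [251]})
except ValueError as error:
    print('could not build dataframe:', error)
```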

Activity
Your task is to create a dataframe that has both acceleration data and temperature data. You may use column names of your choice.

accx = [0.03463151, 0.6746004, 0.75813463, 0.14376458, 0.17252515,
        0.4135009, 0.80347004, 0.81023186, 0.66539218, 0.54754633]

accy = [0.48593401, 0.88983019, 0.87322111, 0.95533169, 0.35901729,
        0.86243141, 0.36083334, 0.18515889, 0.20486895, 0.18408961]

accz = [0.29648785, 0.38779023, 0.05209736, 0.75532094, 0.27063359,
        0.53516819, 0.79639674, 0.64252951, 0.18353906, 0.30367977]
