Tutorial 2

The document is a tutorial on using the NumPy and Pandas Python library modules. It provides examples of creating and manipulating NumPy ndarrays, including one-dimensional and two-dimensional arrays. It also demonstrates various NumPy functions for arithmetic operations, statistical analysis, linear algebra and more. Additionally, it shows how to create Pandas Series objects from lists, NumPy arrays and dictionaries and access/slice Series elements.

Uploaded by

POEASO

tutorial2

November 12, 2020

Module 2: Introduction to Numpy and Pandas

The following tutorial contains examples of using the numpy and pandas library modules. Read
the step-by-step instructions below carefully. To execute the code, click on the cell and press the
SHIFT-ENTER keys simultaneously.

2.1 Introduction to Numpy

Numpy, short for Numerical Python, is a Python library for numerical computation.
Its basic data structure is a multi-dimensional array object called ndarray.
Numpy provides a suite of functions that efficiently manipulate the elements of an ndarray.

2.1.1 Creating ndarray

An ndarray can be created from a list or tuple object.

[ ]: import numpy as np

oneDim = np.array([1.0,2,3,4,5]) # a 1-dimensional array (vector)

print(oneDim)
print("#Dimensions =", oneDim.ndim)
print("Dimension =", oneDim.shape)
print("Size =", oneDim.size)
print("Array type =", oneDim.dtype)

twoDim = np.array([[1,2],[3,4],[5,6],[7,8]]) # a two-dimensional array (matrix)

print(twoDim)
print("#Dimensions =", twoDim.ndim)
print("Dimension =", twoDim.shape)
print("Size =", twoDim.size)
print("Array type =", twoDim.dtype)

arrFromTuple = np.array([(1,'a',3.0),(2,'b',3.5)]) # create ndarray from a list of tuples (mixed types are stored as strings)

print(arrFromTuple)
print("#Dimensions =", arrFromTuple.ndim)
print("Dimension =", arrFromTuple.shape)
print("Size =", arrFromTuple.size)
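An ndarray stores all of its elements with a single dtype, so Numpy picks one type that can hold every input value. The sketch below illustrates this with three small arrays; the variable names (ints, mixed, strs) are illustrative and not part of the tutorial.

```python
import numpy as np

# Illustrative names; not taken from the tutorial.
ints = np.array([1, 2, 3])                        # all integers -> an integer dtype
mixed = np.array([1.0, 2, 3])                     # one float promotes every element to float64
strs = np.array([(1, 'a', 3.0), (2, 'b', 3.5)])   # numbers mixed with strings -> all strings

print(ints.dtype)             # a platform-dependent integer type
print(mixed.dtype)            # float64
print(strs.dtype.kind)        # 'U' denotes a Unicode string dtype
print(repr(strs[0, 0]))       # the integer 1 is now stored as a string
```

This is why arrFromTuple above prints every entry in quotes: the numeric values were converted to strings when the array was built.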

There are several built-in functions in numpy that can be used to create ndarrays.

[ ]: print(np.random.rand(5)) # random numbers from a uniform distribution over [0, 1)

print(np.random.randn(5)) # random numbers from a standard normal distribution

print(np.arange(-10,10,2)) # similar to range, but returns ndarray instead of list

print(np.arange(12).reshape(3,4)) # reshape to a matrix

print(np.linspace(0,1,10)) # split interval [0,1] into 10 equally separated values

print(np.logspace(-3,3,7)) # create ndarray with values from 10^-3 to 10^3

[ ]: print(np.zeros((2,3))) # a matrix of zeros

print(np.ones((3,2))) # a matrix of ones
print(np.eye(3)) # a 3 x 3 identity matrix
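The endpoint behavior of these functions is easy to confuse. As a small sketch (variable names are illustrative), the checks below confirm that np.arange excludes its stop value, while np.linspace and np.logspace include both endpoints by default.

```python
import numpy as np

steps = np.arange(-10, 10, 2)    # -10, -8, ..., 8; the stop value 10 is excluded
lin = np.linspace(0, 1, 10)      # 10 points, both 0 and 1 included
logs = np.logspace(-3, 3, 7)     # 10**-3, 10**-2, ..., 10**3

print(steps.size, steps[-1])     # number of elements and the last value
print(lin[0], lin[-1])           # first and last points of the interval
print(np.diff(lin))              # constant spacing of 1/9 between neighbors
print(logs)
```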

2.1.2 Element-wise Operations

You can apply standard operators such as addition and multiplication on each element of the
ndarray.

[ ]: x = np.array([1,2,3,4,5])

print(x + 1) # addition
print(x - 1) # subtraction
print(x * 2) # multiplication
print(x // 2) # integer division
print(x ** 2) # square
print(x % 2) # modulo
print(1 / x) # division

[ ]: x = np.array([2,4,6,8,10])
y = np.array([1,2,3,4,5])

print(x + y)
print(x - y)
print(x * y)
print(x / y)
print(x // y)
print(x ** y)
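For comparison, the same operators behave differently on plain Python lists. The following minimal sketch (names are illustrative) contrasts list repetition and concatenation with ndarray element-wise arithmetic.

```python
import numpy as np

values = [1, 2, 3]
arr = np.array(values)

print(values * 2)        # list repetition: [1, 2, 3, 1, 2, 3]
print(arr * 2)           # element-wise multiplication: [2 4 6]

print(values + values)   # list concatenation: [1, 2, 3, 1, 2, 3]
print(arr + arr)         # element-wise addition: [2 4 6]
```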

2.1.3 Indexing and Slicing

There are various ways to select certain elements with an ndarray.

[ ]: x = np.arange(-5,5)
print(x)

y = x[3:5] # y is a slice, i.e., a view of a subarray of x (no data is copied)

print(y)

y[:] = 1000 # modifying the value of y will change x
print(y)
print(x)

z = x[3:5].copy() # makes a copy of the subarray

print(z)
z[:] = 500 # modifying the value of z will not affect x
print(z)
print(x)
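Whether two arrays share the same underlying data can also be checked directly. The sketch below uses np.shares_memory and the base attribute to distinguish a slice (a view) from an explicit copy; the names view and copy are illustrative.

```python
import numpy as np

x = np.arange(-5, 5)
view = x[3:5]            # basic slicing returns a view onto x
copy = x[3:5].copy()     # an explicit copy owns its own data

print(np.shares_memory(x, view))   # True: writes through view reach x
print(np.shares_memory(x, copy))   # False: copy is independent of x
print(view.base is x)              # True: x is the array that owns the data

view[:] = 1000                     # modifies x[3] and x[4] as well
print(x)
print(copy)                        # still holds the original values
```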

[ ]: my2dlist = [[1,2,3,4],[5,6,7,8],[9,10,11,12]] # a 2-dim list

print(my2dlist)
print(my2dlist[2]) # access the third sublist
print(my2dlist[:][2]) # can't access third element of each sublist
# print(my2dlist[:,2]) # this would raise a TypeError (lists do not accept tuple indices)

my2darr = np.array(my2dlist)
print(my2darr)
print(my2darr[2][:]) # access the third row
print(my2darr[2,:]) # access the third row
print(my2darr[:][2]) # access the third row (similar to 2d list)
print(my2darr[:,2]) # access the third column
print(my2darr[:2,2:]) # access the first two rows & last two columns

ndarray also supports boolean indexing.

[ ]: my2darr = np.arange(1,13,1).reshape(3,4)
print(my2darr)

divBy3 = my2darr[my2darr % 3 == 0]
print(divBy3, type(divBy3))

divBy3LastRow = my2darr[2:, my2darr[2,:] % 3 == 0]

print(divBy3LastRow)
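A boolean index is itself an ndarray of True/False values with the same shape as the array being indexed. As a small sketch (names are illustrative), the mask can be inspected on its own and can also be used on the left-hand side of an assignment.

```python
import numpy as np

arr = np.arange(1, 13).reshape(3, 4)
mask = arr % 3 == 0          # boolean array with the same shape as arr
print(mask)

selected = arr[mask]         # 1-D array of the elements where mask is True
print(selected)              # [ 3  6  9 12]

arr[mask] = 0                # assign to the masked positions in place
print(arr)
```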

More indexing examples.

[ ]: my2darr = np.arange(1,13,1).reshape(4,3)
print(my2darr)

indices = [2,1,0,3] # selected row indices

print(my2darr[indices,:])

rowIndex = [0,0,1,2,3] # row index into my2darr

columnIndex = [0,2,0,1,2] # column index into my2darr
print(my2darr[rowIndex,columnIndex])
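When a row list and a column list are passed together, Numpy pairs them up position by position rather than taking every row/column combination. The sketch below shows that the result matches an explicit loop over the (row, column) pairs; the names rows, cols, picked, and looped are illustrative.

```python
import numpy as np

arr = np.arange(1, 13).reshape(4, 3)
rows = [0, 0, 1, 2, 3]
cols = [0, 2, 0, 1, 2]

picked = arr[rows, cols]     # elements at (0,0), (0,2), (1,0), (2,1), (3,2)
looped = np.array([arr[r, c] for r, c in zip(rows, cols)])

print(picked)                # [ 1  3  4  8 12]
print(looped)                # same values, computed one pair at a time
```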

2.1.4 Numpy Arithmetic and Statistical Functions

There are many built-in mathematical functions available for manipulating the elements of an ndarray.

[ ]: y = np.array([-1.4, 0.4, -3.2, 2.5, 3.4]) # a fixed example vector

print(y)

print(np.abs(y)) # convert to absolute values

print(np.sqrt(np.abs(y))) # apply square root to each absolute value
print(np.sign(y)) # get the sign of each element
print(np.exp(y)) # apply exponentiation
print(np.sort(y)) # sort array

[ ]: x = np.arange(-2,3)
y = np.random.randn(5)
print(x)
print(y)

print(np.add(x,y)) # element-wise addition x + y

print(np.subtract(x,y)) # element-wise subtraction x - y
print(np.multiply(x,y)) # element-wise multiplication x * y
print(np.divide(x,y)) # element-wise division x / y
print(np.maximum(x,y)) # element-wise maximum max(x,y)

[ ]: y = np.array([-3.2, -1.4, 0.4, 2.5, 3.4]) # a fixed example vector

print(y)

print("Min =", np.min(y)) # min

print("Max =", np.max(y)) # max
print("Average =", np.mean(y)) # mean/average
print("Std deviation =", np.std(y)) # standard deviation
print("Sum =", np.sum(y)) # sum
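Two details of these functions are worth a quick illustration. By default np.std computes the population standard deviation (dividing by n), and the reductions accept an axis argument for multi-dimensional arrays. The following is a minimal sketch with illustrative names.

```python
import numpy as np

y = np.array([-3.2, -1.4, 0.4, 2.5, 3.4])
pop_std = np.std(y)              # population standard deviation (divides by n)
sample_std = np.std(y, ddof=1)   # sample standard deviation (divides by n - 1)
print(pop_std, sample_std)

m = np.arange(1, 7).reshape(2, 3)   # [[1, 2, 3], [4, 5, 6]]
print(np.sum(m))                    # 21, sum over all elements
print(np.sum(m, axis=0))            # [5 7 9], one sum per column
print(np.sum(m, axis=1))            # [ 6 15], one sum per row
```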

2.1.5 Numpy Linear Algebra

Numpy provides many functions to support linear algebra operations.

[ ]: X = np.random.randn(2,3) # create a 2 x 3 random matrix

print(X)
print(X.T) # matrix transpose operation X^T

y = np.random.randn(3) # random vector

print(y)
print(X.dot(y)) # matrix-vector multiplication X * y
print(X.dot(X.T)) # matrix-matrix multiplication X * X^T
print(X.T.dot(X)) # matrix-matrix multiplication X^T * X

[ ]: X = np.random.randn(5,3)
print(X)

C = X.T.dot(X) # C = X^T * X is a square matrix

invC = np.linalg.inv(C) # inverse of a square matrix

print(invC)
detC = np.linalg.det(C) # determinant of a square matrix
print(detC)
S, U = np.linalg.eig(C) # eigenvalues S and eigenvectors U (as columns) of a square matrix
print(S)
print(U)
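The results of these routines can be sanity-checked numerically. The sketch below assumes a seeded generator (np.random.default_rng, not used in the tutorial) for reproducibility, and verifies that C times its inverse is the identity, that each column of U is an eigenvector of C, and that the determinant equals the product of the eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(0)         # seeded generator; an assumption for reproducibility
X = rng.standard_normal((5, 3))
C = X.T @ X                            # 3 x 3 symmetric matrix

invC = np.linalg.inv(C)
inverse_ok = np.allclose(C @ invC, np.eye(3))

S, U = np.linalg.eig(C)
# U * S scales column i of U by S[i], so this checks C u_i = s_i u_i for every i
eigen_ok = np.allclose(C @ U, U * S)

det_ok = np.isclose(np.linalg.det(C), np.prod(S))

print(inverse_ok, eigen_ok, det_ok)
```

Because C = X^T X is symmetric, np.linalg.eigh could be used in place of np.linalg.eig; it is designed for symmetric matrices and always returns real eigenvalues.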

2.2 Introduction to Pandas

Pandas provides two convenient data structures for storing and manipulating data: Series and
DataFrame. A Series is similar to a one-dimensional array, whereas a DataFrame is closer to
a matrix or a spreadsheet table.

2.2.1 Series

A Series object consists of a one-dimensional array of values, whose elements can be referenced
using an index array. A Series object can be created from a list, a numpy array, or a Python
dictionary. You can apply most of the numpy functions on the Series object.

[ ]: from pandas import Series

s = Series([3.1, 2.4, -1.7, 0.2, -2.9, 4.5]) # creating a series from a list
print(s)
print('Values=', s.values) # display values of the Series
print('Index=', s.index) # display indices of the Series

[ ]: import numpy as np

s2 = Series(np.random.randn(6)) # creating a series from a numpy ndarray

print(s2)
print('Values=', s2.values) # display values of the Series
print('Index=', s2.index) # display indices of the Series

[ ]: s3 = Series([1.2,2.5,-2.2,3.1,-0.8,-3.2],
index = ['Jan 1','Jan 2','Jan 3','Jan 4','Jan 5','Jan 6',])
print(s3)
print('Values=', s3.values) # display values of the Series
print('Index=', s3.index) # display indices of the Series

[ ]: capitals = {'MI': 'Lansing', 'CA': 'Sacramento', 'TX': 'Austin', 'MN': 'St Paul'}
s4 = Series(capitals) # creating a series from dictionary object
print(s4)
print('Values=', s4.values) # display values of the Series
print('Index=', s4.index) # display indices of the Series

[ ]: s3 = Series([1.2,2.5,-2.2,3.1,-0.8,-3.2],
index = ['Jan 1','Jan 2','Jan 3','Jan 4','Jan 5','Jan 6',])
print(s3)

# Accessing elements of a Series

print('\ns3.iloc[2]=', s3.iloc[2]) # display third element of the Series (by position)

print('s3[\'Jan 3\']=', s3['Jan 3']) # indexing element of a Series

print('\ns3[1:3]=') # display a slice of the Series

print(s3[1:3])
print('s3.iloc[1:3]=') # display a slice of the Series (by position)
print(s3.iloc[1:3])
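A Series can be addressed either by label or by position, and the two forms slice differently. As a minimal sketch on a shortened version of the Series above, .loc selects by index label and includes both endpoints of a label slice, while .iloc selects by integer position and excludes the stop position.

```python
from pandas import Series

s = Series([1.2, 2.5, -2.2], index=['Jan 1', 'Jan 2', 'Jan 3'])

print(s.loc['Jan 3'])            # label-based access: -2.2
print(s.iloc[2])                 # position-based access: -2.2

by_label = s.loc['Jan 1':'Jan 2']   # label slice: both endpoints included
by_position = s.iloc[0:1]           # positional slice: stop position excluded
print(len(by_label), len(by_position))
```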

[ ]: print('shape =', s3.shape) # get the dimension of the Series

print('size =', s3.size) # get the # of elements of the Series

[ ]: print(s3[s3 > 0]) # applying filter to select elements of the Series

[ ]: print(s3 + 4) # applying scalar operation on a numeric Series

print(s3 / 4)

[ ]: print(np.log(s3 + 4)) # applying numpy math functions to a numeric Series
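The shift by 4 in the cell above keeps every value positive, since the logarithm is undefined for the negative entries of s3. The sketch below, using a two-element Series with illustrative names, shows that a Numpy function applied to a Series returns a new Series that keeps the original index.

```python
import numpy as np
from pandas import Series

s = Series([1.2, -3.2], index=['Jan 1', 'Jan 6'])

shifted = s + 4             # roughly 5.2 and 0.8, both positive
logged = np.log(shifted)    # the result is still a Series

print(type(logged).__name__)
print(list(logged.index))   # the index labels are preserved
print(logged)
```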

2.2.2 DataFrame

A DataFrame object is a tabular, spreadsheet-like data structure containing a collection of columns,
each of which can be of different types (numeric, string, boolean, etc). Unlike Series, a DataFrame
has distinct row and column indices. There are many ways to create a DataFrame object (e.g.,
from a dictionary, list of tuples, or even numpy’s ndarrays).

[ ]: from pandas import DataFrame

cars = {'make': ['Ford', 'Honda', 'Toyota', 'Tesla'],

'model': ['Taurus', 'Accord', 'Camry', 'Model S'],
'MSRP': [27595, 23570, 23495, 68000]}
carData = DataFrame(cars) # creating DataFrame from dictionary
carData # display the table

[ ]: print(carData.index) # print the row indices

print(carData.columns) # print the column indices

[ ]: carData2 = DataFrame(cars, index = [1,2,3,4]) # change the row index
carData2['year'] = 2018 # add column with same value
carData2['dealership'] = ['Courtesy Ford','Capital Honda','Spartan Toyota','N/A']

carData2 # display table

Creating DataFrame from a list of tuples.

[ ]: tuplelist = [(2011,45.1,32.4),(2012,42.4,34.5),(2013,47.2,39.2),
(2014,44.2,31.4),(2015,39.9,29.8),(2016,41.5,36.7)]
columnNames = ['year','temp','precip']
weatherData = DataFrame(tuplelist, columns=columnNames)
weatherData

Creating DataFrame from numpy ndarray.

[ ]: import numpy as np

npdata = np.random.randn(5,3) # create a 5 by 3 random matrix

columnNames = ['x1','x2','x3']
data = DataFrame(npdata, columns=columnNames)
data

The elements of a DataFrame can be accessed in many ways.

[ ]: # accessing an entire column will return a Series object

print(data['x2'])
print(type(data['x2']))

[ ]: # accessing an entire row will return a Series object

print('Row 3 of data table:')

print(data.iloc[2]) # returns the 3rd row of DataFrame
print(type(data.iloc[2]))
print('\nRow 3 of car data table:')
print(carData2.iloc[2]) # row contains objects of different types

[ ]: # accessing a specific element of the DataFrame

print(carData2.iloc[1,2]) # retrieving second row, third column

print(carData2.loc[1,'model']) # retrieving the row labeled 1 (the first row), column named 'model'

# accessing a slice of the DataFrame

print('carData2.iloc[1:3,1:3]=')
print(carData2.iloc[1:3,1:3])
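Because carData2 uses the row labels 1 to 4, label-based and position-based access refer to different rows for the same number. The following minimal sketch, on a two-row table with an illustrative name df, shows the difference between .loc and .iloc.

```python
from pandas import DataFrame

df = DataFrame({'make': ['Ford', 'Honda'],
                'model': ['Taurus', 'Accord']},
               index=[1, 2])            # row labels start at 1, positions start at 0

print(df.loc[1, 'model'])    # label 1 is the first row: Taurus
print(df.iloc[1, 1])         # position 1 is the second row: Accord
```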

[ ]: print('carData2.shape =', carData2.shape)
print('carData2.size =', carData2.size)

[ ]: # selection and filtering

print('carData2[carData2.MSRP > 25000]')

print(carData2[carData2.MSRP > 25000])
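The expression inside the brackets is a boolean Series, one True/False value per row. As a sketch using the same makes and prices as the car table (the names table, mask, and both are illustrative), the mask can be built separately, and several conditions can be combined with & as long as each one is parenthesized.

```python
from pandas import DataFrame

table = DataFrame({'make': ['Ford', 'Honda', 'Toyota', 'Tesla'],
                   'MSRP': [27595, 23570, 23495, 68000]})

mask = table.MSRP > 25000          # boolean Series, one entry per row
print(mask)
print(table[mask])                 # keeps only the rows where mask is True

# combine conditions with & (and) or | (or); parentheses are required
both = table[(table.MSRP > 25000) & (table.make != 'Tesla')]
print(both)
```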

2.2.3 Arithmetic Operations

[ ]: print(data)

print('Data transpose operation:')

print(data.T) # transpose operation

print('Addition:')
print(data + 4) # addition operation

print('Multiplication:')
print(data * 10) # multiplication operation

[ ]: print('data =')
print(data)

columnNames = ['x1','x2','x3']
data2 = DataFrame(np.random.randn(5,3), columns=columnNames)
print('\ndata2 =')
print(data2)

print('\ndata + data2 = ')

print(data.add(data2))

print('\ndata * data2 = ')

print(data.mul(data2))

[ ]: print(data.abs()) # get the absolute value for each element

print('\nMaximum value per column:')

print(data.max()) # get maximum value for each column

print('\nMinimum value per row:')

print(data.min(axis=1)) # get minimum value for each row

print('\nSum of values per column:')

print(data.sum()) # get sum of values for each column

print('\nAverage value per row:')

print(data.mean(axis=1)) # get average value for each row

print('\nCalculate max - min per column')

f = lambda x: x.max() - x.min()
print(data.apply(f))

print('\nCalculate max - min per row')

f = lambda x: x.max() - x.min()
print(data.apply(f, axis=1))
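The axis argument of apply decides what the function receives: each column by default, or each row when axis=1. The sketch below uses a small fixed table instead of random data so the results can be checked by hand, and shows that the per-column result matches the direct expression data.max() - data.min().

```python
import numpy as np
from pandas import DataFrame

# fixed values instead of random data: rows are [0, 1, 2] and [3, 4, 5]
data = DataFrame(np.arange(6).reshape(2, 3), columns=['x1', 'x2', 'x3'])

f = lambda x: x.max() - x.min()
col_range = data.apply(f)            # f receives each column: 3, 3, 3
row_range = data.apply(f, axis=1)    # f receives each row: 2, 2

print(col_range)
print(row_range)
print(data.max() - data.min())       # same values as col_range, without apply
```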

2.2.4 Plotting Series and DataFrame

There are built-in functions you can use to plot the data stored in a Series or a DataFrame.

[ ]: %matplotlib inline

s3 = Series([1.2,2.5,-2.2,3.1,-0.8,-3.2,1.4],
index = ['Jan 1','Jan 2','Jan 3','Jan 4','Jan 5','Jan 6','Jan 7'])
s3.plot(kind='line', title='Line plot')

[ ]: s3.plot(kind='bar', title='Bar plot')

[ ]: s3.plot(kind='hist', title = 'Histogram')
