Introduction To Numpy - Ipynb - Colaboratory
Overview
In this module, we discuss NumPy (Numerical Python), a widely used library for
scientific computing.
Of all the data structures NumPy provides, NumPy arrays (or ndarrays) are the most widely used
in the scientific community.
NumPy arrays are highly optimized for large volumes of data (millions of entries).
As a result, almost all Python-based scientific software and tools are built on
NumPy arrays.
Numpy Basics
We can import NumPy into a script with import numpy as np .
The alias as np lets us call NumPy functions and operations as np.foo() instead of
numpy.foo() .
import numpy as np
Numpy Arrays
Creating Arrays
NumPy arrays (also known as ndarrays ) are optimized for much larger volumes of data than
Python lists.
In almost all scientific experiments with large amounts of data (on the order of millions of
entries), NumPy arrays are used instead of Python lists.
data1 = [9, 8, 7, 1, 2, 3]
numpydata1 = np.array(data1)
numpydata1
array([9, 8, 7, 1, 2, 3])
https://colab.research.google.com/drive/1hH3fneZ2O1MYx7sROfe26NdQAHmC9bjD#printMode=true 1/11
8/5/2020 Introduction to Numpy.ipynb - Colaboratory
Numpy has some other ways to create arrays as well. We will discuss them later.
numpydata2 = np.array(data2)
numpydata2
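The definition of data2 is not visible in this export. As a minimal sketch, assuming it was a nested list (the names and values below are hypothetical, not the notebook's original data), a two-dimensional array can be created from a list of rows:

```python
import numpy as np

# Hypothetical nested list: 2 rows, 3 columns (not the notebook's original data2)
example_rows = [[1, 2, 3], [4, 5, 6]]
example_2d = np.array(example_rows)

print(example_2d)
```

Each inner list becomes one row, so all inner lists should have the same length.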
Shape
Shape is a numpy array property that describes the number of elements in each dimension of the
numpy array.
data1 = [9, 8, 7, 1, 2, 3]
numpydata1 = np.array(data1)
numpydata1.shape
numpydata2 = np.array(data2)
numpydata2.shape
Number of Dimensions
The ndim property gives the number of dimensions of the array. We can access it as
array_name.ndim .
data1 = [9, 8, 7, 1, 2, 3]
numpydata1 = np.array(data1)
numpydata1.ndim
numpydata2 = np.array(data2)
numpydata2.ndim
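To see how shape and ndim relate, here is a small sketch comparing a 1D array with a 2D one. The 2D values are hypothetical, since the original data2 is not shown in this export:

```python
import numpy as np

flat = np.array([9, 8, 7, 1, 2, 3])        # same values as data1
grid = np.array([[1, 2, 3], [4, 5, 6]])    # hypothetical 2D data

print(flat.shape, flat.ndim)   # (6,) 1
print(grid.shape, grid.ndim)   # (2, 3) 2
```

The number of entries in shape always equals ndim.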
Data Type
The dtype property describes the data type used to store the array's elements. It can
be accessed as array_name.dtype .
data1 = [9, 8, 7, 1, 2, 3]
numpydata1 = np.array(data1)
numpydata1.dtype
numpydata2 = np.array(data2)
numpydata2.dtype
The appropriate data type depends on the level of precision required.
Higher precision requires more memory and processing power, which are also
constraints when analyzing data.
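The memory cost of precision can be checked directly. The sketch below uses hypothetical readings and stores them once as 32-bit and once as 64-bit floats:

```python
import numpy as np

values = [0.1, 0.2, 0.3, 0.4]  # hypothetical readings
single = np.array(values, dtype=np.float32)
double = np.array(values, dtype=np.float64)

print(single.itemsize, single.nbytes)  # 4 bytes per element, 16 in total
print(double.itemsize, double.nbytes)  # 8 bytes per element, 32 in total

# astype returns a converted copy; the original array is unchanged
as_int = double.astype(np.int64)
```

Converting floats to integers with astype drops the fractional part, so every value here becomes 0.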
zeros
Similarly, we can create an array in which every element is one by using the np.ones function.
ones
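The cells that built these arrays did not survive in this export. A minimal sketch, assuming a 2 x 3 shape chosen only for illustration:

```python
import numpy as np

zeros = np.zeros((2, 3))           # 2 x 3 array filled with 0.0
ones = np.ones((2, 3))             # 2 x 3 array filled with 1.0
ones_int = np.ones(4, dtype=int)   # 1D integer array of ones

print(zeros)
print(ones)
print(ones_int)
```

Both functions return floating-point arrays unless a dtype is passed.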
An array of random integers can also be generated using the np.random.randint(start, end, shape)
function; start is included in the range and end is excluded.
random
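A short sketch of np.random.randint follows. The range, shape and seed are hypothetical choices for illustration:

```python
import numpy as np

np.random.seed(0)  # fixed seed so repeated runs give the same numbers

# 2 x 3 array of integers drawn from [0, 10): 0 is included, 10 is not
random_ints = np.random.randint(0, 10, (2, 3))
print(random_ints)
```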
Reshape
We can change the shape of an array using the reshape operation by calling
np.reshape(array_name, target_shape) . The target shape must hold the same total number of
elements as the original array.
np.reshape(numpydata2, (6,))
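As a sketch of a few reshape variants, the example below assumes a hypothetical 2 x 3 array standing in for numpydata2, whose definition is not shown:

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6]])   # hypothetical 2 x 3 array

flat = np.reshape(grid, (6,))   # array([1, 2, 3, 4, 5, 6])
tall = grid.reshape(3, 2)       # method form; elements are refilled row by row
auto = grid.reshape(-1, 2)      # -1 lets NumPy infer the row count (3 here)

print(flat)
print(tall)
```

A target shape whose element count differs from the original, such as (4, 2) here, raises a ValueError.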
Sort
np.sort(numpydata1)
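For illustration, the sketch below shows that np.sort returns a sorted copy in ascending order and leaves the original array untouched. Reversing the result is one way to get descending order:

```python
import numpy as np

numpydata1 = np.array([9, 8, 7, 1, 2, 3])

ascending = np.sort(numpydata1)          # array([1, 2, 3, 7, 8, 9])
descending = np.sort(numpydata1)[::-1]   # array([9, 8, 7, 3, 2, 1])

print(numpydata1)   # [9 8 7 1 2 3], unchanged
```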
Concatenate
The concatenate operation joins multiple arrays along a dimension using
np.concatenate((array1, array2), axis=axis1) .
By choosing axis=0 , we concatenate along the first dimension (that is, add rows to
two-dimensional data).
c.shape
By choosing axis=1 , we concatenate along the second dimension (that is, add columns to
two-dimensional data).
x.shape
y = np.array([[3], [4]])
y.shape
z.shape
Note that the two arrays must have the same shape in every dimension except the one
being concatenated along. To see what happens otherwise, try np.concatenate((a, b), axis=1) and
np.concatenate((x, y), axis=0)
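The cells defining a, b and x are missing from this export, so the sketch below uses hypothetical values for them. Only y matches a definition visible above:

```python
import numpy as np

# Hypothetical arrays; only y matches a definition visible in the text above
a = np.array([[1, 2], [3, 4]])       # shape (2, 2)
b = np.array([[5, 6]])               # shape (1, 2)
c = np.concatenate((a, b), axis=0)   # adds a row -> shape (3, 2)

x = np.array([[1, 2], [3, 4]])       # shape (2, 2)
y = np.array([[3], [4]])             # shape (2, 1)
z = np.concatenate((x, y), axis=1)   # adds a column -> shape (2, 3)

# Shapes that disagree outside the concatenation axis raise ValueError
try:
    np.concatenate((a, b), axis=1)
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

print(c.shape, z.shape, mismatch_raised)
```

With these shapes, a and b have different row counts, so they cannot be joined side by side.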
Activity
A device with 3 accelerometers has recorded the acceleration encountered during each second
of a certain period of time. The readings are stored in the
variable accelerationdata .
During the same period, another device has recorded temperature readings inside and outside
the assembly, saved to the variable temperaturedata (each inside reading is followed by an
outside reading).
Your task is to combine the readings into a single variable combineddata that contains both
the acceleration and the temperature data.
temperaturedata = np.array([166, 251, 108, 238, 229, 236, 194, 161, 266, 108, 102, 291, 235,
121, 188, 183, 183, 137, 129, 133])
Arithmetic Operations
a = np.array([1, 2, 4])
a * 4
Using the +, -, * and / operators between two arrays performs the corresponding operation
element-wise.
a = np.array([1, 2, 3])
b = np.array([3, 2, 1])
a*b
Comparison - Similarly, we can compare elements using the >, <, <=, >= and ==
operators against either a single value or another array.
a = np.array([4, 5, 6])
a >= 5
a = np.array([4, 5, 6])
b = np.array([4, 3, 6])
a == b
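A comparison returns a boolean array, which can then be counted or used to pick out matching elements. The following is a small sketch of that idea and goes slightly beyond what the cells above show:

```python
import numpy as np

a = np.array([4, 5, 6])

mask = a >= 5                  # array([False,  True,  True])
selected = a[mask]             # keeps only elements where mask is True
count = int(mask.sum())        # True counts as 1, so this is 2
any_match = bool(mask.any())   # True if at least one element matches

print(selected, count, any_match)
```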
Activity
The sensor data from the previous stage requires some modifications in this stage. The
accelerationdata values are voltage readings from the sensors and need to be converted
to standard measurement units. To do that, multiply each element by 2.5391.
Also determine whether there are any outliers or possibly incorrect acceleration readings.
(A corrected value greater than 2.5 is usually considered an outlier.)
mean(array)
sum(array)
amin(array)
amax(array)
np.amin(array)
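The functions listed above can be tried on a small array. The data below is hypothetical, and the axis argument shown at the end is an optional extra that aggregates per column or per row:

```python
import numpy as np

array = np.array([[1, 2, 3], [4, 5, 6]])  # hypothetical data

print(np.mean(array))   # 3.5
print(np.sum(array))    # 21
print(np.amin(array))   # 1
print(np.amax(array))   # 6

col_sums = np.sum(array, axis=0)     # one sum per column: [5, 7, 9]
row_means = np.mean(array, axis=1)   # one mean per row: [2.0, 5.0]
```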
NumPy also has mathematical functions such as sine, cosine and tangent. These functions are
applied element-wise to every element of the array.
sin(array)
tan(array)
log10(array)
np.log10(arr)
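A brief sketch of element-wise math functions, using hypothetical inputs. Note that the trigonometric functions expect angles in radians:

```python
import numpy as np

angles = np.array([0.0, np.pi / 2, np.pi])   # radians, hypothetical values
sines = np.sin(angles)                       # approximately [0, 1, 0]

powers = np.array([1.0, 10.0, 100.0])
logs = np.log10(powers)                      # array([0., 1., 2.])

print(np.round(sines, 6))
print(logs)
```

Floating-point rounding means sin(pi) comes out as a tiny nonzero number rather than exactly 0.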
Activity
Consider the scenario we discussed earlier. Instead of using the inside and outside temperature
readings separately at different points in time, we will use the average of all the readings for our
scientific experiment.
Also, for our experiment, we need the tangent (tan) of the standard acceleration
values.
np.tan(accelerationdata)
Indexing
As with Python lists, we can access individual elements of a 1D NumPy array by
index.
a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
a[0]
For a multi-dimensional array, a single index returns an array. To access an individual element, we
either index repeatedly, once per dimension, or pass all the indices at once separated by commas.
b[1]
b[1][0]
b[1, 0]
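The definition of b is not shown in this export. Assuming a hypothetical 2 x 3 array, the three expressions above behave as follows:

```python
import numpy as np

b = np.array([[1, 2, 3], [4, 5, 6]])   # hypothetical; the original b is not shown

row = b[1]          # second row: array([4, 5, 6])
chained = b[1][0]   # index the row, then its first element: 4
direct = b[1, 0]    # same element, selected in a single step: 4

print(row, chained, direct)
```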
Slicing
Slicing allows us to select multiple elements using a simple criterion based on indices.
a = np.array([1, 2, 3, 4, 5, 6, 7])
a[3:]
a[:5]
a[3:5]
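The slices above give the following results. The sketch also shows one behavior that differs from Python lists: a NumPy slice is a view, so writing to it changes the original array. The names evens and view are introduced here for illustration only.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7])

print(a[3:])    # [4 5 6 7]  from index 3 to the end
print(a[:5])    # [1 2 3 4 5]  up to, but not including, index 5
print(a[3:5])   # [4 5]  indices 3 and 4

evens = a[::2]  # every second element: [1 3 5 7]

view = a[3:5]   # a view onto a, not a copy
view[0] = 40    # also changes a[3]
print(a[3])     # 40
```

Use a[3:5].copy() when an independent copy is needed.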
Let's repeat the same exercise with a NumPy array. Name the NumPy array B and the slice we take Y.