Python Basics for Machine Learning

The document provides an overview of Python programming, specifically tailored for machine learning, covering installation, syntax, data types, and control flow. It explains key concepts such as functions, mutable vs immutable types, and examples of common programming tasks like finding factors and prime numbers. The content is structured to guide learners through the fundamentals of Python, emphasizing practical applications and coding practices.

Uploaded by

pjenith51

Python Programming

for Machine Learning

SSN College of Engineering, 18 August 2018

Madhavan Mukund, Chennai Mathematical Institute


Installing Python
Python is available on all platforms: Linux, MacOS
and Windows

Two main flavours of Python

Python 2.7

Python 3 (currently 3.7.x)

We will work with Python 3


Python interpreter
Python is basically an interpreted language

Load the Python interpreter

Send Python commands to the interpreter to be executed

Easy to interactively explore language features

Can load complex programs from files

>>> from filename import *


A typical Python program
def function_1(..,..):
    …
def function_2(..,..):
    …
⋮
def function_k(..,..):
    …
statement_1
statement_2
⋮
statement_n

Interpreter executes statements from top to bottom

Function definitions are “digested” for future use

Actual computation starts from statement_1
A more messy program
statement_1
def function_1(..,..):
    …
statement_2
statement_3
def function_2(..,..):
    …
statement_4

Python allows free mixing of function definitions and statements

But programs written like this are likely to be harder to understand and debug

Assignment statement
Assign a value to a name

i = 5
j = 2*i
j = j + 5

Left hand side is a name

Right hand side is an expression

Operations in expression depend on type of value


Numeric values

Numbers come in two flavours

int — integers

float — fractional numbers

178, -3, 4283829 are values of type int

37.82, -0.01, 28.7998 are values of type float


Operations on numbers
Normal arithmetic operations: +,-,*,/

Note that / always produces a float

7/3.5 is 2.0, 7/2 is 3.5

Quotient and remainder: // and %

9//5 is 1, 9%5 is 4

Exponentiation: **

3**4 is 81
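
The arithmetic rules above can be checked directly in the interpreter. The following is a minimal sketch; the variable names are illustrative and do not come from the slides.

```python
# Division with / always yields a float, even when the result is whole.
quotient_float = 7 / 2        # 3.5
exact_float = 8 / 4           # 2.0, still a float

# // and % give the integer quotient and the remainder.
q = 9 // 5
r = 9 % 5

# Quotient and remainder are linked by m == n*(m//n) + m%n.
m, n = 9, 5
identity_holds = (m == n * (m // n) + m % n)

power = 3 ** 4

print(quotient_float, exact_float, q, r, power, identity_holds)
```

The identity in the middle is a convenient way to remember what // and % compute together.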
Other operations on numbers

log(), sqrt(), sin(), …

Built in to Python, but not available by default

Must include math “library”

from math import *


Boolean values: bool

True, False

Logical operators: not, and, or

not True is False, not False is True

x and y is True if both of x,y are True

x or y is True if at least one of x,y is True


Comparisons
x == y, a != b,
z < 17*5, n > m,
i <= j+k, 19 >= 44*d

Combine using logical operators

n > 0 and m%n == 0

Assign a boolean expression to a name

divisor = (m%n == 0)
Examples
def divides(m,n):
    if n%m == 0:
        return(True)
    else:
        return(False)

def even(n):
    return(divides(2,n))

def odd(n):
    return(not even(n))
Strings —type str
Type string, str, a sequence of characters

A single character is a string of length 1

No separate type char

Enclose in quotes—single, double, even triple!


city = 'Chennai'

title = "Hitchhiker's Guide to the Galaxy"

dialogue = '''He said his favourite book is
"Hitchhiker's Guide to the Galaxy"'''
Strings as sequences

String: sequence or list of characters

Positions 0,1,2,…,n-1 for a string of length n


               0  1  2  3  4
s = "hello"    h  e  l  l  o
              -5 -4 -3 -2 -1

Positions -1,-2,… count backwards from end

s[1] == "e", s[-2] == "l"


Operations on strings

Combine two strings: concatenation, operator +

s = "hello"

t = s + ", there"

t is now "hello, there"

len(s) returns length of s


Names, values and types
Types in Python are dynamic, but strong

Values have types

Type determines what operations are legal

Names inherit their type from their current value

Type of a name is not fixed

Unlike languages like C, C++, Java where each name is “declared” in advance with its type
Names, values and types
Names can be assigned values of different types
as the program evolves

i = 5 # i is int
i = 7*1 # i is still int
j = i/3 # j is float, / creates float

i = 2*j # i is now float

type(e) returns type of expression e


Extracting substrings
A slice is a “segment” of a string

s = "hello"

s[1:4] is "ell"

s[i:j] starts at s[i] and ends at s[j-1]

s[:j] starts at s[0], so s[0:j]

s[i:] ends at s[len(s)-1], so s[i:len(s)]


Modifying strings
Cannot update a string “in place”

s = "hello", want to change to "help!"

s[3] = "p" — error!

Instead, use slices and concatenation

s = s[0:3] + "p!"

Strings are immutable values (more later)
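
The "help!" example above can be run as follows. This is a small sketch: the try/except wrapper and the name update_failed are additions for illustration, used only so the failed update does not stop the program.

```python
s = "hello"

# In-place update is rejected because str values are immutable.
try:
    s[3] = "p"
    update_failed = False
except TypeError:
    update_failed = True

# Build a new string from a slice and bind the name s to it.
s = s[0:3] + "p!"

print(update_failed, s)
```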


Lists
Sequences of values
factors = [1,2,5,10]
names = ["Anand","Charles","Muqsit"]

Type need not be uniform


mixed = [3, True, "Yellow"]

Extract values by position, slice, like str


factors[3] is 10, mixed[0:2] is [3,True]

Length is given by len()


len(names) is 3
Nested lists

Lists can contain other lists

nested = [[2,[37]],4,["hello"]]

nested[0] is [2,[37]]
nested[1] is 4
nested[2][0][3] is "l"
nested[0][1:2] is [[37]]
Updating lists
Unlike strings, lists can be updated in place

nested = [[2,[37]],4,["hello"]]
nested[1] = 7
nested is now [[2,[37]],7,["hello"]]
nested[0][1][0] = 19
nested is now [[2,[19]],7,["hello"]]

Lists are mutable, unlike strings


Mutable vs immutable
What happens when we assign names?

x = 5
y = x
x = 7

Has the value of y changed?

No, why should it?

Does assignment copy the value or make both names point to the same value?
Mutable vs immutable …

Does assignment copy the value or make both names point to the same value?

For immutable values, we can assume that assignment makes a fresh copy of a value

Values of type int, float, bool, str are immutable

Updating one value does not affect the copy


Mutable vs immutable …
For mutable values, assignment does not make a
fresh copy

list1 = [1,3,5,7]
list2 = list1
list1[2] = 4

What is list2[2] now?

list2[2] is also 4

list1 and list2 are two names for the same list
Copying lists
How can we make a copy of a list?

A slice creates a new (sub)list from an old one

Recall l[:k] is l[0:k], l[k:] is l[k:len(l)]

Omitting both end points gives a full slice


l[:] == l[0:len(l)]

To make a copy of a list use a full slice


list2 = list1[:]
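
The difference between a second name and a copy can be seen side by side. This is a sketch with illustrative names (alias, clone); note that a full slice copies only the top level, so nested lists inside would still be shared.

```python
list1 = [1, 3, 5, 7]
alias = list1        # second name for the same list
clone = list1[:]     # full slice: a new list with the same elements

list1[2] = 4

print(alias)                          # the alias sees the update
print(clone)                          # the copy does not
print(alias is list1, clone is list1)
```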
Tuples
Simultaneous assignments
(age,name,primes) = (23,"Kamal",[2,3,5])

One line swap! (x,y) = (y,x)


Assign a tuple of values to a name
point = (3.5,4.8)
Extract positions, slices:
ycoordinate = point[1]
Tuples are immutable: point[1] = 8.7 is an error
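
The tuple operations above, collected into one runnable sketch. The try/except block and the name tuple_updated are illustrative additions so that the failed update is recorded instead of halting the program.

```python
# Simultaneous assignment unpacks the right hand side position by position.
(age, name, primes) = (23, "Kamal", [2, 3, 5])

# One line swap.
(x, y) = (1, 2)
(x, y) = (y, x)

# Positions work as for strings and lists.
point = (3.5, 4.8)
xcoordinate = point[0]
ycoordinate = point[1]

# Updating a position is an error because tuples are immutable.
try:
    point[1] = 8.7
    tuple_updated = True
except TypeError:
    tuple_updated = False

print(age, name, primes, x, y, xcoordinate, ycoordinate, tuple_updated)
```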
Control flow
Need to vary computation steps as values change

Control flow: determines order in which statements are executed

Conditional execution

Repeated execution — loops

Function definitions
Conditional execution
if m%n != 0:
    (m,n) = (n,m%n)
Second statement is executed only if the condition
m%n != 0 is True
Indentation demarcates body of if: must be uniform
if condition:
    statement_1 # Execute conditionally
    statement_2 # Execute conditionally
statement_3 # Execute unconditionally
Alternative execution

if m%n != 0:
    (m,n) = (n,m%n)
else:
    gcd = n

else: is optional
Shortcuts for conditions
Numeric value 0 is treated as False

Empty sequence "", [] is treated as False

Everything else is True

if m%n:
    (m,n) = (n,m%n)
else:
    gcd = n
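
A quick way to see which values count as False in a condition. This is a sketch; is_truthy is a hypothetical helper, not something from the slides. Besides the cases listed above, 0.0, the empty tuple, the empty dictionary and None are also treated as False.

```python
def is_truthy(value):
    """Report whether value is treated as True in a condition."""
    if value:
        return True
    else:
        return False

false_like = [0, 0.0, "", [], (), {}, None]
true_like = [1, -3, "0", [0], (0,), " "]

all_false = not any(is_truthy(v) for v in false_like)
all_true = all(is_truthy(v) for v in true_like)

print(all_false, all_true)
```

Note that "0" and [0] are True: they are non-empty sequences, whatever they contain.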
Multiway branching, elif:
Nested if:

if x == 1:
    y = f1(x)
else:
    if x == 2:
        y = f2(x)
    else:
        if x == 3:
            y = f3(x)
        else:
            y = f4(x)

Equivalent, using elif:

if x == 1:
    y = f1(x)
elif x == 2:
    y = f2(x)
elif x == 3:
    y = f3(x)
else:
    y = f4(x)
Loops: repeated actions

Repeat something a fixed number of times

for i in [1,2,3,4]:
    y = y*i
    z = z+1

Again, indentation to mark body of loop


Repeating n times
Often we want to do something exactly n times
for i in [1,2,..,n]:
    . . .

range(0,n) generates sequence 0,1,…,n-1

for i in range(0,n):
    . . .

range(i,j) generates sequence i,i+1,…,j-1

More details about range() later


Example
Find all factors of a number n

Factors must lie between 1 and n

def factors(n):
    flist = []
    for i in range(1,n+1):
        if n%i == 0:
            flist = flist + [i]
    return(flist)
Loop based on a condition
If we don’t know number of repetitions in advance

while condition:
    . . .

Execute body if condition evaluates to True

After each iteration, check condition again

Body must ensure progress towards termination!


Example
Euclid’s gcd algorithm using remainder

Update m, n till we find n to be a divisor of m

def gcd(m,n):
    if m < n:
        (m,n) = (n,m)
    while m%n != 0:
        (m,n) = (n,m%n)
    return(n)
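
To watch the updates of m and n, the same loop can record each pair it visits. This is a sketch: gcd_trace is a hypothetical variant written for illustration, and like the function above it assumes both arguments are positive.

```python
def gcd_trace(m, n):
    """Return the gcd of m and n plus the list of (m, n) pairs visited."""
    if m < n:
        (m, n) = (n, m)
    steps = [(m, n)]
    while m % n != 0:
        (m, n) = (n, m % n)
        steps.append((m, n))
    return (n, steps)

(result, steps) = gcd_trace(18, 48)
print(result)
print(steps)
```

Each step replaces the larger value by a remainder that is strictly smaller than n, which is why the loop makes progress towards termination.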
Function definition
def f(a,b,c):
    statement_1
    statement_2
    ..
    return(v)
    ..

Function name, arguments/parameters

Body is indented

return() statement exits and returns a value


Passing values to functions

Argument value is substituted for name

def power(x,n):
    ans = 1
    for i in range(0,n):
        ans = ans*x
    return(ans)

Calling power(3,5) behaves like

x = 3
n = 5
ans = 1
for i in range..

Like an implicit assignment statement


Passing values …

Same rules apply for mutable, immutable values

Immutable value will not be affected at calling point

Mutable values will be affected


Example
def update(l,i,v):
    if i >= 0 and i < len(l):
        l[i] = v
        return(True)
    else:
        v = v+1
        return(False)

ns = [3,11,12]
z = 8
update(ns,2,z)
update(ns,4,z)

ns is [3,11,8]
z remains 8

Return value may be ignored

If there is no return(), function ends when last statement is reached
Can pass functions

Apply f to x n times

def apply(f,x,n):
    res = x
    for i in range(n):
        res = f(res)
    return(res)

def square(x):
    return(x*x)

apply(square,5,2)

square(square(5))

625
Scope of names
Names within a function have local scope

def stupid(x):
    n = 17
    return(x)

n = 7
v = stupid(28)
# What is n now?

n is still 7
Name n inside function is separate from n outside
Defining functions
A function must be defined before it is invoked

This is OK

def f(x):
    return(g(x+1))

def g(y):
    return(y+3)

z = f(77)

This is not

def f(x):
    return(g(x+1))

z = f(77)

def g(y):
    return(y+3)
Recursive functions

A function can call itself — recursion

def factorial(n):
    if n <= 0:
        return(1)
    else:
        val = n * factorial(n-1)
        return(val)
Some examples
Find all factors of a number n

Factors must lie between 1 and n

def factors(n):
    factorlist = []
    for i in range(1,n+1):
        if n%i == 0:
            factorlist = factorlist + [i]
    return(factorlist)
Primes
Prime number — only factors are 1 and itself

factors(17) is [1,17]

factors(18) is [1,2,3,6,9,18]

def isprime(n):
    return(factors(n) == [1,n])

1 should not be reported as a prime

factors(1) is [1], not [1,1]


Primes up to n

List all primes up to and including a given number n

def primesupto(n):
    primelist = []
    for i in range(1,n+1):
        if isprime(i):
            primelist = primelist + [i]
    return(primelist)
First n primes

List the first n primes

def nprimes(n):
    (count,i,plist) = (0,1,[])
    while(count < n):
        if isprime(i):
            (count,plist) = (count+1,plist+[i])
        i = i+1
    return(plist)
More about range()
range(i,j) produces the sequence i,i+1,…,j-1

range(j) automatically starts from 0; 0,1,…,j-1

range(i,j,k) increments by k; i,i+k,…,i+nk

Stops with n such that i+nk < j <= i+(n+1)k

Count down? Make k negative!


range(i,j,-1), i > j, produces i,i-1,…,j+1
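
The forms of range() above can be inspected by converting each one to a list. In this sketch list() is used only to make the generated values visible, and the variable names are illustrative.

```python
up = list(range(2, 7))           # i, i+1, ..., j-1
from_zero = list(range(5))       # starts from 0
stepped = list(range(1, 10, 3))  # increments by 3, stops before 10
down = list(range(5, 0, -1))     # counts down, stops before 0
empty = list(range(5, 0))        # start beyond end with positive step

print(up, from_zero, stepped, down, empty)
```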
range() and lists
Compare the following

for i in [0,1,2,3,4,5,6,7,8,9]:

for i in range(0,10):

Is range(0,10) == [0,1,2,3,4,5,6,7,8,9]?

In Python2, yes

In Python3, no!
range() and lists
Can convert range() to a list using list()

list(range(0,5)) == [0,1,2,3,4]

Other type conversion functions using type names

str(78) == "78"

int("321") == 321

But int("32x") yields error
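
One way to guard against the conversion error is to catch it. This is a sketch under the assumption that returning None for bad input is acceptable; to_int is a hypothetical helper, not part of Python or of the slides.

```python
def to_int(text):
    """Return int(text), or None if text is not a valid integer."""
    try:
        return int(text)
    except ValueError:
        return None

good = to_int("321")
bad = to_int("32x")

print(good, bad, str(78), list(range(0, 5)))
```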


Lists

Lists are mutable

list1 = [1,3,5,6]
list2 = list1
list1[2] = 7

list1 is now [1,3,7,6]

So is list2
Lists
On the other hand

list1 = [1,3,5,6]
list2 = list1
list1 = list1[0:2] + [7] + list1[3:]

list1 is now [1,3,7,6]

list2 remains [1,3,5,6]

Concatenation produces a new list


Extending a list

Adding an element to a list, in place

list1 = [1,3,5,6]
list2 = list1
list1.append(12)

list1 is now [1,3,5,6,12]

list2 is also [1,3,5,6,12]


List functions
list1.append(v): extend list1 by a single value v

list1.extend(list2): extend list1 by a list of values

In place equivalent of list1 = list1 + list2

list1.remove(x): removes first occurrence of x

Error if no copy of x exists in list1
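
The three list functions can be tried in sequence. This is a sketch; the try/except block and the name missing_raised are illustrative additions to show the error without stopping the program.

```python
list1 = [1, 3, 5, 6]

list1.append(12)          # one value added at the end
after_append = list1[:]

list1.extend([3, 20])     # every value of the argument list added
after_extend = list1[:]

list1.remove(3)           # only the first occurrence of 3 is removed
after_remove = list1[:]

# Removing a value that is not present raises ValueError.
try:
    list1.remove(99)
    missing_raised = False
except ValueError:
    missing_raised = True

print(after_append, after_extend, after_remove, missing_raised)
```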


A note on syntax

list1.append(x) rather than append(list1,x)

list1 is an object

append() is a function to update the object

x is an argument to the function


List membership

x in l returns True if value x is found in list l

# Safely remove x from l

if x in l:
    l.remove(x)

# Remove all occurrences of x from l

while x in l:
    l.remove(x)
Other functions
l.reverse(): reverse l in place

l.sort(): sort l in ascending order

l.index(x): find leftmost position of x in l

Avoid error by checking if x in l

Lists have no built in function for the rightmost position of x; compute it as len(l) - 1 - l[::-1].index(x)

Many more … see Python documentation!


Initialising names
A name cannot be used before it is assigned a
value

y = x + 1 # Error if x is unassigned

May forget this for lists where update is implicit

l.append(v)

Python needs to know that l is a list


Initialising names …
def factors(n):

    for i in range(1,n+1):
        if n%i == 0:
            flist.append(i)   # Error: flist has not been initialised

    return(flist)
Initialising names …
def factors(n):

    flist = []

    for i in range(1,n+1):
        if n%i == 0:
            flist.append(i)

    return(flist)
Sequences of values

Two basic ways of storing a sequence of values

Arrays

Lists

What’s the difference?


Arrays
Single block of memory, elements of uniform type
Typically size of sequence is fixed in advance

Indexing is fast
Access seq[i] in constant time for any i
Compute offset from start of memory block

Inserting between seq[i] and seq[i+1] is expensive

Contraction is expensive
Lists
Values scattered in memory
Each element points to the next—“linked” list
Flexible size

Follow i links to access seq[i]


Cost proportional to i

Inserting or deleting an element is easy


“Plumbing”
Operations
Exchange seq[i] and seq[j]
Constant time in array, linear time in lists

Delete seq[i] or Insert v after seq[i]


Constant time in lists (if we are already at seq[i])
Linear time in array

Algorithms on one data structure may not transfer to another
Example: Binary search
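
Binary search is efficient only when seq[i] can be accessed in constant time, which is why it suits arrays and not linked lists. The following is a minimal sketch on a sorted Python list; the function name, its signature and the convention of returning -1 when the value is absent are assumptions made for illustration.

```python
def binary_search(seq, v):
    """Return a position of v in the sorted sequence seq, or -1 if absent."""
    (low, high) = (0, len(seq) - 1)
    while low <= high:
        mid = (low + high) // 2
        if seq[mid] == v:
            return mid
        elif seq[mid] < v:
            low = mid + 1      # v can only be in the right half
        else:
            high = mid - 1     # v can only be in the left half
    return -1

data = [2, 5, 8, 13, 21, 34]
found = binary_search(data, 13)
missing = binary_search(data, 7)
print(found, missing)
```

Each iteration halves the interval by jumping straight to seq[mid]. On a linked list, reaching the midpoint would itself take time proportional to its position, which removes the advantage.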
Python lists

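Are built in lists in Python lists or arrays?

The name suggests lists, and they allow efficient expansion and contraction at the end

However, the standard CPython implementation stores them as resizable arrays of references, so positional indexing l[i] takes constant time and we can treat them as arrays

Inserting or deleting in the middle takes time proportional to the length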

Numpy package provides real arrays (later)


Generalizing lists
l = [13, 46, 0, 25, 72]

View l as a function, associating values to positions

l : {0,1,..,4} ⟶ integers

l(0) = 13, l(4) = 72

0,1,..,4 are keys

l[0],l[1],..,l[4] are corresponding values


Dictionaries
Allow keys other than range(0,n)

Key could be a string

test1["Dhawan"] = 84
test1["Pujara"] = 16
test1["Kohli"] = 200

Python dictionary

Any immutable value can be a key

Can update dictionaries in place —mutable, like lists


Dictionaries
Empty dictionary is {}, not []

Initialization: test1 = {}

Note: test1 = [] is empty list, test1 = () is empty tuple

Keys can be any immutable values

int, float, bool, string, tuple

But not lists, or dictionaries


Dictionaries
Can nest dictionaries

score["Test1"]["Dhawan"] = 84
score["Test1"]["Kohli"] = 200
score["Test2"]["Dhawan"] = 27

Directly assign values to a dictionary

score = {"Dhawan":84, "Kohli":200}


score = {"Test1":{"Dhawan":84,
"Kohli":200}, "Test2":{"Dhawan":50}}
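
When a nested dictionary is filled in one entry at a time, the inner dictionary must exist before a key can be assigned inside it. This is a sketch; the results list is hypothetical data that reuses the values from the assignments above.

```python
score = {}
results = [("Test1", "Dhawan", 84),
           ("Test1", "Kohli", 200),
           ("Test2", "Dhawan", 27)]

for (match, player, runs) in results:
    # score[match][player] = runs fails with KeyError if score[match]
    # has not been created yet, so create the inner dictionary first.
    if match not in score:
        score[match] = {}
    score[match][player] = runs

print(score)
```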
Operating on dictionaries
d.keys() returns sequence of keys of dictionary d
for k in d.keys():
    # Process d[k]
d.keys() follows insertion order (guaranteed from Python 3.7 onwards), not sorted order
for k in sorted(d.keys()):
    # Process d[k]
sorted(l) returns sorted copy of l, l.sort()
sorts l in place
d.keys() is not a list: use list(d.keys())
Operating on dictionaries
Similarly, d.values() is sequence of values in d
total = 0
for s in test1.values():
    total = total + s
Test for key using in, like list membership
total = {}   # total is now a dictionary of per-player totals
for n in ["Dhawan","Kohli"]:
    total[n] = 0
    for match in score.keys():
        if n in score[match].keys():
            total[n] = total[n] + score[match][n]
Dictionaries vs lists

Assigning to an unknown key inserts an entry

d = {}
d[0] = 7 # No problem, d == {0:7}

… unlike a list

l = []
l[0] = 7 # IndexError!
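
The contrast can be run safely by catching the error. This is a sketch; the try/except block and the name list_assignment_ok are illustrative additions.

```python
d = {}
d[0] = 7                  # creates the entry 0:7

l = []
try:
    l[0] = 7              # position 0 does not exist yet
    list_assignment_ok = True
except IndexError:
    list_assignment_ok = False

l.append(7)               # the way to grow a list

print(d, l, list_assignment_ok)
```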
Reading from the keyboard
Read a line of input and assign to userdata

userdata = input()

Display a message prompting the user

userdata = input("Enter a number: ")

Input is always a string, convert as required

userdata = input("Enter a number: ")
usernum = int(userdata)
Printing to screen
Print values of names, separated by spaces

print(x,y)
print(a,b,c)

Print a message

print("Not a number. Try again")

Intersperse message with values of names

print("Values are x:", x, "y:", y)


Fine tuning print()
By default, print( ) appends new line character '\n'
to whatever is printed

Each print( ) appears on a new line

Specify what to append with argument end="…"


print("Continue on the", end=" ")    # Add space, no new line
print("same line", end=".\n")        # Add full stop, new line
print("Next line.")

Continue on the same line.
Next line.
Fine tuning print()
Items are separated by space by default

(x,y) = (7,10)
print("x is",x,"and y is",y,".")

x is 7 and y is 10 .

Specify separator with argument sep="…"

print("x is ",x," and y is ",y,".", sep="")

x is 7 and y is 10.
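
The end and sep arguments can be combined in one call. In this sketch the output is captured in a string so that the exact characters printed can be inspected; redirect_stdout and StringIO come from the standard library, and the name output is illustrative.

```python
import io
from contextlib import redirect_stdout

(x, y) = (7, 10)

buffer = io.StringIO()
with redirect_stdout(buffer):
    print("x is", x, "and y is", y, end=".\n")   # default separator, custom end
    print(x, y, sep=", ")                        # custom separator, default end
    print("no newline", end="")                  # nothing appended

output = buffer.getvalue()
print(repr(output))
```

Using end=".\n" places the full stop directly after the last value, which avoids the stray space in "10 ." without changing the separator.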
Numpy
Homogeneous multidimensional arrays
>>> import numpy as np
>>> a = np.arange(15).reshape(3,5)
>>> a
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])
>>> a.shape
(3, 5)
>>> a.ndim
2
>>> a.dtype.name
'int64'
>>> a.size
15
>>> type(a)
<class 'numpy.ndarray'>
Numpy
Array creation
>>> a = np.array([2,3,4])
>>> a
array([2, 3, 4])
>>> a.dtype
dtype('int64')

>>> b = np.array([(1.5,2,3), (4,5,6)])
>>> b
array([[ 1.5, 2. , 3. ],
       [ 4. , 5. , 6. ]])
>>> b.dtype
dtype('float64')
Basic operations
>>> a = np.array( [20,30,40,50] )
>>> b = np.arange( 4 )
>>> b
array([0, 1, 2, 3])
>>> c = a-b
>>> c
array([20, 29, 38, 47])
>>> b**2
array([0, 1, 4, 9])
>>> 10*np.sin(a)
array([ 9.12945251, -9.88031624,  7.4511316 , -2.62374854])
>>> a<35
array([ True,  True, False, False])
Slicing
>>> a = np.arange(10)**3

>>> a
array([ 0, 1, 8, 27, 64, 125, 216, 343, 512,
729])

>>> a[2]
8

>>> a[2:5]
array([ 8, 27, 64])

>>> a[0:5] = -1000

>>> a
array([-1000, -1000, -1000, -1000, -1000, 125, 216,
343, 512, 729])
Iteration
>>> for i in a:
...     print(i**(1/3.))

nan
nan
nan
nan
nan
5.0
6.0
7.0
8.0
9.0
Summary

Python combines simple syntax with rich features

Strings, lists, tuples, dictionaries

Numpy library implements arrays

Sklearn library implements many ML models

Deep learning interface to Tensorflow


Online resources

[Link] Python

[Link] NumPy

[Link] scikit-learn

[Link] TensorFlow
