Artificial Intelligence-Lab Manual
ARTIFICIAL INTELLIGENCE
Preface
This lab manual covers artificial intelligence (AI), one of the most advanced, complex, and
useful emerging technologies. In this lab, students will learn AI algorithms, from basic
libraries through machine learning and deep learning functions, using the Python
programming language.
Tools/ Technologies
Python Language
Anaconda (Jupyter Notebook)
PyCharm
Google Colab
BS (Software Engineering) 2024
TABLE OF CONTENTS
Preface
Tools/ Technologies
LAB 1: Introduction to Artificial Intelligence and Python and Installation of Python IDE
LAB 2: Python programming (Syntax, printing, data types and variables, conditional loops)
LAB 3: Python programming (loops, functions, classes)
LAB 4: Python programming (lists, tuples, strings, dictionaries)
LAB 5: Intelligent Agents
LAB 6: Graph Search: Uninformed search and Informed search
LAB 7: Introduction to NumPy, Pandas, Scikit-learn and Matplotlib Python Packages
LAB 8: Introduction to Machine Learning, Deep Learning and Deep Learning Frameworks (TensorFlow, Keras) in Python
LAB 9: Supervised Machine Learning: Classification with K-Nearest Neighbors (KNN)
LAB 10: Supervised Machine Learning: Regression with K-Nearest Neighbors
LAB 11: Supervised Machine Learning: Regression with Support Vector Machines and Decision Trees
LAB 12: Unsupervised Machine Learning: K-means clustering
LAB 13: Implementation of Neural Networks (NN) in Python
LAB 14: Evaluation Metrics to evaluate machine learning algorithms
LAB 15: Reinforcement Learning
LAB 16: Final Evaluation
Objectives
Understand the fundamental concepts of Artificial Intelligence.
Get acquainted with tools and libraries commonly used in AI.
Theoretical Description
Artificial Intelligence (AI) refers to the simulation of human intelligence in machines that are
designed to think and learn like humans. This includes a wide range of capabilities, such as
problem-solving, learning from experience, and understanding natural language. The scope of
AI is vast and encompasses various subfields, including machine learning, neural networks,
and robotics. AI has evolved significantly since its inception in the 1950s when Alan Turing
proposed the Turing Test to measure a machine's ability to exhibit intelligent behavior. From
the early days of symbolic AI and expert systems to the modern era of deep learning and neural
networks, AI has made tremendous strides. Today, AI applications are ubiquitous, ranging
from voice assistants like Siri and Alexa to advanced systems in healthcare, finance, and
autonomous vehicles.
AI can be categorized into two main types: Narrow AI and General AI. Narrow AI, also known
as Weak AI, is designed to perform specific tasks, such as image recognition or language
translation, and operates within a limited context. In contrast, General AI, or Strong AI, aims
to possess the cognitive abilities of a human, capable of understanding, learning, and applying
knowledge across a wide range of tasks. Additionally, AI encompasses various learning
paradigms, including supervised learning, where machines are trained on labeled data;
unsupervised learning, which involves finding patterns in unlabeled data; and reinforcement
learning, where agents learn to make decisions through trial-and-error interactions with their
environment. These distinctions and learning approaches are fundamental to the development
and application of AI technologies across different domains.
execute, and share Python code, making it an ideal tool for collaborative projects and accessing
powerful computational resources.
Lab Task:
Run a basic "hello world" program in any of the IDEs above.
Run a few basic Python programs to get familiar with the IDE.
Theoretical Description
Learning Python for a C++/C# programmer
Let us quickly compare the syntax of Python with that of C++/C#:
C++/C# Python
Math in Python
Calculations are simple with Python, and expression syntax is straightforward: the operators
+, -, * and / work as expected; parentheses () can be used for grouping.
# Python 3: Simple arithmetic
>>> 1 / 2
0.5
>>> 2 ** 3  # Exponent operator
8
>>> 17 / 3  # classic division returns a float
5.666666666666667
>>> 17 // 3  # floor division
5
>>> 23 % 3  # Modulus operator
2
Python Operators
Operator  Name            Example  Result
+         Addition        4+5      9
-         Subtraction     8-5      3
*         Multiplication  4*5      20
%         Modulus         19%3     1
**        Exponent        2**4     16
Comments in Python:
word1 = "Good"
word2 = "Morning"
word3 = "to you too!"
print (word1, word2)
sentence = word1 + " " + word2 + " " +word3
print (sentence)
Relational operators
Expression Function
!= not equal to
== is equal to
Boolean Logic:
Boolean logic is used to make more complicated conditions for if statements that rely on more
than one condition. Python's Boolean operators are and, or, and not. The and operator takes
two arguments and evaluates to True if, and only if, both of its arguments are True; otherwise
it evaluates to False. The or operator also takes two arguments. It evaluates to True if either
(or both) of its arguments is True, and to False only when both are False. Unlike the other
operators we've seen so far, not takes only one argument and inverts it. The result of
not True is False, and not False is True.
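The three operators can be tried out with a short sketch. The variable names (age, has_ticket) and their values are hypothetical and chosen only for illustration.

```python
# Hypothetical values, chosen only to illustrate the operators.
age = 20
has_ticket = False

# 'and' is True only when both operands are True.
can_enter = age >= 18 and has_ticket

# 'or' is True when at least one operand is True.
needs_check = age < 18 or not has_ticket

# 'not' inverts a single Boolean value.
is_minor = not (age >= 18)

print(can_enter, needs_check, is_minor)  # False True False
```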
Operator Precedence
Operator Description
() Parentheses
Conditional Statements
'if' - Statement
y = 1
if y == 1:
    print("y still equals 1, I was just checking")
'if - else' - Statement
a = 1
if a > 5:
    print("This shouldn't happen.")
else:
    # The else branch is cut off in the source; this message is an assumed placeholder.
    print("This should happen.")
Lab Task
TASK 1:
Write a program that first displays a simple cafe menu (see example below), asks the user to
enter the number of a choice, and either prints the appropriate action OR prints an error
message that their choice was not valid.
Example output:
1. Soup and salad
2. Pasta with meat sauce
3. Chef's special
Which number would you like to order? 2
One Pasta with meat sauce coming right up!
Another example output:
1. Soup and salad
2. Pasta with meat sauce
3. Chef's special
Which number would you like to order? 5
Sorry, that is not a valid choice.
TASK 2:
Once upon a time in Apple land, John had three apples, Mary had five apples, and Adam had
six apples. They were all very happy and lived for a long time. End of story.
Objectives
To learn and implement loops, functions and classes
Theoretical Description
The 'while' loop
a = 0
while a < 10:
    a = a + 1
    print(a)
Range function:
list(range(5))         # [0, 1, 2, 3, 4]
list(range(1, 5))      # [1, 2, 3, 4]
list(range(1, 10, 3))  # [1, 4, 7]
The 'for' loop
def add_sub(x, y):
    a = x + y
    b = x - y
    return a, b
result1, result2 = add_sub(5, 10)
print(result1, result2)
def multiplybytwo(x):
    return x * 2
a = multiplybytwo(70)
The computer would actually see this:
a = 140
Defining a Function
def function_name(parameter_1, parameter_2):
    {this is the code in the function}
    return {value (e.g. text or number) to return to the main program}
range() Function:
If you need to iterate over a sequence of numbers, the built-in function range() comes in
handy. It produces an arithmetic progression lazily, as a range object; wrap it in list() to
see the values:
>>> list(range(10))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
It is possible to let the range start at another number, or to specify a different increment (even
negative; sometimes this is called the ‘step’):
>>> list(range(5, 10))
[5, 6, 7, 8, 9]
>>> list(range(0, 10, 3) )
[0, 3, 6, 9]
>>> list(range(-10, -100, -30) )
[-10, -40, -70]
The range() function is especially useful in loops.
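As an illustration of range() inside a loop, here is a minimal sketch; the variable names (total, countdown) are chosen for this example only.

```python
# Sum the numbers 1 through 5 with a for loop over range().
total = 0
for i in range(1, 6):  # range(1, 6) yields 1, 2, 3, 4, 5
    total = total + i

# Count down from 10 using a negative step of -3.
countdown = []
for n in range(10, 0, -3):
    countdown.append(n)

print(total)      # 15
print(countdown)  # [10, 7, 4, 1]
```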
Classes & Inheritance
The word 'class' can be used when describing the code where the class is defined.
A variable inside a class is known as an Attribute
A function inside a class is known as a method
• A class is like a
– Prototype
– Blue-print
– An object creator
• A class defines potential objects
– What their structure will be
– What they will be able to do
• Objects are instances of a class
– An object is a container of data: attributes
– An object has associated functions: methods
Syntax:
# Defining a class
class class_name:
    [statement 1]
    [statement 2]
    [statement 3]
    [etc.]
Inheritance Syntax:
class child_class(parent_class):
    def __init__(self, x):
        # this overrides the __init__ method of the parent class
    # additional methods can be defined here
'self' parameter:
The first argument of every instance method, including __init__, is always a reference to the
current instance of the class. By convention, this argument is always named self. In the
__init__ method, self refers to the newly created object; in other methods, it refers to
the instance whose method was called.
Whenever you create an object of the class, Python automatically calls the __init__ method
before anything else and runs whatever is written in it. You don't have to call it explicitly
like any other function.
Example1:
class MyClass:
    i = 12345
    def f(self):
        return 'hello world'
x = MyClass()
print(x.i)
print(x.f())
Example2:
class Complex:
    def __init__(self, realpart, imagpart):
        self.r = realpart
        self.i = imagpart
x = Complex(3.0, -4.5)
print(x.r, " ", x.i)
Example3:
class Shape:
    def __init__(self, x, y):  # The __init__ function always runs first
        self.x = x
        self.y = y
    description = "This shape has not been described yet"
    author = "Nobody has claimed to make this shape yet"
    def area(self):
        return self.x * self.y
    def perimeter(self):
        return 2 * self.x + 2 * self.y
    def describe(self, text):
        self.description = text
    def authorName(self, text):
        self.author = text
    def scaleSize(self, scale):
        self.x = self.x * scale
        self.y = self.y * scale
a = Shape(3, 4)
print(a.area())
Inheritance Example:
class Square(Shape):
    def __init__(self, x):
        self.x = x
        self.y = x
class DoubleSquare(Square):
    def __init__(self, y):
        self.x = 2 * y
        self.y = y
    def perimeter(self):
        return 2 * self.x + 2 * self.y
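To see what a child class inherits, the sketch below uses a condensed copy of Shape (only the parts needed here) together with Square. The variable names s, square_area and square_perimeter are chosen for this illustration.

```python
class Shape:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def area(self):
        return self.x * self.y

    def perimeter(self):
        return 2 * self.x + 2 * self.y


class Square(Shape):
    # Overrides __init__ so that a single side length sets both dimensions.
    def __init__(self, x):
        self.x = x
        self.y = x


s = Square(5)
# area() and perimeter() are not defined in Square; they are inherited from Shape.
square_area = s.area()
square_perimeter = s.perimeter()
print(square_area, square_perimeter)  # 25 20
print(isinstance(s, Shape))           # True
```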
Module
A module is a Python file that (generally) has only definitions of variables, functions, and
classes.
Example: Module name mymodule.py
# Define some variables:
ageofqueen = 78
# define some functions
def printhello():
    print("hello")
# define a class
class Piano:
    def __init__(self):
        self.type = input("What type of piano?: ")
        self.height = input("What height (in feet)?: ")
        self.price = input("How much did it cost?: ")
        self.age = input("How old is it (in years)?: ")
    def printdetails(self):
        print("This piano is a/an " + self.height + " foot")
        print(self.type, "piano, " + self.age, "years old and costing " + self.price)
Lab Task
TASK 1
Write a program to find the largest of ten numbers provided by user, using functions.
TASK 2
Create a class named basic_calc with the following attributes and methods:
Two integers (values are passed at instance creation)
Methods for addition, subtraction, division, and multiplication
Theoretical Description
Strings
Indexes of String
Characters in a string are numbered with indexes starting at 0:
Example:
name = "J. Smith"
Index      0    1    2    3    4    5    6    7
Character  J    .    ' '  S    m    i    t    h
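The index table can be checked directly in Python. This is a minimal sketch; the variable names first, space, last and surname are chosen for illustration.

```python
name = "J. Smith"

first = name[0]     # 'J'
space = name[2]     # ' ' (the space sits at index 2)
last = name[-1]     # 'h' (negative indexes count from the end)
surname = name[3:]  # 'Smith' (slice from index 3 to the end)

print(first, repr(space), last, surname)
print(len(name))    # 8
```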
String Properties
Most of our variables have one value in them. When we put a new value in the variable, the
old value is overwritten:
x = 2
x = 4
print(x)
4
A collection allows us to put many values in a single "variable". A collection is nice
because we can carry many values around in one convenient package.
Strings are "immutable": we cannot change the contents of a string, so we must make a new
string to make any change. Lists are "mutable": we can change an element of a list using the
index operator.
Lists are what they seem: a list of values. Each one of them is numbered, starting from zero.
You can remove values from the list, and add new values to the end. Example: your many cats'
names.
Python has a number of compound data types, used to group together other values. The most
versatile is the list, which can be written as a list of comma-separated values (items)
between square brackets. List items need not all have the same type.
>>> num = [1,2,3]
>>> names = ['Talal', 'Husnain', 'Saeed', 'Aezid']
>>> hybrid = [5,5.6,'text']
>>> combined = [num,names,hybrid]
>>> combined
[[1, 2, 3], ['Talal', 'Husnain', 'Saeed', 'Aezid'], [5, 5.6, 'text']]
cats = ['Tom', 'Snappy', 'Kitty', 'Jessie', 'Chester']
print(cats[2])
cats.append('Oscar')
print(len(cats))
# Remove the 2nd cat, Snappy.
del cats[1]
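The difference between immutable strings and mutable lists can be sketched as follows. The names word, new_word and pets are hypothetical and used only for this example.

```python
word = "cat"
try:
    word[0] = "b"          # strings do not support item assignment
except TypeError as err:
    string_error = str(err)

new_word = "b" + word[1:]  # build a new string instead

pets = ["cat", "dog"]
pets[0] = "bat"            # lists can be changed in place

print(string_error)
print(new_word, pets)
```

The original string is left untouched; only the list is modified in place.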
Compound datatype:
>>> a = ['spam', 'eggs', 100, 1234]
>>> a[:3]
['spam', 'eggs', 100]
>>> a[3:]
[1234]
>>> a[1:-1] #start at element at index 1, end before last element
['eggs', 100]
>>> a[:2] + ['bacon', 2*2]
['spam', 'eggs', 'bacon', 4]
>>> 3*a[:3] + ['Boo!']
['spam', 'eggs', 100, 'spam', 'eggs', 100, 'spam', 'eggs', 100, 'Boo!']
>>> a= ['spam', 'eggs', 100, 1234]
>>> a[2] = a[2] + 23
>>> a
['spam', 'eggs', 123, 1234]
Replace some items:
>>> a[0:2] = [1, 12]
>>> a
[1, 12, 123, 1234]
Remove some:
>>> a[0:2] = []
>>> a
[123, 1234]
Clear the list: replace all items with an empty list:
>>> a[:] = []
>>> a
[]
Length of list:
>>> a = ['a', 'b', 'c', 'd']
>>> len(a)
4
Nested lists
>>> q = [2, 3]
>>> p = [1, q, 4]
>>> len(p)
3
>>> p[1]
[2, 3]
del nums[3:]
This removes several values at once: every value from index 3 onward is deleted.
Functions of lists
list.append(x): Add an item to the end of the list; equivalent to a[len(a):] = [x].
list.extend(L): Extend the list by appending all the items in the given list; equivalent to
a[len(a):] = L.
list.insert(i, x): Insert an item at a given position. The first argument is the index of the
element before which to insert, so a.insert(0, x) inserts at the front of the list.
list.remove(x): Remove the first item from the list whose value is x. It is an error if there is
no such item. (Removal is by value, not by index.)
list.pop(i): Remove the item at the given position in the list, and return it. (Removal is by
index.) If no index is specified, a.pop() removes and returns the last item in the list, which
gives the behaviour of a stack (LIFO).
list.count(x): Return the number of times x appears in the list.
list.sort(): Sort the items of the list, in place.
list.reverse(): Reverse the elements of the list, in place.
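The methods listed above can be traced step by step in a short sketch. The list nums and its values are made up for illustration; each comment shows the list after the call.

```python
nums = [3, 1, 2]
nums.append(4)        # [3, 1, 2, 4]
nums.extend([1, 5])   # [3, 1, 2, 4, 1, 5]
nums.insert(0, 9)     # [9, 3, 1, 2, 4, 1, 5]
nums.remove(1)        # removes the first 1 -> [9, 3, 2, 4, 1, 5]
last = nums.pop()     # returns 5 -> [9, 3, 2, 4, 1]
second = nums.pop(1)  # returns 3 -> [9, 2, 4, 1]
ones = nums.count(1)  # 1
nums.sort()           # [1, 2, 4, 9]
nums.reverse()        # [9, 4, 2, 1]

print(nums, last, second, ones)
```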
Tuples (immutable)
Tuples are just like lists, but you can't change their values. Again, each value is numbered
starting from zero, for easy reference. Example: the names of the months of the year.
Square brackets are used for lists, whereas parentheses are used for tuples.
months = ('January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September',
'October', 'November', 'December')
Index  Value
0      January
1      February
2      March
3      April
4      May
5      June
6      July
7      August
8      September
9      October
10     November
11     December
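A brief sketch of reading from a tuple and of what happens on an attempted change. The tuple is shortened to three months for brevity, and the names first_month and changed are assumptions of this example.

```python
months = ('January', 'February', 'March')  # shortened for brevity

first_month = months[0]  # indexing works exactly as with lists

try:
    months[0] = 'Jan'    # tuples do not support item assignment
except TypeError:
    changed = False
else:
    changed = True

print(first_month, changed, len(months))  # January False 3
```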
Sets
A set is an unordered collection with no duplicate elements. Basic uses include membership
testing and eliminating duplicate entries. Set objects also support mathematical operations
like union, intersection, difference, and symmetric difference. Curly braces or the set()
function can be used to create sets. Note: to create an empty set you have to use set(), not {};
the latter creates an empty dictionary.
Example 1:
>>> basket = ['apple', 'orange', 'apple', 'pear', 'orange', 'banana']
>>> fruit = set(basket) # create a set without duplicates
>>> fruit
{'banana', 'orange', 'pear', 'apple' }
>>> 'orange' in fruit # fast membership testing
True
>>> 'crabgrass' in fruit
False
Example 2:
>>> # Demonstrate set operations on unique letters from two words
>>> a = set('abracadabra')
>>> b = set('alacazam')
>>> a # unique letters in a
{'a', 'r', 'b', 'c', 'd'}
>>> a - b # letters in a but not in b
{'r', 'd', 'b'}
>>> a | b # letters in either a or b
{'a', 'c', 'r', 'd', 'b', 'm', 'z', 'l'}
>>> a & b # letters in both a and b
{'a', 'c'}
>>> a ^ b # letters in a or b but not both
{'r', 'd', 'b', 'm', 'z', 'l'}
Set comprehensions are also supported:
>>> a = {x for x in 'abracadabra' if x not in 'abc'}
>>> a
{'r', 'd'}
Dictionaries
Dictionaries are similar to what their name suggests: a dictionary. In a dictionary, you have
an 'index' of words, and for each of them a definition. In Python, the word is called a 'key',
and the definition a 'value'. The values in a dictionary aren't numbered; the key is used to
look them up instead. Since Python 3.7, a dictionary preserves the order in which its keys
were inserted. You can add, remove, and modify the values in dictionaries. Example: telephone
book. The main operations on a dictionary are storing a value with some key and extracting the
value given the key. It is also possible to delete a key:value pair with del. If you store
using a key that is already in use, the old value associated with that key is forgotten. It is
an error to extract a value using a non-existent key. Performing list(d.keys()) on a dictionary
returns a list of all the keys used in the dictionary, in insertion order (if you want it
sorted, just use sorted(d.keys()) instead). To check whether a single key is in the dictionary,
use the in keyword. Only one value may be stored against a particular key at a time. If you
need to store more than one value for a particular key, store a list as the value for that key.
phonebook = {'ali':8806336, 'omer':6784346,'shoaib':7658344, 'saad':1122345}
# Add the person 'waqas' to the phonebook:
phonebook['waqas'] = 1234567
print("Original Phonebook")
print(phonebook)
# Remove the person 'shoaib' from the phonebook:
del phonebook['shoaib']
print("'shoaib' deleted from phonebook")
print(phonebook)
phonebook = {'Andrew Parson':8806336, \
'Emily Everett':6784346, 'Peter Power':7658344, \
'Louis Lane':1122345}
print("New phonebook")
print(phonebook)
#Add the person 'Gingerbread Man' to the phonebook:
phonebook['Gingerbread Man'] = 1234567
list(phonebook.keys())
sorted(phonebook.keys())
print( 'waqas' in phonebook)
print( 'Emily Everett' in phonebook)
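The remaining points from the description (checking for a key before looking it up, overwriting a value, and storing several values under one key) can be sketched as follows. The contacts dictionary and the replacement number are hypothetical.

```python
phonebook = {'ali': 8806336, 'omer': 6784346}

# Looking up a missing key with [] raises KeyError; 'in' checks first.
has_saad = 'saad' in phonebook

# dict.get returns a default instead of raising when the key is missing.
saad_number = phonebook.get('saad', None)

# Storing with an existing key overwrites the old value.
phonebook['ali'] = 1111111

# To keep several values under one key, store a list as the value.
contacts = {'ali': [8806336]}
contacts['ali'].append(1111111)

print(has_saad, saad_number, phonebook['ali'], contacts['ali'])
print(sorted(phonebook.keys()))
```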
Lab Task
TASK 1
Write a Python program to calculate the length of a string without using the built-in
len() function.
TASK 2
Write a program that creates a list of 10 random integers. Then create two lists named
odd_list and even_list that hold all the odd and even values of the list, respectively.
Objectives
To learn and implement intelligent agents using Python
Theoretical Description
Agents and Environment
An AI system is composed of an agent and its environment. The agents act in their
environment. The environment may contain other agents.
An agent is anything that can perceive its environment through sensors and act upon that
environment through effectors.
● A human agent has sensory organs such as eyes, ears, nose, tongue and skin that serve as
sensors, and other organs such as hands, legs, and mouth that serve as effectors.
● A robotic agent has cameras and infrared range finders for sensors, and various
motors and actuators for effectors.
● A software agent has encoded bit strings as its programs and actions.
Agent Terminology
Types of Agents
1. Table-driven agent
2. Reflex agent
3. Model-based reflex agent
4. Goal-based agent
5. Utility-based agent
6. Learning agent
Simple Reflex Agent
Simple reflex agents act only on the basis of the current percept, ignoring the rest of the
percept history. The agent function is based on the condition-action rule: if condition then
action. This agent function only succeeds when the environment is fully observable. Some
reflex agents can also contain information on their current state which allows them to
disregard conditions whose actuators are already triggered.
Pseudocode
Example: Vacuum cleaner
Consider the vacuum world. This particular world has just two locations: squares A and B.
The vacuum agent perceives which square it is in and whether there is dirt in the square. It
can choose to move left, move right, suck up the dirt, or do nothing. The agent function is
the following: if the current square is dirty, then suck; otherwise, move to the other square.
Write a model-based reflex agent for the vacuum cleaner. (Hint: the agent has knowledge of
the initial state.)
If the current square is dirty, then suck; otherwise, move to the other square.
The initial state is 1, where square A and square B are both dirty.
Pseudocode for the problem is as follows:
function Reflex-Vacuum-Agent([location, status]) returns an action
    static: last A, last B, numbers, initially ∞
    if status = Dirty then and so on
Code
class ModelBasedVacuumAgent():
    def __init__(self, init_a, init_b):
        self.model = {"Loc_a": init_a, "Loc_b": init_b}
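One possible way to build on such a model is sketched below. This is not the manual's own implementation: the class name VacuumAgentSketch, the act method, the "Dirty"/"Clean" status strings and the "NoOp" action are all assumptions made for illustration.

```python
class VacuumAgentSketch:
    """Model-based reflex agent for the two-square vacuum world (illustrative)."""

    def __init__(self, init_a, init_b):
        # Internal model: the agent's belief about each square's status.
        self.model = {"Loc_a": init_a, "Loc_b": init_b}

    def act(self, location, status):
        # Update the model with the current percept.
        self.model[location] = status
        if status == "Dirty":
            # After sucking, the agent believes this square is clean.
            self.model[location] = "Clean"
            return "Suck"
        if self.model["Loc_a"] == "Clean" and self.model["Loc_b"] == "Clean":
            return "NoOp"  # nothing left to clean according to the model
        return "Right" if location == "Loc_a" else "Left"


agent = VacuumAgentSketch("Dirty", "Dirty")
actions = [
    agent.act("Loc_a", "Dirty"),
    agent.act("Loc_a", "Clean"),
    agent.act("Loc_b", "Dirty"),
    agent.act("Loc_b", "Clean"),
]
print(actions)  # ['Suck', 'Right', 'Suck', 'NoOp']
```

Unlike a simple reflex agent, this sketch can stop moving once its model says both squares are clean.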
Lab Task
This particular world has just two locations: squares A and B. The vacuum agent perceives
which square it is in and whether there is dirt in the square. It can choose to move left, move
right, suck up the dirt, or do nothing. One very simple agent function is the following: if the
current square is dirty, then suck; otherwise, move to the other square.
Write a simple reflex agent for the vacuum cleaner. (Hint: the agent has no knowledge of the
initial state.) If the current square is dirty, then suck; otherwise, move to the other square.
Pseudocode for the task is as follows:
function Reflex-Vacuum-Agent([location, status]) returns an action
    if status = Dirty then return Suck
    else if location = A then return Right
    else if location = B then return Left
Objectives
To learn and implement the depth-first search algorithm
Theoretical Description
Depth-first search
Depth-first search (DFS) is an algorithm for traversing or searching tree or graph data
structures. One starts at the root (selecting some arbitrary node as the root in the case of a
graph) and explores as far as possible along each branch before backtracking.
Pseudocode:
Input: A graph G and a vertex v of G
Output: All vertices reachable from v labelled as discovered
A recursive implementation of DFS:
procedure DFS(G,v):
    label v as discovered
    for all edges from v to w in G.adjacentEdges(v) do
        if vertex w is not labeled as discovered then
            recursively call DFS(G,w)
A non-recursive implementation of DFS:
procedure DFS-iterative(G,v):
    let S be a stack
    S.push(v)
    while S is not empty
        v = S.pop()
        if v is not labeled as discovered:
            label v as discovered
            for all edges from v to w in G.adjacentEdges(v) do
                S.push(w)
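Both procedures can be sketched in Python as below. The graph is assumed to be a dictionary mapping each vertex to a list of its neighbours; that representation, the function names and the sample graph are choices made for this example. A list is used for the discovered vertices so that the visiting order is kept; a set would make the membership test faster.

```python
def dfs_recursive(graph, v, discovered=None):
    """Return the vertices reachable from v in recursive DFS order."""
    if discovered is None:
        discovered = []
    discovered.append(v)  # label v as discovered
    for w in graph[v]:
        if w not in discovered:
            dfs_recursive(graph, w, discovered)
    return discovered


def dfs_iterative(graph, start):
    """Return the vertices reachable from start using an explicit stack."""
    discovered = []
    stack = [start]
    while stack:
        v = stack.pop()
        if v not in discovered:
            discovered.append(v)
            for w in graph[v]:
                stack.append(w)
    return discovered


# Hypothetical sample graph.
graph = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D"],
    "D": [],
}
print(dfs_recursive(graph, "A"))  # ['A', 'B', 'D', 'C']
print(dfs_iterative(graph, "A"))  # ['A', 'C', 'D', 'B']
```

The two versions visit the same vertices but in a different order, because the stack pops the most recently pushed neighbour first.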
Iterative Deepening Depth-first search
Iterative deepening depth-first search (IDDFS) is a state space/graph search strategy in
which a depth-limited version of depth-first search is run repeatedly with increasing depth
limits until the goal is found. Like breadth-first search, IDDFS finds a shallowest goal, but
it uses much less memory; on each iteration, it visits the nodes in the search tree in the same
order as depth-first search, but the cumulative order in which nodes are first visited is
effectively breadth-first.
Pseudocode
The traversal below is the stack-based depth-first search; IDDFS repeats it with a depth limit
that grows on each iteration.
Set all nodes to "not visited";
s = new Stack();              // use a stack
s.push(initial node);         // push() stores a value on the stack
while ( s ≠ empty ) do
{
    x = s.pop();              // pop() removes a value from the stack
    if ( x has not been visited )
    {
        visited[x] = true;    // visit node x
        for ( every edge (x, y) /* we are using all edges */ )
            if ( y has not been visited )
                s.push(y);    // use push()
    }
}
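The depth limit itself can be sketched as follows. This is one possible formulation, not the manual's own code: the function names, the max_depth parameter and the sample graph are assumptions, and the graph is a dictionary mapping each vertex to a list of its neighbours.

```python
def depth_limited_search(graph, node, goal, limit, path):
    """Return a path to goal if it lies within `limit` edges of node, else None."""
    if node == goal:
        return path
    if limit == 0:
        return None
    for neighbour in graph[node]:
        if neighbour not in path:  # avoid cycles along the current path
            found = depth_limited_search(
                graph, neighbour, goal, limit - 1, path + [neighbour]
            )
            if found is not None:
                return found
    return None


def iddfs(graph, start, goal, max_depth):
    """Run depth-limited search with limits 0, 1, ..., max_depth."""
    for limit in range(max_depth + 1):
        found = depth_limited_search(graph, start, goal, limit, [start])
        if found is not None:
            return found
    return None


# Hypothetical sample graph with a long route and a short route from A to F.
graph = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["F"],
    "D": ["E"],
    "E": ["F"],
    "F": [],
}
path = iddfs(graph, "A", "F", max_depth=5)
print(path)  # ['A', 'C', 'F']
```

A plain depth-first search on this graph would reach F through B, D and E first; the growing limit is what makes the shorter route through C come out.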
Breadth-first search
Breadth-first search (BFS) is an algorithm for traversing or searching tree or graph data
structures. It visits and marks the nodes of a graph level by level. The algorithm selects a
single node (the initial or source node) in a graph and then visits all the nodes adjacent to
it, one by one. Once the algorithm has visited and marked the starting node, it moves on to
the nearest unvisited nodes and processes them. Every node is marked once it has been visited.
These iterations continue until all the nodes of the graph reachable from the source have been
visited and marked.
Graph traversals
A graph traversal is a commonly used method for locating a vertex in a graph. It visits the
vertices in a systematic order and records the sequence of visited vertices. Marking vertices
as visited lets you visit each node in a graph without getting stuck in an infinite loop.
How BFS works:
1. Graph traversal requires the algorithm to visit, check, and/or update every unvisited node
in the graph. Graph traversals are categorized by the order in which they visit the nodes.
2. The BFS algorithm starts from the first (starting) node of the graph and processes it. Once
the starting node has been processed, the next unvisited vertex in the graph is visited and
marked.
3. As a result, all the nodes adjacent to the current vertex are visited in the same
iteration. A simple queue is used to implement a BFS algorithm.
Pseudocode
Set all nodes to "not visited";
q = new Queue();
q.enqueue(initial node);
while ( q ≠ empty ) do
{
    x = q.dequeue();
    if ( x has not been visited )
    {
        visited[x] = true;    // visit node x
        for ( every edge (x, y) /* we are using all edges */ )
            if ( y has not been visited )
                q.enqueue(y); // use the edge (x, y)
    }
}
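The queue-based pseudocode translates almost line for line into Python. In this sketch the graph is assumed to be a dictionary mapping each vertex to a list of its neighbours; the function name bfs_order and the sample graph are made up for illustration.

```python
from collections import deque


def bfs_order(graph, start):
    """Return the vertices reachable from start in breadth-first order."""
    visited = []
    queue = deque([start])
    while queue:
        x = queue.popleft()  # dequeue the oldest entry
        if x not in visited:
            visited.append(x)  # visit node x
            for y in graph[x]:
                if y not in visited:
                    queue.append(y)  # enqueue each unvisited neighbour
    return visited


# Hypothetical sample graph.
graph = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D"],
    "D": [],
}
order = bfs_order(graph, "A")
print(order)  # ['A', 'B', 'C', 'D']
```

collections.deque is used because popleft() removes from the front in constant time, whereas list.pop(0) shifts every remaining element.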
Lab Task
Task 1
Use BFS to find the shortest path between A and F. (Hint: the distance between any two
consecutive vertices is 1, i.e. the distance between A and D is 2: (A to B = 1) + (B to D = 1) = 2.)
Task 2
Using DFS, check whether a path exists between any two nodes, and return the path(s).
E.g., if the user enters two vertices, 2 and 1, the program should return: Yes, paths exist,
which are [2,1] and [2,0,1].
Objectives
To learn about the Python libraries most widely used in machine learning
Theoretical Description
Different Python Packages
NUMPY
NumPy is the cornerstone toolbox for scientific computing with Python. NumPy provides,
among other things, support for multidimensional arrays with basic operations on them and
useful linear algebra functions. Many toolboxes use the NumPy array representations as an
efficient basic data structure.
Examples
# importing the numpy package
import numpy as np
b = np.array([[[1,2,3,5],[2,3,4,4]],[[1,2,3,5],[2,3,4,4]]])
# printing the data type
print(b.dtype)
Out: int32 (platform dependent; int64 on most Linux and macOS installations)
# printing the number of dimensions of the NumPy array
print(b.ndim)
Out: 3
# printing the shape of the NumPy array
print(b.shape)
Out: (2, 2, 4)
# printing the size of the NumPy array, i.e. the total number of elements
print(b.size)
Out: 16
# generating an array of numbers from 10 up to (but not including) 100 in steps of 2
c = np.arange(10, 100, 2)
print(c)
of Pandas is a fast and efficient DataFrame object for data manipulation with integrated
indexing. The DataFrame structure can be seen as a spreadsheet which offers very flexible
ways of working with it. You can easily transform any dataset in the way you want, by
reshaping it and adding or removing columns or rows. It also provides high-performance
functions for aggregating, merging, and joining datasets. Pandas also has tools for importing
and exporting data from different formats: comma-separated value (CSV), text files,
Microsoft Excel, SQL databases, and the fast HDF5 format. In many situations, the data you
have in such formats will not be complete or totally structured. For such cases, Pandas offers
handling of missing data and intelligent data alignment. Furthermore, Pandas provides a
convenient Matplotlib interface.
Examples
# importing the pandas library
import pandas as pd
# creating a Pandas Series
a = pd.Series([1,2,3,4], index=['a','b','c','d'])
print(a)
marks = {"A":10, "B":20, "C":30}
print(marks)
grades = {"A":2, "B":3, "C":5}
# converting dictionaries to Pandas Series
pd1 = pd.Series(marks)
print(pd1)
pd2 = pd.Series(grades)
# Pandas DataFrame
pd3 = pd.DataFrame({"marks":pd1, "grades":pd2})
print(pd3)
# adding a new column to the Pandas DataFrame
pd3["percentage"] = pd3["marks"]/100
print(pd3)
# deleting a column from the Pandas DataFrame
del pd3["percentage"]
# filtering rows with a condition (thresholding)
print(pd3[pd3['marks']>10])
print(pd3)
Matplotlib
Matplotlib produces publication-quality figures in a variety of hardcopy formats and
interactive environments across platforms. Matplotlib can be used in Python scripts, the
Python and IPython shell, web application servers, and various graphical user interface
toolkits. matplotlib.pyplot is a collection of functions that make Matplotlib work like
MATLAB. The majority of plotting commands in pyplot have MATLAB analogs with similar arguments.
Example
# importing matplotlib.pyplot (and NumPy, which generates the data)
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0, 10, 1000)
# continuous plotting
plt.plot(x, np.sin(x), color="red")
# discrete plotting
plt.scatter(x[0:20], np.sin(x[0:20]), color="red")
plt.xlabel("x")
plt.ylabel("y")
plt.title("sine")
plt.show()
Lab Task
TASK 1
Write a NumPy program to create a random 10x4 array and extract the first five rows of the
array and store them into a variable.
TASK 2
Write a Pandas program to select the rows where the number of attempts in the examination
is greater than 2.
TASK 3
From the sample data given in TASK 2; write a program to calculate the average of the
scores. The program should be able to ignore NaN values.
Objectives
To understand machine learning, deep learning, and deep learning frameworks in Python
Theoretical Description
Our imaginations have long been captivated by visions of machines that can learn and imitate
human intelligence. Software programs that can acquire new knowledge and skills through
experience are becoming increasingly common. We use such machine learning programs to
discover new music that we might enjoy, and to find exactly the shoes we want to purchase
online. Machine learning programs allow us to dictate commands to our smart phones, and
allow our thermostats to set their own temperatures. Machine learning programs can decipher
sloppily-written mailing addresses better than humans, and can guard credit cards from fraud
more vigilantly. From investigating new medicines to estimating the page views for versions
of a headline, machine learning software is becoming central to many industries. Machine
learning has even encroached on activities that have long been considered uniquely human,
such as writing the sports column recapping the Duke basketball team's loss to UNC.
Learning from experience
Machine learning systems are often described as learning from experience either with or without
supervision from humans. In supervised learning problems, a program predicts an output for
an input by learning from pairs of labeled inputs and outputs. That is, the program learns
from examples of the "right answers". In unsupervised learning, a program does not learn
from labeled data. Instead, it attempts to discover patterns in data. For example, assume that
you have collected data describing the heights and weights of people. An example of an
unsupervised learning problem is dividing the data points into groups. A program might
produce groups that correspond to men and women, or children and adults. Now assume that
the data is also labeled with the person's sex. An example of a supervised learning problem is
to induce a rule for predicting whether a person is male or female based on his or her height
and weight. We will discuss algorithms and examples of supervised and unsupervised
Deep Learning
Deep learning is a specific subset of Machine Learning, which is a specific subset of
Artificial Intelligence. For individual definitions:
● Artificial Intelligence is the broad mandate of creating machines that can think intelligently
● Machine Learning is one way of doing that, by using algorithms to glean insights from data
● Deep Learning is one way of doing Machine Learning, using a specific algorithm called a
Neural Network
Don’t get lost in the taxonomy – Deep Learning is just a type of algorithm that seems to work
really well for predicting things. Deep Learning and Neural Nets, for most purposes, are
effectively synonymous. If people try to confuse you and argue about technical definitions,
don’t worry about it: like Neural Nets, labels can have many layers of meaning.
Neural networks are inspired by the structure of the cerebral cortex. At the basic level is the
perceptron, the mathematical representation of a biological neuron. Like in the cerebral
cortex, there can be several layers of interconnected perceptrons. Input values, or in other
words our underlying data, get passed through this “network” of hidden layers until they
eventually converge to the output layer. The output layer is our prediction: it might be one
node if the model just outputs a number, or a few nodes if it’s a multiclass classification
problem. The hidden layers of a Neural Net perform modifications on the data to eventually
feel out what its relationship with the target variable is. Each connection into a node has a
weight: the node multiplies each input value by its weight, sums the results, and applies an
activation function. Do that over a few different layers, and the Net is able to essentially
manipulate the data into something meaningful.
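The flow of data through hidden layers described above can be illustrated with a minimal NumPy sketch. The layer sizes, the random weights, and the ReLU activation are assumptions chosen only for illustration; a real network would learn its weights from data.

```python
import numpy as np

def relu(z):
    # Activation function (an assumption here): keeps positive values, zeroes the rest
    return np.maximum(0.0, z)

rng = np.random.default_rng(0)

x = np.array([0.5, -1.2, 3.0])        # one input example with 3 features
W_hidden = rng.normal(size=(3, 4))    # weights from 3 inputs to 4 hidden nodes
b_hidden = np.zeros(4)
W_out = rng.normal(size=(4, 1))       # weights from 4 hidden nodes to 1 output node
b_out = np.zeros(1)

# Each hidden node multiplies its inputs by its weights, sums them, and applies the activation
hidden = relu(x @ W_hidden + b_hidden)
# The output layer has a single node because the model outputs one number
prediction = hidden @ W_out + b_out

print("hidden activations:", hidden)
print("prediction:", prediction)
```

With untrained random weights the prediction is meaningless; the sketch only shows how values move from the input, through a hidden layer, to the output layer.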
What is TensorFlow?
“TensorFlow is an open-source machine learning library for research and production.
TensorFlow offers APIs for beginners and experts to develop for desktop, mobile, web, and
cloud.” - TensorFlow Website
What is Keras?
“Keras is a high-level neural networks API, written in Python and capable of running on top
of TensorFlow, CNTK, or Theano. It was developed with a focus on enabling fast
experimentation. Being able to go from idea to result with the least possible delay is key to
Lab Task
1. What is the difference between supervised and unsupervised machine learning?
2. What is the difference between a classification problem and a regression problem?
3. What is the difference between machine learning and deep learning?
4. What is the difference between the Keras and TensorFlow frameworks?
Objectives
To learn and implement the K-Nearest Neighbors machine learning classifier.
Theoretical Description
K-Nearest Neighbors (KNN)
KNN is a simple model for regression and classification tasks. It is so simple that its name
describes most of its learning algorithm. The titular neighbors are representations of training
instances in a metric space. A metric space is a feature space in which the distances between
all members of a set are defined. In the previous chapter's pizza problem, our training
instances were represented in a metric space because the distances between all the pizza
diameters were defined. These neighbors are used to estimate the value of the response
variable for a test instance. The hyperparameter k specifies how many neighbors can be used
in the estimation. A hyperparameter is a parameter that controls how the algorithm learns;
hyperparameters are not estimated from the training data and are sometimes set manually.
Finally, the k neighbors that are selected are those that are nearest to the test instance, as
measured by some distance function.
For classification tasks, the training set comprises tuples of feature vectors and class
labels. KNN is capable of binary, multi-class, and multi-label classification; we will
define these tasks later, and we will focus on binary classification in this chapter. The
simplest KNN classifiers use the mode of the KNN labels to classify test instances, but other
strategies can be used. The k is often set to an odd number to prevent ties.
In regression tasks, the feature vectors are each associated with a response variable that takes
a real-valued scalar instead of a label. The prediction is the mean or weighted mean of the
KNN response variables.
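The classification rule described above (take the k training instances nearest to the test instance and use the mode of their labels) can be sketched from scratch as follows. The function name knn_predict, the toy points, and the use of Euclidean distance are assumptions made for illustration only.

```python
from collections import Counter

import numpy as np

def knn_predict(X_train, y_train, x_query, k=3):
    """Classify x_query by majority vote among its k nearest training instances."""
    # Euclidean distance from the query to every training instance
    distances = np.linalg.norm(X_train - x_query, axis=1)
    # Indices of the k smallest distances
    nearest = np.argsort(distances)[:k]
    # The mode of the neighbors' labels is the prediction
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical two-class training set with two features per instance
X_train = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                    [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]])
y_train = np.array(["a", "a", "a", "b", "b", "b"])

print(knn_predict(X_train, y_train, np.array([1.1, 1.0])))
print(knn_predict(X_train, y_train, np.array([5.1, 5.0])))
```

Using an odd k, as here, avoids tied votes in a two-class problem.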
Implementation:
Dataset
The data set contains 3 classes of 50 instances each, where each class refers to a type of iris
plant.
Attribute Information:
1. sepal length in cm
2. sepal width in cm
3. petal length in cm
4. petal width in cm
5. class:
-- Iris Setosa
-- Iris Versicolour
-- Iris Virginica
Code:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
# Load the iris data (adjust the path to where the CSV file is stored)
data=pd.read_csv(r'F:\path\iris_data_2.csv')
print(data.head())
# Features are every column except the label column 'species'
x=data.drop(columns='species')
y=data['species']
# Hold out 20% of the rows for testing
x_train,x_test,y_train,y_test=train_test_split(x,y,test_size=0.20)
# Fit the scaler on the training data only, then scale both sets
scaler=StandardScaler()
scaler.fit(x_train)
x_train=scaler.transform(x_train)
x_test=scaler.transform(x_test)
print(x_train)
print(x_test)
# Classify each test instance by the majority label of its 3 nearest neighbors
classifier=KNeighborsClassifier(n_neighbors=3)
classifier.fit(x_train,y_train)
result=classifier.predict(x_test)
print(result)
Lab Task
Implement support vector machine for the same classification problem.
Objectives
To learn and implement the K-Nearest Neighbors regressor.
Theoretical Description
K-Nearest Neighbors (KNN)
KNN is a simple model for regression and classification tasks. It is so simple that its name
describes most of its learning algorithm. The titular neighbors are representations of training
instances in a metric space. A metric space is a feature space in which the distances between
all members of a set are defined. In the previous chapter's pizza problem, our training
instances were represented in a metric space because the distances between all the pizza
diameters were defined. These neighbors are used to estimate the value of the response
variable for a test instance. The hyperparameter k specifies how many neighbors can be used
in the estimation. A hyperparameter is a parameter that controls how the algorithm learns;
hyperparameters are not estimated from the training data and are sometimes set manually.
Finally, the k neighbors that are selected are those that are nearest to the test instance, as
measured by some distance function.
For classification tasks, the training set comprises tuples of feature vectors and class
labels. KNN is capable of binary, multi-class, and multi-label classification; we will
define these tasks later, and we will focus on binary classification in this chapter. The
simplest KNN classifiers use the mode of the KNN labels to classify test instances, but other
strategies can be used. The k is often set to an odd number to prevent ties. In regression tasks,
the feature vectors are each associated with a response variable that takes a real-valued scalar
instead of a label. The prediction is the mean or weighted mean of the KNN response
variables.
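As a small illustration of the regression rule above, the sketch below uses scikit-learn's KNeighborsRegressor to predict with the plain mean and with a distance-weighted mean of the neighbors' responses. The four training points and the query value are made up for the example.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Hypothetical one-feature training set: the response equals the feature value
X = np.array([[1.0], [2.0], [3.0], [10.0]])
y = np.array([1.0, 2.0, 3.0, 10.0])
query = np.array([[2.1]])

# weights="uniform": prediction is the plain mean of the k nearest responses
uniform_knn = KNeighborsRegressor(n_neighbors=3, weights="uniform").fit(X, y)
uniform_pred = uniform_knn.predict(query)[0]

# weights="distance": closer neighbors contribute more to the mean
weighted_knn = KNeighborsRegressor(n_neighbors=3, weights="distance").fit(X, y)
weighted_pred = weighted_knn.predict(query)[0]

print("mean of 3 nearest responses:", uniform_pred)
print("distance-weighted mean:", weighted_pred)
```

The three nearest neighbors of 2.1 are the points at 1, 2 and 3, so the uniform prediction is their mean, while the weighted prediction is pulled toward the closest point at 2.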
Lab Task
Implement linear regression and Decision Tree for the same regression problem.
Theoretical Description
Support Vector Machines (SVM)
Support Vector Machines are supervised learning models used for both classification and
regression tasks. The main idea of SVM for regression (SVR) is to find a function that
deviates from the actual observed responses by a value no greater than epsilon (ε) for each
training point, and at the same time, is as flat as possible. This can be formulated as an
optimization problem, where the goal is to minimize the model complexity while penalizing
deviations larger than epsilon using a slack variable. The result is a robust model that can
generalize well to unseen data.
Key concepts in SVM for regression include:
Kernel Trick: This allows SVM to fit the model in higher-dimensional space without
explicitly transforming the data into that space.
Hyperparameters: Regularization parameter (C) and the kernel parameters (like
gamma for the RBF kernel) control the trade-off between model complexity and
training error.
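A minimal sketch of support vector regression with scikit-learn's SVR is given below to make the roles of the kernel, C, gamma, and epsilon concrete. The noisy sine data and the hyperparameter values are assumptions chosen for illustration, not tuned settings.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

# Synthetic data (an assumption): a sine curve with a little Gaussian noise
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 5, size=(200, 1)), axis=0)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# kernel="rbf" applies the kernel trick; C penalizes deviations larger than epsilon;
# gamma controls how local the RBF kernel is
svr = SVR(kernel="rbf", C=10.0, gamma="scale", epsilon=0.1)
svr.fit(X_train, y_train)

y_pred_svr = svr.predict(X_test)
mse_svr = mean_squared_error(y_test, y_pred_svr)
print(f"Mean Squared Error for SVR: {mse_svr}")
```

Increasing C or gamma lets the model follow the training points more closely, which lowers training error but can hurt generalization.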
Decision Tree
Decision Trees are non-parametric models used for regression and classification tasks. In the
context of regression, a Decision Tree splits the data into subsets based on feature values that
result in the most homogeneous sets, measured by the variance reduction criterion. Each split
is made to minimize the variance of the response variable in the resulting subsets. The final
prediction is the mean value of the response variable in the corresponding terminal node of
the tree.
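The variance reduction criterion can be checked by hand on a tiny example. The sketch below compares two candidate splits of the same data; the helper name variance_reduction and the six data points are hypothetical.

```python
import numpy as np

def variance_reduction(y, left_mask):
    """Parent variance minus the size-weighted variance of the two child subsets."""
    y_left, y_right = y[left_mask], y[~left_mask]
    weighted_child_var = (
        len(y_left) * y_left.var() + len(y_right) * y_right.var()
    ) / len(y)
    return y.var() - weighted_child_var

# Hypothetical data: low responses for small x, high responses for large x
x = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0])
y = np.array([1.0, 1.2, 0.8, 5.0, 5.2, 4.8])

good_split = variance_reduction(y, x < 5.0)   # separates the two groups cleanly
poor_split = variance_reduction(y, x < 2.5)   # mixes low and high responses on one side

print("reduction for x < 5.0:", good_split)
print("reduction for x < 2.5:", poor_split)
```

A tree-growing algorithm would prefer the split at 5.0 because it leaves the most homogeneous subsets, and each leaf would then predict the mean response of its subset.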
Key concepts in Decision Tree regression include:
Tree Depth: Controls the complexity of the model; deeper trees can capture more
detail but are prone to overfitting.
plt.xlabel('data')
plt.ylabel('target')
plt.title('Support Vector Regression')
plt.legend()
plt.show()
Task 2: Implementing Decision Tree for Regression
import matplotlib.pyplot as plt
from sklearn.metrics import mean_squared_error
from sklearn.tree import DecisionTreeRegressor
# X, y, X_train, X_test, y_train and y_test are assumed to be defined by the earlier data preparation
# Train Decision Tree model (max_depth limits the tree depth to reduce overfitting)
tree_reg = DecisionTreeRegressor(max_depth=5)
tree_reg.fit(X_train, y_train)
# Predict and evaluate
y_pred_tree = tree_reg.predict(X_test)
mse_tree = mean_squared_error(y_test, y_pred_tree)
print(f"Mean Squared Error for Decision Tree: {mse_tree}")
# Plotting
plt.scatter(X, y, color='darkorange', label='data')
plt.plot(X_test, y_pred_tree, color='green', lw=2, label='Decision Tree model')
plt.xlabel('data')
plt.ylabel('target')
plt.title('Decision Tree Regression')
plt.legend()
plt.show()
Task 3: Comparing Performance
Compare Mean Squared Errors: Evaluate the performance of each model using the mean
squared error.
Visualize Predictions: Plot the predictions of both models and compare how well they fit the
data.
Discuss Overfitting: Analyze if any of the models show signs of overfitting and suggest
methods to mitigate it (e.g., pruning for Decision Trees or adjusting hyperparameters for
SVM).
Objectives
To learn and implement the K-means clustering machine learning technique.
Theoretical Description
Unsupervised machine learning
In machine learning, the problem of unsupervised learning is that of trying to find hidden
structure in unlabeled data. Since the examples given to the learner are unlabeled, there is no
error or reward signal to evaluate the goodness of a potential solution. This distinguishes
unsupervised from supervised learning. Unsupervised learning is defined as the task
performed by algorithms that learn from a training set of unlabeled or unannotated examples,
using the features of the inputs to categorize them according to some geometric or statistical
criteria. Unsupervised learning encompasses many techniques that seek to summarize and
explain key features or structures of the data. Many methods employed in unsupervised
learning are based on data mining methods used to preprocess data. Most unsupervised
learning techniques can be summarized as those that tackle the following four groups of
problems:
• Clustering: has as a goal to partition the set of examples into groups.
• Dimensionality reduction: aims to reduce the dimensionality of the data. Here, we encounter
techniques such as Principal Component Analysis (PCA), independent component analysis, and
nonnegative matrix factorization.
• Outlier detection: has as a purpose to find unusual events (e.g., a malfunction), that
distinguish part of the data from the rest according to certain criteria.
• Novelty detection: deals with cases when changes occur in the data (e.g., in streaming
data).
The most common unsupervised task is clustering, which we focus on in this Lab.
Clustering
Clustering is a process of grouping similar objects together; i.e., to partition unlabeled
examples into
Lab Task
Use sepal_length and sepal_width as features from the iris_dataset and apply K-means
clustering.
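As a possible starting point for the task, the sketch below applies scikit-learn's KMeans to synthetic two-feature data rather than to the iris data. The two Gaussian groups, the choice of two clusters, and the random seeds are assumptions for illustration; for the task itself, the feature matrix would hold the two sepal columns instead.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic, unlabeled 2-D points drawn around two well-separated centers (an assumption)
rng = np.random.default_rng(0)
group_a = rng.normal(loc=[0.0, 0.0], scale=0.3, size=(50, 2))
group_b = rng.normal(loc=[5.0, 5.0], scale=0.3, size=(50, 2))
points = np.vstack([group_a, group_b])

# Partition the points into 2 clusters; n_init repeats the run from different starting centers
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(points)

print("cluster centers:")
print(kmeans.cluster_centers_)
print("first ten labels:", labels[:10])
```

The cluster numbers themselves (0 or 1) are arbitrary; only the grouping of the points carries meaning.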
Objectives
To learn and implement ANN in Python.
Theoretical Description
Artificial Neural Networks
An Artificial Neural Network (ANN) is a computational model and architecture that
simulates biological neurons and the way they function in our brain. Typically, an ANN has
layers of interconnected nodes. The nodes and their inter-connections are analogous to the
network of neurons in our brain. A typical ANN has an input layer, an output layer, and at
least one hidden layer between the input and output with inter-connections.
Any basic ANN will always have multiple layers of nodes, specific connection patterns and
links between the layers, connection weights and activation functions for the nodes/neurons
that convert weighted inputs to outputs. The process of learning for the network typically
involves a cost function and the objective is to optimize the cost function (typically minimize
the cost). The weights keep getting updated in the process of learning.
Backpropagation
The backpropagation algorithm is a popular technique to train ANNs and it led to a
resurgence in the popularity of neural networks in the 1980s. The algorithm typically has two
main stages: propagation and weight updates. They are described briefly as follows.
1. Propagation
a. The input data sample vectors are propagated forward through the neural network to
generate the output values from the output layer.
b. Compare the generated output vector with the actual/desired output vector for that input
data vector.
c. Compute difference in error at the output units.
d. Backpropagate error values to generate deltas at each node/neuron.
2. Weight Update
a. Compute weight gradients by multiplying the output delta (error) and input activation.
b. Use the learning rate to determine the percentage of the gradient to be subtracted from the
original weight.
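The two stages above can be sketched in NumPy for a network with one hidden layer. The XOR data, the sigmoid activation, the layer sizes, the learning rate, and the number of epochs are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Toy data (an assumption): the XOR problem, 4 samples with 2 inputs and 1 target each
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([[0.0], [1.0], [1.0], [0.0]])

W1 = rng.normal(size=(2, 4))   # input -> hidden weights
b1 = np.zeros((1, 4))
W2 = rng.normal(size=(4, 1))   # hidden -> output weights
b2 = np.zeros((1, 1))
learning_rate = 0.5
losses = []

for epoch in range(5000):
    # 1a. Propagate the inputs forward to generate the output values
    hidden = sigmoid(X @ W1 + b1)
    output = sigmoid(hidden @ W2 + b2)

    # 1b-1c. Compare with the desired output and compute the error at the output units
    error = output - y
    losses.append(float(np.mean(error ** 2)))

    # 1d. Backpropagate the error to get a delta at each node
    # (sigmoid derivative is s * (1 - s); constant factors are absorbed into the learning rate)
    delta_out = error * output * (1.0 - output)
    delta_hidden = (delta_out @ W2.T) * hidden * (1.0 - hidden)

    # 2a. Weight gradients: input activation multiplied by the delta
    grad_W2 = hidden.T @ delta_out
    grad_W1 = X.T @ delta_hidden

    # 2b. Subtract a fraction of the gradient, set by the learning rate, from each weight
    W2 -= learning_rate * grad_W2
    b2 -= learning_rate * delta_out.sum(axis=0, keepdims=True)
    W1 -= learning_rate * grad_W1
    b1 -= learning_rate * delta_hidden.sum(axis=0, keepdims=True)

print(f"mean squared error: {losses[0]:.4f} at start, {losses[-1]:.4f} at end")
```

Frameworks such as Keras perform these steps automatically; the sketch only makes the propagation and weight update stages visible.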
Lab Task
Change the number of the layers and neurons and observe if the score improves or not.
Objectives
To learn evaluation metrics to evaluate machine learning algorithms.
Theoretical Description
Confusion Matrix
Confusion Matrix, as the name suggests, gives us a matrix as output and describes the
complete performance of the model. Let's assume we have a binary classification problem. We
have some samples belonging to two classes: YES or NO. There are 4 important terms:
● True Positives : The cases in which we predicted YES and the actual output was also YES.
● True Negatives : The cases in which we predicted NO and the actual output was NO.
● False Positives : The cases in which we predicted YES and the actual output was NO.
● False Negatives : The cases in which we predicted NO and the actual output was YES.
Accuracy
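Accuracy for the matrix can be calculated by summing the values lying across the “main
diagonal” and dividing by the total number of samples, i.e.
Accuracy = (True Positives + True Negatives) / Total Number of Samples
Confusion Matrix forms the basis for the other types of metrics.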
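The four terms and the accuracy can be counted directly from predictions. In the sketch below the labels are made up, with 1 standing for YES and 0 for NO.

```python
import numpy as np

# Hypothetical actual and predicted labels (1 = YES, 0 = NO)
y_true = np.array([1, 1, 1, 0, 0, 0, 1, 0])
y_pred = np.array([1, 0, 1, 0, 1, 0, 1, 0])

tp = int(np.sum((y_pred == 1) & (y_true == 1)))  # predicted YES, actually YES
tn = int(np.sum((y_pred == 0) & (y_true == 0)))  # predicted NO, actually NO
fp = int(np.sum((y_pred == 1) & (y_true == 0)))  # predicted YES, actually NO
fn = int(np.sum((y_pred == 0) & (y_true == 1)))  # predicted NO, actually YES

# Rows are actual classes (NO, YES); columns are predicted classes (NO, YES)
cm = np.array([[tn, fp],
               [fn, tp]])

# Main diagonal (TN and TP) divided by the total number of samples
accuracy = (tp + tn) / cm.sum()

print(cm)
print("accuracy:", accuracy)
```

This row and column layout matches the one returned by scikit-learn's confusion_matrix for labels 0 and 1.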
Area Under Curve
Area Under Curve (AUC) is one of the most widely used metrics for evaluation. It is used for
binary classification problems. The AUC of a classifier is equal to the probability that the
classifier will rank a randomly chosen positive example higher than a randomly chosen
negative example. Before defining AUC, let us understand three basic terms:
● True Positive Rate (Sensitivity) : True Positive Rate is defined as TP / (FN+TP). True
Positive Rate corresponds to the proportion of positive data points that are correctly
considered as positive, with respect to all positive data points.
● True Negative Rate (Specificity) : True Negative Rate is defined as TN / (FP+TN). True
Negative Rate corresponds to the proportion of negative data points that are correctly
considered as negative, with respect to all negative data points.
● False Positive Rate : False Positive Rate is defined as FP / (FP+TN). False Positive Rate
corresponds to the proportion of negative data points that are mistakenly considered as
positive, with respect to all negative data points.
False Positive Rate and True Positive Rate both have values in the range [0, 1]. FPR and TPR
are both computed at varying threshold values such as (0.00, 0.02, 0.04, …, 1.00) and a
graph is drawn. AUC is the area under the curve obtained by plotting True Positive Rate
against False Positive Rate at these thresholds.
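A short sketch with scikit-learn shows both views of AUC: the area under the TPR versus FPR curve, and the probability that a positive example is scored higher than a negative one. The four labels and scores are made up for the example.

```python
import numpy as np
from sklearn.metrics import auc, roc_auc_score, roc_curve

# Hypothetical true labels and classifier scores (higher score = more likely positive)
y_true = np.array([0, 0, 1, 1])
scores = np.array([0.1, 0.4, 0.35, 0.8])

# FPR and TPR computed at the thresholds chosen by roc_curve
fpr, tpr, thresholds = roc_curve(y_true, scores)
area = auc(fpr, tpr)

# Ranking view: fraction of (positive, negative) pairs where the positive scores higher
pos = scores[y_true == 1]
neg = scores[y_true == 0]
rank_prob = np.mean([p > n for p in pos for n in neg])

print("FPR:", fpr)
print("TPR:", tpr)
print("area under the curve:", area)
print("pairwise ranking probability:", rank_prob)
print("roc_auc_score:", roc_auc_score(y_true, scores))
```

Of the four positive and negative pairs, three are ranked correctly, so both views give the same value.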
F1 Score
F1 Score is used to measure a test’s accuracy.
F1 Score is the Harmonic Mean between precision and recall, i.e.
F1 = 2 * (Precision * Recall) / (Precision + Recall). The range for F1 Score is [0, 1]. It
tells you how precise your classifier is (how many of the instances it labels positive are
actually positive), as well as how robust it is (it does not miss a significant number of
instances). High precision but lower recall gives you an extremely accurate classifier, but
one that misses a large number of instances that are difficult to classify. The greater the
F1 Score, the better is the performance of our model.
F1 Score tries to find the balance between precision and recall.
● Precision : It is the number of correct positive results divided by the number of positive
results predicted by the classifier, i.e. TP / (TP + FP).
● Recall : It is the number of correct positive results divided by the number of all relevant
samples (all samples that should have been identified as positive), i.e. TP / (TP + FN).
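Precision, recall, and the F1 Score can be worked out from confusion-matrix counts in a few lines. The counts below are hypothetical.

```python
# Hypothetical counts from a binary classifier
tp = 8   # correct positive results
fp = 2   # negatives mistakenly predicted as positive
fn = 4   # positives the classifier missed

precision = tp / (tp + fp)   # share of predicted positives that are correct
recall = tp / (tp + fn)      # share of actual positives that were found

# Harmonic mean of precision and recall
f1 = 2 * precision * recall / (precision + recall)

print(f"precision = {precision:.3f}")
print(f"recall    = {recall:.3f}")
print(f"F1 score  = {f1:.3f}")
```

Because the harmonic mean is dominated by the smaller of the two values, the F1 Score here sits closer to the recall than a simple average would.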
Lab Task
Implement this code:
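import matplotlib.pyplot as plt
from sklearn.metrics import (accuracy_score, classification_report,
                             confusion_matrix, roc_auc_score, RocCurveDisplay)
# classifier, x_test, y_test and result are assumed to come from a fitted binary (0/1) classifier
cm=confusion_matrix(y_test,result)
tn=cm[0,0]
fp=cm[0,1]
fn=cm[1,0]
tp=cm[1,1]
print(cm)
print('auc: ',roc_auc_score(y_test,result))
# plot_roc_curve was removed in scikit-learn 1.2; RocCurveDisplay replaces it
RocCurveDisplay.from_estimator(classifier,x_test,y_test)
plt.show()
print(classification_report(y_test,result))
print(accuracy_score(y_test,result))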
Objectives
Understand the basic concepts and principles of Reinforcement Learning (RL).
Explore the structure and components of RL algorithms, including agents,
environments, states, actions, and rewards.
Implement a simple RL algorithm in Python to solve a basic problem.
Analyze the performance of the RL algorithm and visualize the learning
process.
Theoretical Background:
Reinforcement Learning (RL) is a type of machine learning where an
agent learns to make decisions by interacting with an environment. Unlike supervised learning,
which relies on labeled data, RL focuses on learning from the consequences of actions, using
rewards and punishments as signals for positive and negative behavior. The core components
of an RL system include:
Agent: The learner or decision-maker.
Environment: The external system with which the agent interacts.
State: A representation of the current situation of the agent.
Action: One of the possible moves the agent can make.
Reward: Feedback from the environment to evaluate the agent's action.
Policy: A strategy used by the agent to decide actions based on the current state.
Value Function: A function that estimates the expected reward of a state, helping the
agent to act optimally.
One of the simplest forms of RL is Q-learning, where the agent learns a value function that
gives the expected utility of taking a given action in a given state and following the optimal
policy thereafter. The Q-value is updated iteratively based on the Bellman equation:
$$Q(s, a) \leftarrow Q(s, a) + \alpha \left[ r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]$$
where $\alpha$ is the learning rate, $\gamma$ is the discount factor, $r$ is the reward
received, $s$ is the current state, $a$ is the action taken, and $s'$ is the next state.
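The update rule above can be tried on a very small problem before moving to a Gym environment. The sketch below uses a hypothetical five-state corridor written in plain NumPy; the environment, the reward of 1 at the goal, and the parameter values are all assumptions for illustration.

```python
import numpy as np

N_STATES = 5            # hypothetical corridor: states 0..4, the goal is state 4
ACTIONS = (0, 1)        # 0 = move left, 1 = move right
alpha, gamma, epsilon = 0.5, 0.9, 0.2   # learning rate, discount factor, exploration rate

def step(state, action):
    """Toy environment (an assumption): reward 1 on reaching the goal, 0 otherwise."""
    if action == 1:
        next_state = min(state + 1, N_STATES - 1)
    else:
        next_state = max(state - 1, 0)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

rng = np.random.default_rng(0)
Q = np.zeros((N_STATES, len(ACTIONS)))   # Q-table initialized with zeros

for episode in range(500):
    s = 0
    for _ in range(200):                 # cap on steps per episode
        if rng.random() < epsilon:
            a = int(rng.integers(len(ACTIONS)))          # explore: random action
        else:
            best = np.flatnonzero(Q[s] == Q[s].max())    # exploit: best action,
            a = int(rng.choice(best))                    # ties broken at random
        s_next, r, done = step(s, a)
        # Q(s,a) <- Q(s,a) + alpha * [r + gamma * max_a' Q(s',a') - Q(s,a)]
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next
        if done:
            break

# Greedy action in each non-terminal state after learning
greedy_policy = Q[:-1].argmax(axis=1)
print(Q)
print("greedy policy (1 = right):", greedy_policy)
```

A table like this only works when states are discrete; an environment with continuous observations would first need its states grouped into a finite number of bins.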
Lab Tasks:
1. Introduction to the Environment:
Choose a simple RL environment, such as the OpenAI Gym's "CartPole-v1".
Install the necessary libraries (gym and numpy).
2. Implementing Q-learning:
Initialize the Q-table with zeros.
Set up the learning parameters: learning rate ($\alpha$), discount factor ($\gamma$), and
exploration-exploitation trade-off (epsilon).
Implement the Q-learning algorithm: