Python Notes BVL
Lecture Notes on
Application Development Using Python
(18CS55)
Prepared by:
SHOBHA B S Asst. Prof.
Course Outcomes
At the end of the course the student will be able to:
1. Demonstrate proficiency in handling loops and creation of functions.
2. Identify the methods to create and manipulate lists, tuples and dictionaries.
3. Develop programs for string processing and file organization.
4. Interpret the concepts of Object-Oriented Programming as used in Python.
Syllabus
Bangalore Institute of Technology Department of Robotics & AI
MODULE I
1.4 FUNCTIONS
Function calls, Built-in functions, Type conversion functions, Random numbers, Math functions, Adding
new functions, Definitions and uses, Flow of execution, Parameters and arguments, Fruitful functions
and void functions, Why functions?
MODULE I
operations specified by the program instructions. The CPU performs the given tasks at
tremendous speed. Hence, a good programmer has to keep the CPU busy by providing
enough tasks to it.
Main Memory: It is the storage area to which the CPU has direct access. Usually, the
programs stored in the secondary storage are brought into main memory before execution.
The processor (CPU) picks a job from the main memory and performs the tasks. Usually,
information stored in the main memory is lost when the computer is turned off.
[Figure: hardware architecture of a computer, with blocks labelled Software, Input and Output Devices, CPU, Network and Main Memory]
Secondary Memory: The secondary memory is the permanent storage of the computer. Usually, the
size of secondary memory is considerably larger than that of main memory. Hard disks, USB
drives etc. can be considered secondary memory storage.
I/O Devices: These are the medium of communication between the user and the computer.
Keyboard, mouse, monitor, printer etc. are the examples of I/O devices.
Network Connection: Nowadays, most computers are connected to a network and hence
they can communicate with other computers in the network. Retrieving information from
other computers via a network is slower than accessing the secondary memory.
Moreover, a network is not always reliable due to connection problems.
The programmer has to use the above resources sensibly to solve the problem.
Usually, a programmer communicates with the CPU by telling it 'what to do next'. The usage
of main memory, secondary memory and I/O devices can also be controlled by the programmer.
To communicate with the CPU for solving a specific problem, one has to write a set of instructions.
Such a set of instructions is called a program.
Understanding Programming
A programmer must have the skills to look at the data/information available about a problem, analyze
it and then build a program to solve the problem. The skills to be possessed by a good programmer
include –
Thorough knowledge of programming language: One needs to know the vocabulary and
grammar (technically known as syntax) of the programming language. This will help in
constructing proper instructions in the program.
Skill of implementing an idea: A programmer should be like a "story teller". That is, he/she must
be capable of conveying something effectively. He/she must be able to solve the problem by
designing a suitable algorithm and implementing it. And, the program must provide the appropriate
output as expected.
Thus, the art of programming requires knowledge of the problem's requirements and of the
strengths/weaknesses of the programming language chosen for the implementation. It is always advisable
to choose an appropriate programming language that can cater to the complexity of the problem to be solved.
Words and Sentences
Every programming language has its own constructs to form the syntax of the language.
The basic constructs of a programming language include the set of characters and the keywords that it
supports.
Keywords have a special meaning in any language and they are intended for doing a specific task.
Python has a finite set of keywords as given in Table below.
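The keyword list can also be inspected from within Python itself. The following is a minimal sketch using the standard keyword module; the exact list printed depends on the installed Python version, and the name marks is only a hypothetical example of a non-keyword.

```python
import keyword

# keyword.kwlist holds every reserved word of the running Python version
print(len(keyword.kwlist))
print(keyword.kwlist)

# keyword.iskeyword() tells whether a given word is reserved
print(keyword.iskeyword("if"))      # True
print(keyword.iskeyword("marks"))   # False
```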
Separate sets of editors (IDEs) are available for different operating systems like Windows, UNIX,
Ubuntu, Solaris, Mac, etc. The basic Python can be downloaded from the link:
https://www.python.org/downloads/
Python has a rich set of libraries for various purposes like large-scale data processing, predictive
analytics, scientific computing etc. Based on one's need, the required packages can be downloaded.
But, there is a free open source distribution, Anaconda, which simplifies package management and
deployment.
Hence, readers are advised to install Anaconda from the link given below, rather than
installing only the basic Python.
https://anaconda.org/anaconda/python
Successful installation of Anaconda provides you Python in a command prompt, the default editor
IDLE and also a browser-based interactive computing environment known as Jupyter Notebook.
The prompt >>> (usually called the chevron) indicates that the system is ready to take Python
instructions.
If you would like to use the default IDE of Python, that is, the IDLE, then you can just run IDLE
and you will get the editor.
After understanding the basics of a few Python editors, let us start our communication with
Python by saying Hello World. Python uses the print() function for displaying content.
Consider the following code –
Here, after typing the first line of code and pressing the Enter key, we get the output
of that line immediately. Then the prompt (>>>) returns to the screen. This indicates that Python is
ready to take the next instruction as input for processing.
Once we are done with the program, we can close or terminate Python by giving the quit() command as
shown –
>>> quit() #Python terminates
Introduction to Python Programming Page 5
BPLCK105/205B
Here, x, y and z are variables storing their respective values. As each line of code above is processed
immediately after it is entered, the variables hold the given values from that point onwards.
Observe that, though each line is treated independently, the knowledge (or information) gained in the
previous line is retained by Python and hence, further lines can make use of previously used
variables.
Thus, the lines that we write at the Python prompt are logically related, though they look
independent.
NOTE that Python does not require variable declaration (unlike C, C++, Java etc.) before use. One can
use any valid variable name for storing values. Depending on the type (like number, string etc.) of
the value being assigned, Python decides the type and behavior of the variable.
Writing a Program
As Python is an interpreted language, one can keep typing every line of code one after the other (and
immediately get the output of each line) as shown in the previous section.
But, in a real-world scenario, typing a big program this way is not a good idea. It is not easy to logically debug
such lines.
Hence, Python programs can be stored in a file with the extension .py and then run using the python
command.
Programs written within a file are reusable and can be run whenever we want. Also, they
are transferable from one machine to another via pen-drive, CD etc.
What is a Program?
A program is a sequence of instructions intended to do some task.
For example, if we need to count the number of occurrences of each word in a text document, we
can write a program to do so.
Writing a program will make the task easier compared to manually counting the words in a
document.
Moreover, most of the time, the program is a generic solution. Hence, the same program may be
used to count the frequency of words in another file.
A person who does not know anything about programming can also run this program to count
the words.
Programming languages like Python act as an intermediary between the computer and the
programmer. The end-user can request the programmer to write a program to solve his/her problem.
Reuse: When we write programs for general-purpose utility tasks, it is better to write them
under a separate name, so that they can be used multiple times whenever/wherever required.
This is possible with the help of functions.
The art of programming involves a thorough understanding of the above constructs and using them
appropriately.
What Could Possibly Go Wrong?
It is natural to make mistakes while writing a program. The possible mistakes are categorized as
below –
Syntax Errors: Statements that do not follow the grammar (or syntax) of the
programming language result in syntax errors. Python is a case-sensitive language.
Hence, there is a chance that a beginner may make some syntactical mistakes while writing a
program. Python detects the lines involving such mistakes when you run the
program and reports the errors along with their possible causes. The
programmer has to correct them and then proceed further.
Logical Errors: A logical error occurs due to poor understanding of the problem. Syntactically,
the program will be correct, but it may not give the expected output. For example, you
intended to find a%b but, by mistake, typed a/b. Then it is a logical error.
Semantic Errors: A semantic error may happen due to wrong use of variables, wrong
operations or operations in the wrong order. For example, trying to modify an un-initialized variable.
NOTE: There is one more type of error, the runtime error, usually called an exception. It may occur due
to wrong input (like trying to divide a number by zero), a problem in database connectivity etc. When a
runtime error occurs, the program displays an error message that may not be understood by an ordinary
user, who may also not know how to overcome such errors. Hence, suspicious lines of code have
to be handled by the programmer through the procedure known as exception handling. Python provides
a mechanism for handling various possible exceptions like ArithmeticError, FloatingPointError,
EOFError, MemoryError etc.
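As an illustration of exception handling, the following minimal sketch guards a division with try and except. The function name safe_divide and its behaviour of returning None are assumptions made for this example only.

```python
def safe_divide(a, b):
    """Hypothetical helper: return a / b, or None when b is zero."""
    try:
        return a / b
    except ZeroDivisionError:
        # ZeroDivisionError is a subclass of ArithmeticError
        print("Cannot divide by zero")
        return None

print(safe_divide(12, 4))   # 3.0
print(safe_divide(12, 0))   # prints the message, then None
```

Here the program continues to run even when the user supplies zero as the divisor, instead of stopping with a traceback.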
After understanding some important concepts about programming and programming languages, we
will now move on to learn Python as a programming language with its syntax and constructs.
>>> type("hello")
<class 'str'>
>>> type(3)
<class 'int'>
>>> type(10.5)
<class 'float'>
>>> type("15")
<class 'str'>
In the above four examples, one can make out the types str, int and float.
Observe the 4th example: it shows that whatever is enclosed within double quotes is a
string, even when it looks like a number.
Variables
A variable is a name that refers to a value stored in the program.
The value of a variable can be modified wherever required in the program.
Note that, in Python, a variable need not be declared with a specific type before its usage.
Whenever we want a variable, we just use it. Its type is decided by the value assigned to it.
A value can be assigned to a variable using the assignment operator (=).
Consider the example given below –
>>> x=10
>>> print(x)
10 #output
>>> type(x)
<class 'int'> #type of x is integer
>>> y="hi"
>>> print(y)
hi #output
>>> type(y)
<class 'str'> #type of y is string
It is observed from above examples that the value assigned to variable determines the type of that
variable.
It is a good programming practice to name the variable such that its name indicates its purpose in
the program.
Examples:
>>> 3a=5 #starting with a number
SyntaxError: invalid syntax
>>> a$=10 #contains $
SyntaxError: invalid syntax
>>> if=15 #if is a keyword
SyntaxError: invalid syntax
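The naming rules behind these errors can be checked without triggering a SyntaxError. The sketch below combines str.isidentifier() with keyword.iskeyword(); the helper name is_valid_variable_name is hypothetical and used only for illustration.

```python
import keyword

def is_valid_variable_name(name):
    """Hypothetical helper: True if name can be used as a variable name."""
    # str.isidentifier() checks the character rules (no leading digit, no $),
    # keyword.iskeyword() rejects reserved words such as if
    return name.isidentifier() and not keyword.iskeyword(name)

for candidate in ["gross_sal", "3a", "a$", "if"]:
    print(candidate, is_valid_variable_name(candidate))
```

Only gross_sal is reported as valid; the other three correspond to the three invalid assignments shown above.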
Statements
A statement is a small unit of code that can be executed by the Python interpreter.
It indicates some action to be carried out.
In fact, a program is a sequence of such statements.
Two kinds of statements are seen here: the expression statement (such as a call to print) and the assignment statement.
Following are the examples of statements –
>>> print("hello") #printing statement
hello
>>> x=5 #assignment statement
>>> print(x) #printing statement
5
Relational or Comparison Operators are used to check the relationship (like less than, greater than
etc) between two operands. These operators return a Boolean value – either True or False.
Assignment Operators: Apart from simple assignment operator = which is used for assigning
values to variables, Python provides compound assignment operators.
For example,
x=x+y can be written as x+=y
Now, += is compound assignment operator. Similarly, one can use most of the arithmetic and
bitwise operators (only binary operators, but not unary) like *, /, %, //, &, ^ etc. as compound
assignment operators.
For example,
>>> x=3
>>> y=5
>>> x+=y #x=x+y
>>> print(x)
8
>>> y//=2 #y=y//2
>>> print(y)
2 #only integer part will be printed
NOTE:
1. Python has a special feature – one can assign values of different types to multiple variables in
a single statement.
For example,
>>> x, y, st=3, 4.2, "Hello"
>>> print("x= ", x, " y= ",y, " st= ", st)
x=  3  y=  4.2  st=  Hello
2. Python supports bitwise operators like &(AND), | (OR), ~(NOT), ^(XOR), >>(right shift)
and <<(left shift). These operators will operate on every bit of the operands. Working procedure
of these operators is same as that in other languages like C and C++.
3. There are some special operators in Python viz. Identity operator (is and is not) and
membership operator (in and not in). These will be discussed in further Modules.
Expressions
A combination of values, variables and operators is known as an expression.
Following are a few examples; in each assignment, the part to the right of = is an expression –
x=5
y=x+10
z= x-y*3
The Python interpreter evaluates simple expressions and gives results even without print().
For example,
>>> 5
5 #displayed as it is
>>> 1+2
3 #displayed the sum
But, such bare expressions display nothing when written in a Python script file.
Order of Operations
When an expression contains more than one operator, the evaluation of operators depends on the
precedence of operators.
The Python operators follow the precedence rule (which can be remembered as PEMDAS) as given
below –
Parentheses have the highest precedence in any expression. The operations within
parentheses are evaluated first. For example, in the expression (a+b)*c, the
addition is done first and then the sum is multiplied by c.
Exponentiation has the second-highest precedence, and it is right associative. That is, if there are two
consecutive exponentiation operations, they are evaluated from right to left (unlike
most other operators, which are evaluated from left to right).
For example,
>>> print(2**3) #It is 2^3
8
>>> print(2**3**2) #It is 2^(3^2) = 2^9
512
Multiplication and Division have the next priority. Of these two operations, whichever
comes first in the expression is evaluated first.
>>> print(5*2/4) #multiplication and then division
2.5
>>> print(5/4*2) #division and then multiplication
2.5
Addition and Subtraction have the lowest priority. Of these two operations, whichever
appears first in the expression is evaluated first, i.e., they are evaluated from left to right.
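The precedence rules above can be verified with a few short expressions. This is a minimal sketch; the variable names and values are chosen only for illustration.

```python
a, b, c = 2, 3, 4

print((a + b) * c)    # parentheses first: 5 * 4 = 20
print(a + b * c)      # multiplication before addition: 2 + 12 = 14
print(2 ** 3 ** 2)    # right to left: 2 ** 9 = 512
print((2 ** 3) ** 2)  # parentheses force left to right: 8 ** 2 = 64
print(10 - 4 + 2)     # left to right: 6 + 2 = 8
```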
String Operations
String concatenation can be done using + operator as shown below –
>>> x="32"
>>> y="45"
>>> print(x+y)
3245
Observe the output: here, the value of y (the string "45", not the number 45) is placed just
after the value of x (the string "32"). Hence the result is "3245" and its type is
string.
NOTE: One can use single quotes to enclose a string value, instead of double quotes.
There are several such other utility functions in Python, which will be discussed later.
Comments
It is a good programming practice to add comments to the program wherever required.
This helps a reader to understand the logic of the program.
A comment may be on a single line or spread over multiple lines.
A single-line comment in Python starts with the symbol #.
Multiline comments are enclosed within a pair of three single quotes (strictly speaking, such a block is a string literal that is evaluated and then discarded).
Python (like other programming languages) ignores the text written as comment lines.
They are only for the programmer's (or any reader's) reference.
Ex2.
basic=10000
da=0.3*basic
gross_sal=basic+da
print("Gross Sal = ",gross_sal) #output is 13000.0
One can observe that both of these examples perform the same task.
But, compared to Ex1, the variables in Ex2 indicate what is being calculated.
That is, the variable names in Ex2 indicate the purpose for which they are used in the
program. Such variable names are known as mnemonic variable names. The word mnemonic means
memory aid. Mnemonic variables are created to help the programmer remember the purpose
for which they have been created.
Python recognizes the set of reserved words (or keywords), and hence it flashes an error when
such words are used as variable names by the programmer.
Moreover, most Python editors have a mechanism to show keywords in a different color.
Hence, the programmer can recognize a keyword immediately when he/she types that word.
Debugging
Some of the common errors a beginner programmer may make are syntax errors.
Though Python flashes the error with a message, sometimes it may be hard to understand the
cause of the error. Some examples are given here –
Here, there is a space between the terms avg and sal, which is not allowed.
Ex2. >>> m=09
SyntaxError: invalid token
Here, a decimal integer is written with a leading zero, which is not allowed (newer Python versions word this message differently).
As shown in the above examples, syntax errors are reported by Python. But, the programmer is
responsible for logical errors or semantic errors. If the program does not yield the expected
output, it is due to a mistake made by the programmer, of which Python is unaware.
Boolean Expressions
A Boolean Expression is an expression which results in True or False.
The True and False are special values that belong to class bool.
Check the following –
>>> type(True)
<class 'bool'>
>>> type(False)
<class 'bool'>
Boolean expression may be as below –
>>> 10==12
False
>>> x=10
>>> y=10
>>> x==y
True
Various comparison operations are shown in Table.
Examples:
>>> a=10
>>> b=20
>>> x= a>b
>>> print(x)
False
>>> print(a==b)
False
>>> print("a<b is ", a<b)
a<b is True
>>> print("a!=b is", a!=b)
a!=b is True
>>> 10 is 20
False
>>> 10 is 10
True
NOTE: At first look, the operators == and is look the same. Similarly, the operators != and is not look
the same. But, the operators == and != perform the equality test. That is, they compare the values
stored in the variables. The operators is and is not, on the other hand, perform the identity test. That is, they
check whether two objects are the same object. Usually, two objects are the same when their memory locations
are the same. This concept will become clearer when we take up classes and objects in Python.
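The difference between the equality test and the identity test is easiest to see with lists. The following is a minimal sketch; the names p, q and r are arbitrary.

```python
p = [1, 2, 3]
q = [1, 2, 3]   # a separate list with the same contents
r = p           # r is another name for the same list object

print(p == q)   # True: the values are equal
print(p is q)   # False: two different objects
print(p is r)   # True: both names refer to one object
```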
Logical Operators
There are 3 logical operators in Python (and, or, not) as shown in Table.
not: returns True if the operand is False (it is a unary operator). Example: not a
NOTE:
1. Logical operators treat the operands as Boolean (True or False).
2. Python treats any non-zero number as True and zero as False.
3. While using the and operator, if the first operand is False, the second operand is not
evaluated by Python, because False and-ed with anything is False.
4. In case of the or operator, if the first operand is True, the second operand is not evaluated,
because True or-ed with anything is True.
if condition:
    Statement block
[Flowchart of the if statement: the condition is tested, and the False branch skips the statement block]
Consider an example –
>>> x=10
>>> if x<40:
    print("Fail") #observe indentation after if
Fail #output
Usually, an if condition is followed by a statement block.
If the programmer wants to do nothing when the condition is true, the statement block can
be replaced by the pass statement as shown below –
>>> if x<0:
    pass #do nothing when x is negative
Alternative Execution
A second form of the if statement is alternative execution, in which there are two possibilities based on
the evaluation of the condition.
Here, when the condition is true, one set of statements is executed and when the condition is
false, another set of statements is executed.
The syntax and flowchart are as given below –
if condition:
    Statement block -1
else:
    Statement block -2
Sample output:
Enter x: 13
x is odd
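A minimal sketch of alternative execution that would produce output like the sample above. The function name parity is hypothetical, and the value is passed as an argument instead of being read from the keyboard so that the example runs on its own.

```python
def parity(x):
    """Hypothetical helper: describe x as even or odd."""
    if x % 2 == 0:
        return "x is even"
    else:
        return "x is odd"

print(parity(13))   # x is odd
print(parity(8))    # x is even
```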
Nested Conditionals
o The conditional statements can be nested.
o That is, one set of conditional statements can be nested inside the other.
o It can be done in multiple ways depending on the programmer's requirements.
o Examples are given below –
Sample Output:
Enter marks:68
First Class
Here, the outer condition marks>=60 is checked first. If it is true, then there are two branches for the
inner conditional. If the outer condition is false, the above code does nothing.
if gender == "M":
    if age >= 21:
        print("Boy, Eligible for Marriage")
    else:
        print("Boy, Not Eligible for Marriage")
elif gender == "F":
    if age >= 18:
        print("Girl, Eligible for Marriage")
    else:
        print("Girl, Not Eligible for Marriage")
Sample Output:
Enter gender: F
Enter age: 17
Girl, Not Eligible for Marriage
NOTE: Nested conditionals make the code difficult to read, even with proper indentation.
Hence, it is advised to use logical operators like and to simplify nested conditionals.
For example, the outer and inner conditions in Ex1 above can be joined as –
if marks>=60 and marks<70:
    #do something
Chained Conditionals
o Some programs require more than one possibility to be checked for
executing a set of statements.
o That means, we may have more than two branches. This is solved with the help of
chained conditionals.
o The syntax and flowchart are given
below –
if condition1:
    Statement Block-1
elif condition2:
    Statement Block-2
    |
    |
elif condition_n:
    Statement Block-n
else:
    Statement Block-(n+1)
The conditions are checked one by one sequentially. If any condition is satisfied, the respective
statement block is executed and further conditions are not checked. Note that the last else
block is optional.
else:
print("Fail")
Sample Output:
Enter marks: 78
First Class
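A minimal sketch of a chained conditional that is consistent with the sample output above. The function name grade and the cut-off marks (80, 60, 50, 35) are assumptions made for illustration, not values taken from these notes.

```python
def grade(marks):
    """Hypothetical helper: map marks to a class using assumed cut-offs."""
    if marks >= 80:
        return "First Class with Distinction"
    elif marks >= 60:
        return "First Class"
    elif marks >= 50:
        return "Second Class"
    elif marks >= 35:
        return "Third Class"
    else:
        return "Fail"

print(grade(78))   # First Class
print(grade(20))   # Fail
```

Because the conditions are checked in order, marks of 78 fail the first test and stop at the second one; no later branch is examined.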
Output:
Enter a:12
Enter b:0
Here, the expression x<10 and x+y>25 involves the logical operator and. Now, x<10 is evaluated
first, which results in False. As there is an and operator, irrespective of the result of x+y>25, the
whole expression will be False.
In such situations, Python ignores the remaining part of the expression. This is known as
short-circuiting the evaluation.
When the first part of the logical expression results in True, the second part has to be evaluated to
know the overall result.
Short-circuiting not only saves computational time, but also leads to a technique known
as the guardian pattern.
Consider following sequence of statements –
>>> x=5
>>> y=0
>>> x>=10 and (x/y)>2
False
>>> x>=2 and (x/y)>2
Traceback (most recent call last):
  File "<pyshell#3>", line 1, in <module>
    x>=2 and (x/y)>2
ZeroDivisionError: division by zero
Here, when we executed the statement x>=10 and (x/y)>2, the first half of the logical expression itself
was False and hence, by the short-circuit rule, the remaining part was not executed at all.
Whereas, in the statement x>=2 and (x/y)>2, the first half is True and the second half results in a
runtime error. Thus, in the expression x>=10 and (x/y)>2, the short-circuit rule acted as a guardian by
preventing an error.
One can construct the logical expression to strategically place a guard evaluation just before the
evaluation that might cause an error as follows:
>>> x=5
>>> y=0
>>> x>=2 and y!=0 and (x/y)>2
False
Here, x>=2 results in True, but y!=0 evaluates to False. Hence, the expression (x/y)>2 is never
reached and a possible error is prevented.
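The guardian pattern can be wrapped in a small function so that the guard always travels with the risky operation. This is a minimal sketch; the function name ratio_exceeds and its parameters are hypothetical.

```python
def ratio_exceeds(x, y, limit):
    """Hypothetical helper: True if x / y is greater than limit."""
    # y != 0 is the guard: when it is False, x / y is never evaluated
    return y != 0 and (x / y) > limit

print(ratio_exceeds(5, 0, 2))    # False, and no ZeroDivisionError
print(ratio_exceeds(10, 2, 2))   # True, since 10 / 2 = 5.0 > 2
```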
Debugging
One can observe from the previous few examples that when a runtime error occurs, Python displays the term
Traceback followed by a few indications about the error.
A traceback is a stack trace that lists the sequence of calls leading from the start of execution to the
point where the error occurred.
This is helpful when we start using functions and when there is a sequence of multiple function calls
from one to another.
Then, the traceback helps the programmer to identify the exact position where the error occurred.
The most useful parts of the error message in a traceback are –
What kind of error it is
Where it occurred
Compared to runtime errors, syntax errors are usually easy to find. But, whitespace errors
in syntax are quite tricky because spaces and tabs are invisible.
For example –
>>> x=10
>>>  y=15
SyntaxError: unexpected indent
The error here is because of the additional space given before y. As Python gives indentation a special meaning
(a separate block of code), one cannot give extra spaces as shown above.
In general, error messages indicate where the problem has occurred. But, the actual error may be
before that point, or even in the previous line of code.
1.4 FUNCTIONS
In this section, we will discuss various types of built-in functions, user-defined functions,
applications/uses of functions etc.
Function Calls
A function is a named sequence of instructions for performing a task.
When we define a function we will give a valid name to it, and then specify the instructions for
performing required task.
Later, whenever we want to do that task, a function is called by its name.
Consider an example:
>>> type(15)
<class 'int'>
Here type is a function name, 15 is the argument to a function and <class 'int'> is the result of the
function.
Usually, a function takes zero or more arguments and returns the result.
Built-in Functions
Python provides a rich set of built-in functions for doing various tasks.
The programmer/user need not know the internal working of these functions; instead, they need to
know only the purpose of such functions.
Some of the built in functions are given below –
max(): This function is used to find maximum value among the arguments. It can be used for
numeric values or even to strings.
o max(10, 20, 14, 12) #maximum of 4 integers
20
o max("hello world")
'w' #character having maximum ASCII code
o max(3.5, -2.1, 4.8, 15.3, 0.2)
15.3 #maximum of 5 floating point values
min(): As the name suggests, it is used to find minimum of arguments.
o min(10, 20, 14, 12) #minimum of 4 integers
10
o min("hello world")
' ' #space has least ASCII code here
o min(3.5, -2.1, 4.8, 15.3, 0.2)
-2.1 #minimum of 5 floating point values
len(): This function takes a single argument and finds its length. The argument can be a string, list,
tuple etc.
There are many other built-in functions available in Python. They are discussed in further Modules,
wherever they are relevant.
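The calls described above can be tried together in one short script. This is a minimal sketch using only max(), min(), len() and ord(); ord() is an additional built-in used here to show the character codes that decide the string comparisons.

```python
print(max(10, 20, 14, 12))         # 20
print(max("hello world"))          # w
print(min("hello world") == " ")   # True: the space has the lowest code
print(ord(" "), ord("w"))          # 32 119
print(len("hello world"))          # 11
print(len([3, 5, 7]))              # 3
```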
Random Numbers
Most of the programs that we write are deterministic.
That is, the input (or range of inputs) to the program is pre-defined and the output of the program is
one of the expected values.
But, for some real-world applications in science and technology, we need randomly generated
output. This helps in simulating certain scenarios.
Random number generation has important applications in games, noise detection in electronic
communication, statistical sampling theory, cryptography, political and business prediction etc.
These applications require the program to be nondeterministic.
There are several algorithms to generate random numbers. But, as making a program completely
nondeterministic is difficult and may lead to several other consequences, we generate
pseudo-random numbers.
That is, the type (integer, float etc.) and range (between 0 and 1, between 1 and 100 etc.) of the
random numbers are decided by the programmer, but the actual numbers are unknown.
Moreover, the algorithm to generate the random numbers is also known to the programmer. Thus,
the random numbers are generated using deterministic computation and hence, they are known as
pseudo-random numbers.
Python has a module random for the generation of random numbers. One has to import this
module in the program. The function used is also named random().
By default, this function generates a random number between 0.0 and 1.0 (excluding 1.0).
For example –
>>> import random
>>> print(random.random())
0.5287778188896328 #a random number; the value differs on every run
Importing a module creates a module object.
Using this object, one can access the various functions and/or variables defined in that module.
Functions are invoked using the dot operator.
There are several other functions in the module random apart from the function random(). (Do not
confuse the module name with the function name. Observe the parentheses when referring to a
function name.)
A few are discussed hereunder:
randint(): It takes two arguments low and high and returns a random integer between these two
arguments (both low and high are inclusive). For example,
>>> random.randint(2,20)
14 #integer between 2 and 20 generated
>>> random.randint(2,20)
10
choice(): This function takes a sequence (a list type in Python) of numbers as an argument and
returns one of these numbers as a random number. For example,
>>> t=[1,2, -3, 45, 12, 7, 31, 22] #create a list t
>>> random.choice(t) #t is argument to choice()
12 #one of the elements in t
>>> random.choice(t)
1 #one of the elements in t
Various other functions available in random module can be used to generate random numbers
following several probability distributions like Gaussian, Triangular, Uniform, Exponential, Weibull,
Normal etc.
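That these numbers are pseudo-random can be demonstrated with random.seed(), which fixes the starting point of the generator. The following is a minimal sketch; the seed value 42 and the variable names are arbitrary choices for illustration.

```python
import random

random.seed(42)    # fix the starting point of the generator
first = [random.randint(1, 100) for _ in range(5)]

random.seed(42)    # the same seed again
second = [random.randint(1, 100) for _ in range(5)]

print(first == second)                # True: same seed, same sequence
print(0.0 <= random.random() < 1.0)   # True: random() stays in [0.0, 1.0)
```

Without a call to seed(), the generator is seeded differently on each run, so the numbers change from run to run.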
Math Functions
Python provides a rich set of mathematical functions through the module math.
To use these functions, the math module has to be imported in the code.
Some of the important functions available in math are given hereunder
sqrt(): This function takes one numeric argument and finds the square root of that argument.
>>> math.sqrt(34) #integer argument
5.830951894845301
>>> math.sqrt(21.5) #floating point argument
4.636809247747852
log10(): This function is used to find logarithm of the given argument, to the base 10.
>>> math.log10(2)
0.3010299956639812
log(): This is used to compute natural logarithm (base e) of a given number.
>>> math.log(2)
0.6931471805599453
sin(): As the name suggests, it is used to find the sine value of a given argument. Note that the
argument must be in radians (not degrees). One can convert degrees into radians by
multiplying by pi/180 as shown below –
>>> math.sin(90*math.pi/180) #sin(90) is 1
1.0
pow(): This function takes two arguments x and y, then finds x to the power of y.
>>> math.pow(3,4)
81.0
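The functions above can be collected into one runnable script that begins with the required import. This is a minimal sketch; math.radians() and math.isclose() are further functions of the same module, used here for the degree conversion and for a safe floating-point comparison.

```python
import math

print(math.sqrt(16))                    # 4.0
print(math.pow(3, 4))                   # 81.0
print(math.log10(1000))                 # logarithm to the base 10
print(math.sin(math.radians(90)))       # sine of 90 degrees
print(math.isclose(math.log(math.e), 1.0))   # True: ln(e) is 1
```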
Adding New Functions (User-defined Functions)
Python allows the programmer to define his/her own functions.
A function written once can be used wherever and whenever required.
The syntax of a user-defined function is –
def fname(arg_list):
    statement_1
    statement_2
    ……………
    statement_n
    return value
The first line of the function, def fname(arg_list), is known as the function header/definition. The
remaining lines constitute the function body.
The function header is terminated by a colon and the function body must be indented.
To come out of the function, the indentation must be terminated.
Unlike a few other programming languages like C, C++ etc., there is no main()
function or specific location where a user-defined function has to be called.
The programmer has to invoke (call) the function wherever required.
Consider a simple example of user-defined function –
def myfun():
    print("Hello")                  #observe indentation
    print("Inside the function")
print("Example of function")        #statements outside the myfun() function,
print("Example over")               #written without indentation
Here, the first output indicates that myfun is an object which is being stored at the memory address
0x0219BFA8 (0x indicates octal number).
The second output clearly shows myfunis of type function.
(NOTE: In fact, in Python every type is in the form of class. Hence, when we apply type on any
variable/object, it displays respective class name. The detailed study of classes will be done in Module
4.)
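The two outputs discussed above can be reproduced with a short sketch like the one below. The function body is taken from the earlier example; the variable names description and kind are only illustrative, and the memory address printed will differ from run to run.

```python
# A function name refers to a function object.
def myfun():
    print("Hello")
    print("Inside the function")

description = repr(myfun)   # text such as <function myfun at 0x...>; address varies per run
kind = type(myfun)          # the class of the object

print(description)
print(kind)                 # <class 'function'>
```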
As the flow of execution of every program is sequential from top to bottom, a function can be invoked
only after defining it.
Using a function name before its definition generates an error. Observe the following code:
print("Example of function")
myfun()                 #function call before definition
print("Example over")

def myfun():
    print("Inside myfun()")
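The error raised in such a case is a NameError. A minimal sketch is given below; greet() is a hypothetical function used only for illustration, and the try-except block is used here just to show the error without stopping the program.

```python
# Calling a name before its def statement has been executed raises NameError.
try:
    greet()                       # greet is not defined yet at this point
except NameError as err:
    message = str(err)
    print("Error:", message)

def greet():
    return "Inside greet()"

print(greet())                    # works now, because the definition has been executed
```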
repeat()
print("Example over")
The output is –
Example of function
Inside myfun()
Inside repeat()
Inside myfun()
Example over
Observe the output of the program to understand the flow of execution of the program.
Initially, we have two function definitions, myfun() and repeat(), one after the other. But, functions
are not executed unless they are called (or invoked). Hence, the first line to execute in the above
program is –
print("Example of function")
Then, there is a function call repeat(). So, the program control jumps to this function. Inside
repeat(), there is a call for myfun().
Now, program control jumps to myfun(), executes the statements inside it and returns back to the
repeat() function. The statement print("Inside repeat()") is executed.
Once again there is a call for myfun() and hence, program control jumps there. The
function myfun() is executed and control returns to repeat().
As there are no more statements in repeat(), the control returns to the original position of its call.
Now the statement print("Example over") is executed, and the program terminates.
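The program being traced is not fully reproduced in these notes. A sketch that is consistent with the output and the explanation above, assuming repeat() calls myfun(), prints a message, and calls myfun() again, would be:

```python
def myfun():
    print("Inside myfun()")

def repeat():
    myfun()                       # first call
    print("Inside repeat()")
    myfun()                       # second call

print("Example of function")      # first statement actually executed
repeat()
print("Example over")
```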
In the above program, var is called the parameter and x and y are called the arguments.
The argument is passed when the function test() is invoked. The parameter receives the
argument as an input and the statements inside the function are executed.
As Python variables are not of specific data types in general, one can pass any type of value to the
function as an argument.
Python also allows an expression, such as a multiplication, to be applied on an argument while passing it
to a function. Consider the modified version of above program –
One can observe that, when the argument is of type string, multiplication indicates that the string
is repeated 3 times.
Whereas, when the argument is of numeric type (here, integer), the value of that argument is
literally multiplied by 3.
A function that performs some task but does not return any value to the calling function is known as a
void function. The examples of user-defined functions considered till now are void functions.
A function which returns some result to the calling function after performing a task is known as a
fruitful function. The built-in functions like mathematical functions, random number generating
functions etc. that have been considered earlier are examples of fruitful functions.
One can write a user-defined function so as to return a value to the calling function as shown in
the following example –
def sum(a,b):
    return a+b

x=int(input("Enter a number:"))
y=int(input("Enter another number:"))
s=sum(x,y)
print("Sum of two numbers:",s)
In the above example, the function sum() takes two arguments and returns their sum to the
receiving variable s.
When a function returns something and the returned value is not received using a variable on the LHS,
the return value will not be available.
For instance, in the above example if we just use the statement sum(x,y) instead of s=sum(x,y),
then the value returned from the function is of no use.
On the other hand, if we use a variable at LHS while calling a void function, it will receive None.
For example,
p=test(var)             #function used in previous example
print(p)
Now, the value of p would be printed as None. Note that None is not a string; it is an object of the
class 'NoneType'. This type of object indicates no value.
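This behaviour can be checked with a small self-contained sketch. The function show() below is hypothetical and stands in for any void function:

```python
def show(msg):              # a void function: it prints, but has no return statement
    print(msg)

p = show("hello")           # the call prints hello; nothing is returned
print(p)                    # None
print(type(p))              # <class 'NoneType'>
```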
Why Functions?
Functions are an essential part of programming because of the following reasons –
Creating a new function gives the programmer an opportunity to name a group of statements,
which makes the program easier to read, understand, and debug.
Functions can make a program smaller by eliminating repetitive code. If any modification is
required, it can be done only at one place.
Dividing a long program into functions allows the programmer to debug the independent functions
separately and then combine all functions to get the solution of original problem.
Well-designed functions are often useful for many programs. The functions written once for a
specific purpose can be re-used in any other program.
Observe that the two values are separated by a space without mentioning anything specific. This is
possible because of the existence of an argument sep in the print() function whose default value is
white space. This argument makes sure that various values to be printed are separated by a space for
a better representation of output.
The programmer has a liberty in Python to give any other character(or string) as a separator by
explicitly mentioning it in print() as shown below –
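A minimal sketch of such a call is given here; the three values (a date) are hypothetical and chosen only to illustrate the slash separator.

```python
day, month, year = 18, 2, 2019      # illustrative values

print(day, month, year)             # default separator is a space: 18 2 2019
print(day, month, year, sep='/')    # slash as separator: 18/2/2019
```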
We can observe that the values have been separated by slash, which is given as a value for the
argument sep. Any other string can be used as a separator in the same way:
>>> dept="CSE"
>>> college="MITM"
>>> print(dept, college, sep='@')
CSE@MITM
If you want to deliberately suppress any separator, then the value of sep can be set with empty string
as shown below –
>>> print("Hello","World", sep='')
HelloWorld
You might have observed that in a Python program, print() adds a new line after printing the data.
In a Python script file, if you have two statements like –
print("Hello")
print("World")
then, the output would be
Hello
World
This may be quite unusual for those who have experienced programming languages like C, C++ etc.
In these languages, one has to specifically insert a new-line character (\n) to get the output in
different lines. But, in Python, a new line is inserted without the programmer's intervention. This is
possible because the print() function in Python has one more special argument end whose default
value is the new-line character. Again, the default value of this argument can be changed by the programmer
as shown below (Run these lines using a script file, but not in the terminal/command prompt) –
print("Hello", end='@')
print("World")
The output would be
Hello@World
Ex1: When multiple variables have to be displayed embedded within a string, the format()
function is useful as shown below –
>>> x=10
>>> y=20
>>> print("x={0}, y={1}".format(x,y))
x=10, y=20
While using format(), the placeholders inside the string are numbered as 0, 1, 2, 3, etc. and the
corresponding values must be provided inside format() in the same order.
Ex2: The format() function can be used to specify the width of the variable (the number of
spaces that the variable should occupy in the output) as well. Consider below given example which
displays a number, its square and its cube.
for x in range(1,5):
    print("{0:1d} {1:3d} {2:4d}".format(x,x**2, x**3))
OUTPUT
1   1    1
2   4    8
3   9   27
4  16   64
Here, 1d, 3d and 4d indicate field widths of 1, 3 and 4 characters respectively, within which the
integers are right-aligned on the output screen.
Ex3: One can use % symbol to have required number of spaces for a variable. This will be useful
in printing floating point numbers.
>>> x=19/3
>>> print(x)
6.333333333333333           #observe number of digits after dot
>>> print("%.3f"%(x))       #only 3 places after decimal point
6.333
>>> x=20/3
>>> y=13/7
>>> print("x=",x, "y=",y)   #observe actual digits
x= 6.666666666666667 y= 1.8571428571428572
>>> print("x=%0.4f, y=%0.2f"%(x,y))
x=6.6667, y=1.86            #observe rounding off digits
MODULE 2
2.1 ITERATION
2.2 STRING
2.3 FILES
MODULE – 2
2.1 ITERATION
Iteration is the process of repeating some task. In real-world programming, we often require a set
of statements to be repeated a certain number of times and/or till a condition is met. Every
programming language provides certain constructs to achieve the repetition of tasks. In this
section, we will discuss various such looping structures.
The syntax of the while loop is –
while condition:
    statement_1
    statement_2
    ………………
    statement_n
statements_after_while
Here, while is a keyword. The condition is evaluated first. As long as its value remains true,
statement_1 to statement_n will be executed. When the condition becomes
false, the loop is terminated and the statements after the loop will be executed. Consider an
example –
n=1
while n<=5:
    print(n)            #observe indentation
    n=n+1
print("over")
In the above example, a variable n is initialized to 1. Then the condition n<=5 is being
checked. As the condition is true, the block of code containing print statement (print(n))
and increment statement (n=n+1) are executed. After these two lines, condition is checked
again. The procedure continues till condition becomes false, that is when n becomes 6. Now,
the while-loop is terminated and next statement after the loop will be executed. Thus, in this
example, the loop is iterated for 5 times.
Note that the variable n is initialized before starting the loop and it is incremented inside the
loop. Such a variable, which changes its value in every iteration and controls the total execution
of the loop, is called an iteration variable or counter variable. If the counter variable is not
updated properly within the loop, then the loop may not terminate and keeps executing
infinitely.
n=1
while True:
    print(n)
    n=n+1
Here, the condition specified for the loop is the constant True, which never becomes false, and
hence the loop never terminates. Sometimes, the condition is given in such a way that it will never
become false, thereby preventing the program control from going out of the loop. This situation may
happen either due to a wrong condition or due to not updating the counter variable.
In some situations, we deliberately want to come out of the loop even before the normal
termination of the loop. For this purpose break statement is used. The following example
depicts the usage of break. Here, the values are taken from keyboard until a negative number
is entered. Once the input is found to be negative, the loop terminates.
while True:
    x=int(input("Enter a number:"))
    if x>=0:
        print("You have entered ",x)
    else:
        print("You have entered a negative number!!")
        break                   #terminates the loop
Sample output:
Enter a number:23
You have entered 23
Enter a number:12
You have entered 12
Enter a number:45
You have entered 45
Enter a number:0
You have entered 0
Enter a number:-2
You have entered a negative number!!
In the above example, we have used the constant True as the condition for the while-loop, which
will never become false. So, there was a possibility of an infinite loop. This has been avoided
by using the break statement with a condition. The condition is kept inside the loop in such a way
that, if the user input is a negative number, the loop terminates. This indicates that the loop
may terminate with just one iteration (if the user gives a negative number the very first time) or
it may take thousands of iterations (if the user keeps on giving only non-negative numbers as input).
Hence, the number of iterations here is unpredictable. But, we are making sure that it will not
be an infinite loop; instead, the user has control over the loop.
Sometimes, the programmer would like to move to the next iteration by skipping a few statements in
the loop, based on some condition. For this purpose the continue statement is used. For
example, we would like to find the sum of 5 even numbers taken as input from the keyboard.
The logic is –
o Read a number from the keyboard
o If that number is odd, without doing anything else, just move to the next iteration for
  reading another number
o If the number is even, add it to sum and increment the counter variable
o When the counter reaches 5, stop the loop and print the sum
sum=0
count=0
while True:
    x=int(input("Enter a number:"))
    if x%2 !=0:
        continue
    else:
        sum+=x
        count+=1
    if count==5:
        break
print("Sum=",sum)
Sample Output:
Enter a number:13
Enter a number:12
Enter a number:4
Enter a number:5
Enter a number:-3
Enter a number:8
Enter a number:7
Enter a number:16
Enter a number:6
Sum= 46
2.1.3 Definite Loops using for
The while loop iterates as long as its condition holds and hence, the number of iterations is usually
unknown prior to the loop. Hence, it is sometimes called an indefinite loop. When we know the
total number of times a set of statements is to be executed, the for loop is used. This is
called a definite loop. The for-loop iterates over a set of numbers, a set of words, lines in
a file etc. The syntax of the for-loop is –
for var in list/sequence:
    statement_1
    statement_2
    ………………
    statement_n
statements_after_for
Ex: In the below given example, a list names containing three strings has been created.
Then the counter variable x in the for-loop iterates over this list. The variable x takes the
elements in names one by one and the body of the loop is executed.
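The example itself is not reproduced in these notes. A minimal sketch is given below; the three strings in names are hypothetical, and the list visited is added only to make the order of iteration easy to check.

```python
names = ["Ram", "Shyam", "Bheem"]     # illustrative list of three strings
visited = []

for x in names:                       # x takes each element of names in turn
    visited.append(x)
    print(x)
```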
NOTE: In Python, list is an important data type. It can take a sequence of elements of different
types. It can take values as a comma separated sequence enclosed within square brackets.
Elements in the list can be extracted using index (just similar to extracting array elements in
C/C++ language). Various operations like indexing, slicing, merging, addition and deletion of
elements etc. can be applied on lists. A detailed discussion on lists will be done in Module
3.
The for loop can be used to print (or extract) all the characters in a string as shown below –
for i in "Hello":
    print(i, end='\t')
Output:
H e l l o
When we have a fixed set of numbers to iterate in a for loop, we can use a function
range(). The function range() takes the following format –
range(start, end, steps)
The arguments start and end indicate the starting and ending values of the sequence, where end is
excluded from the sequence (that is, the sequence is up to end-1). The default value of start is
0. The argument steps indicates the increment/decrement between successive values of the sequence,
with the default value 1. Hence, both start and steps are optional. Let us consider a few examples
on the usage of the range() function.
Output:
0 1 2 3 4
Here, 0 is the default starting value. The statement range(5) is same as range(0,5)
and range(0,5,1).
Output:
5 4 3 2 1
The function range(5,0,-1) indicates that the sequence of values is from 5 to 0 (excluded) in
steps of -1 (downwards).
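The loops that produce the two outputs above are not reproduced in these notes. A sketch consistent with those outputs, assuming the values are printed with a tab as the end argument, would be:

```python
# Values 0 to 4; range(5) is the same as range(0, 5) and range(0, 5, 1)
for i in range(5):
    print(i, end='\t')
print()                    # move to the next line

# Values 5 down to 1; the end value 0 is excluded
for i in range(5, 0, -1):
    print(i, end='\t')
print()
```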
Ex3. Printing only even numbers less than 10–
for i in range(0,10,2):
    print(i, end='\t')
Output:
0 2 4 6 8
The while-loop and for-loop are usually used to go through a list of items or the contents of a
file and to check maximum or minimum data value. These loops are generally constructed
by the following procedure –
o Initializing one or more variables before the loop starts
o Performing some computation on each item in the loop body, possibly changing the
  variables in the body of the loop
o Looking at the resulting variables when the loop completes
The construction of these loop patterns is demonstrated in the following examples.
Counting and Summing Loops: One can use the for loop for counting number of items in
the list as shown –
count = 0
for i in [4, -2, 41, 34, 25]:
    count = count + 1
print("Count:", count)
Here, the variable count is initialized before the loop. Though the counter variable i is not
being used inside the body of the loop, it controls the number of iterations. The variable
count is incremented in every iteration, and at the end of the loop the total number of
elements in the list is stored in it.
One more loop similar to the above is finding the sum of elements in the list –
total = 0
for x in [4, -2, 41, 34, 25]:
    total = total + x
print("Total:", total)
NOTE: In practice, both of the counting and summing loops are not necessary, because there
are built-in functions len() and sum() for the same tasks respectively.
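For comparison, the same two results can be obtained with the built-in functions. The list name values is only illustrative:

```python
values = [4, -2, 41, 34, 25]

print("Count:", len(values))    # Count: 5
print("Total:", sum(values))    # Total: 102
```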
Maximum and Minimum Loops: To find maximum element in the list, the following code
can be used –
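big = None
print('Before Loop:', big)
for x in [12, 0, 21,-3]:
    if big is None or x > big :
        big = x
    print('Iteration Variable:', x, 'Big:', big)
print('Biggest:', big)
Output:
Before Loop: None
Iteration Variable: 12 Big: 12
Iteration Variable: 0 Big: 12
Iteration Variable: 21 Big: 21
Iteration Variable: -3 Big: 21
Biggest: 21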
Similarly, one can have a loop for finding smallest of elements in the list as given below –
small = None
print('Before Loop:', small)
for x in [12, 0, 21,-3]:
    if small is None or x < small :
        small = x
    print('Iteration Variable:', x, 'Small:', small)
print('Smallest:', small)
Output:
Before Loop: None
Iteration Variable: 12 Small: 12
Iteration Variable: 0 Small: 0
Iteration Variable: 21 Small: 0
Iteration Variable: -3 Small: -3
Smallest: -3
NOTE: In Python, there are built-in functions max() and min() to compute the maximum and
minimum values of a sequence. Hence, the above two loops need not be written by the programmer
explicitly. The logic of the built-in function min() is equivalent to the following simplified code –
def min(values):
    smallest = None
    for value in values:
        if smallest is None or value < smallest:
            smallest = value
    return smallest
2.2 STRINGS
A string is a sequence of characters, enclosed either within a pair of single quotes or double
quotes. Each character of a string corresponds to an index number, starting with zero as
shown below –
S = "Hello World"
character   H    e    l    l    o         W    o    r    l    d
index       0    1    2    3    4    5    6    7    8    9    10
The characters of a string can be accessed using index enclosed within square brackets.
For example,
>>> word1="Hello"
>>> word2='hi'
>>> x=word1[1] #2nd character of word1 is extracted
>>> print(x)
e
>>> y=word2[0]      #1st character of word2 is extracted
>>> print(y)
h
Python supports negative indexing of string starting from the end of the string as shown
below –
S = "Hello World"
character       H    e    l    l    o         W    o    r    l    d
Negative index  -11  -10  -9   -8   -7   -6   -5   -4   -3   -2   -1
The characters can be extracted using negative index also. For example,
>>> var="Hello"
>>> print(var[-1])
o
>>> print(var[-4])
e
Whenever the string is too big to remember last positive index, one can use negative index
to extract characters at the end of string.
The index for string varies from 0 to length-1. Trying to use the index value beyond
this range generates error.
>>> var="Hello"
>>> ln=len(var)
>>> ch=var[ln]
IndexError: string index out of range
Output:
H e l l o
In the above example, the for loop is iterated from first to last character of the string st.
That is, in every iteration, the counter variable i takes the values as H, e, l, l and o. The
loop terminates when no character is left in st.
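The loop described above is not reproduced in these notes. A sketch consistent with the output, assuming st holds the string Hello, would be:

```python
st = "Hello"
chars = []                     # collected only to make the order easy to check

for i in st:                   # i takes the characters H, e, l, l, o in turn
    chars.append(i)
    print(i, end='\t')
print()
```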
A slice of a string st is written as st[i:j:k]. This will extract characters from index i of st till
index (j-1) in steps of k. If the first
index i is not present, it means that the slice should start from the beginning of the string. If the
second index j is not mentioned, it indicates the slice should be till the end of the string. The
third parameter k, also known as stride, is used to indicate the number of steps to be
incremented after extracting the first character. The default value of stride is 1.
Consider following examples along with their outputs to understand string slicing.
print(st[3:8:2]) #output is l o
Starting from the character at index 3, till the character at index 7 (index 8 is excluded), every
alternate index is considered.
print(st[-1:]) #output is d
Here, starting index is -1, ending index is not mentioned (means, it takes the index
10) and the stride is default value 1. So, we are trying to print characters from -1 (which
is the last character of negative indexing) till 10th character (which is also the last
character in positive indexing) in incremental order of 1. Hence, we will get only last
character as output.
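The two slices discussed above, along with a few more, can be tried with the sketch below. It assumes st holds the string Hello World, which is consistent with the outputs quoted above; the additional slices are illustrative.

```python
st = "Hello World"

print(st[3:8:2])    # l o          (indices 3, 5 and 7)
print(st[-1:])      # d            (last character only)
print(st[:5])       # Hello        (from the beginning till index 4)
print(st[6:])       # World        (from index 6 till the end)
print(st[::-1])     # dlroW olleH  (negative stride reverses the string)
```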
By the above set of examples, one can understand the power of string slicing and of Python
script. Slicing is a powerful tool of Python which makes many tasks simple pertaining to
sequence data types like strings, lists, tuples etc. (Other types will be discussed in later
Modules)
Here, we are trying to change the 4th character (index 3 means, 4th character as the first index
is 0) to t. The error message clearly states that an assignment of new item (a string) is not
possible on string object. So, to achieve our requirement, we can create a new string using
slices of existing string as below –
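Both the failed assignment and the slice-based workaround are sketched here, assuming st holds the string Hello World (an illustrative value). The try-except block is used only to display the error without stopping the program.

```python
st = "Hello World"

try:
    st[3] = 't'                      # strings are immutable, so this fails
except TypeError as err:
    print("Error:", err)

st1 = st[:3] + 't' + st[4:]          # build a new string from slices of the old one
print(st1)                           # Helto World
print(st)                            # Hello World (the original is unchanged)
```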
st=input("Enter a string:")
ch=input("Enter a character to be counted:")
c=countChar(st,ch)
print("{0} appeared {1} times in {2}".format(ch,c,st))
Sample Output:
Enter a string: hello how are you?
Enter a character to be counted: h
h appeared 2 times in hello how are you?
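The definition of countChar() is not reproduced in these notes. A minimal sketch of such a function, assuming it simply traverses the string and counts matching characters, could be:

```python
def countChar(st, ch):
    """Return the number of times the character ch appears in the string st."""
    count = 0
    for c in st:            # traverse the string one character at a time
        if c == ch:
            count += 1
    return count

print(countChar("hello how are you?", "h"))    # 2
```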
Output is same. As the value contained in st and hello both are same, the equality
results in True.
Output is greater. The ASCII value of h is greater than ASCII value of H. Hence, hello
is greater than Hello.
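The comparisons described above are not reproduced in these notes. A sketch consistent with the two outputs, assuming st holds the string hello, would be:

```python
st = "hello"

if st == "hello":
    print("same")                 # both values are equal

if st > "Hello":
    print("greater")              # 'h' (104) is greater than 'H' (72)

print(ord('h'), ord('H'))         # ord() gives the code of a character: 104 72
```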
NOTE: A programmer must know ASCII values of some of the basic characters. Here are
few –
A–Z : 65 – 90
a–z : 97 – 122
0–9 : 48 – 57
Space : 32
Enter Key : 13
The built-in set of members of any class can be accessed using the dot operator as shown–
objName.memberMethod(arguments)
The dot operator always binds the member name with the respective object name. This is
very essential because, there is a chance that more than one class has members with same
name. To avoid that conflict, almost all Object oriented languages have been designed with
this common syntax of using dot operator. (Detailed discussion on classes and objects will
be done in later Modules.)
The methods are usually called using the object name. This is known as method invocation.
We say that a method is invoked using an object.
capitalize(s) : This function takes one string argument s and returns a capitalized version
of that string. That is, the first character of s is converted to upper case, and all other
characters to lowercase. Observe the examples given below –
Ex1. >>> s="hello"
>>> s1=str.capitalize(s)
>>> print(s1)
Hello #1st character is changed to uppercase
Observe in Ex2 that the first character is converted to uppercase, and an in-between
uppercase letter W of the original string is converted to lowercase.
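Ex2 is not reproduced in these notes. A sketch with an assumed input string containing an in-between uppercase W would be:

```python
s = "hello World"               # illustrative string with an uppercase W in between
s1 = s.capitalize()             # same effect as str.capitalize(s)

print(s1)                       # Hello world
print(s)                        # hello World (the original string is unchanged)
```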
>>> st='HELLO'
>>> st1=st.lower()
>>> print(st1)
hello
>>> print(st) #no change in original string
HELLO
s.find(s1) : The find() function is used to search for a substring s1 in the string s. If
found, the index position of first occurrence of s1 in s, is returned. If s1 is not found in s,
then -1 is returned.
>>> st='hello'
>>> i=st.find('l')
>>> print(i) #output is 2
>>> i=st.find('lo')
>>> print(i) #output is 3
Here, the substring 'cal' is found in the very first position of st, hence the result is 0.
>>> i=st.find('cal',10,20)
>>> print(i) #output is 17
Here, the substring cal is searched in the string st between 10th and 20th position and
hence the result is 17.
>>> i=st.find('cal',10,15)
>>> print(i) #ouput is -1
In this example, the substring 'cal' has not appeared between 10th and 15th
character of st. Hence, the result is -1.
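The string st used in these find() examples is not reproduced in the notes. The sketch below uses an assumed string, chosen only because it gives the same results (0, 17 and -1) as quoted above:

```python
st = "calender of Feb. cal2019"     # assumed string; 'cal' occurs at index 0 and index 17

print(st.find('cal'))               # 0
print(st.find('cal', 10, 20))       # 17  (search restricted to indices 10 to 19)
print(st.find('cal', 10, 15))       # -1  (not found in indices 10 to 14)
```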
s.strip(): Returns a copy of string s by removing leading and trailing white spaces.
The strip() function can be used with an argument chars, so that specified chars are
removed from beginning or ending of s as shown below –
>>> st="###Hello##"
>>> st1=st.strip('#')
>>> print(st1) #all hash symbols are removed
Hello
We can give more than one character for removal as shown below –
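The example is not reproduced in these notes. A minimal sketch, with an illustrative string surrounded by two kinds of symbols, would be:

```python
st = "@###Hello##@"
st1 = st.strip('#@')            # every leading/trailing # or @ is removed
print(st1)                      # Hello

st2 = "   Hello   ".strip()     # without an argument, white spaces are removed
print(st2)                      # Hello
```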
S.startswith(prefix, start, end): This function has 3 arguments, of which start and end
are optional. This function returns True if S starts with the specified prefix, False otherwise.
When the start argument is provided, the check begins from that position and returns True
or False accordingly.
>>> st="hello world"
>>> st.startswith("w",6)        #True because w is at index 6
True
When both start and end arguments are given, the check is restricted to the characters from start
up to (but excluding) end.
S.count(s1, start, end): The count() function takes three arguments – string, starting
position and ending position. This function returns the number of non-overlapping
occurrences of substring s1 in string S in the range of start and end.
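A few calls to count() are sketched below; the strings are illustrative. The last call shows what non-overlapping means.

```python
st = "hello world"

print(st.count('l'))            # 3  (whole string)
print(st.count('l', 0, 5))      # 2  (only within indices 0 to 4)
print(st.count('o', 5))         # 1  (from index 5 till the end)
print("aaaa".count("aa"))       # 2  (occurrences are not allowed to overlap)
```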
There are many more built-in methods for string class. Students are advised to explore
more for further study.
Now, our aim is to extract only ieee.org, which is the domain name. We can think of logic as–
o Identify the position of @, because all domain names in email IDs will be after the
symbol @
o Identify a white space which appears after @ symbol, because that will be the
end of domain name.
o Extract the substring between @ and white-space.
The concept of string slicing and the find() function will be useful here. Consider the code given
below –
Execute above program to get the output as ieee.org. One can apply this logic in a loop,
when our string contains series of email IDs, and we may want to extract all those mail IDs.
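The code is not reproduced in these notes. The three steps listed above can be sketched as follows; the input line and the address in it are hypothetical, and the names atpos, sppos and host are illustrative.

```python
st = "From user@ieee.org Wed Feb 21 09:14:16 2018"   # assumed input line

atpos = st.find('@')            # position of the @ symbol
sppos = st.find(' ', atpos)     # first white space after the @ symbol
host = st[atpos+1:sppos]        # substring between @ and the white space

print(atpos, sppos)             # 9 18
print(host)                     # ieee.org
```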
>>> sum=20
>>> '%d' %sum
'20'                #string '20', but not integer 20
Note that, when applied on both integer operands, the % symbol acts as a modulus operator.
When the first operand is a string, then it is a format operator. Consider few examples
illustrating usage of format operator.
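The examples are not reproduced in these notes. A few illustrative uses of the % symbol, first as the modulus operator and then as the format operator, are sketched here with made-up values:

```python
print(7 % 3)                          # 1, modulus: both operands are integers

print('%d' % 20)                      # 20, integer formatted into a string
print('%.2f' % 3.14159)               # 3.14, two places after the decimal point
print('%s is %d years old' % ('Ram', 20))    # a tuple is used for more than one value
```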
2.3 FILES
File handling is an important requirement of any programming language, as it allows us to
store the data permanently on the secondary storage and read the data from a permanent
source. Here, we will discuss how to perform various operations on files using the
programming language Python.
2.3.1 Persistence
The programs that we have considered till now are based on console I/O. That is, the input
was taken from the keyboard and output was displayed onto the monitor. When the data to
be read from the keyboard is very large, console input becomes a laborious job. Also, when the
output or result of the program has to be used for some other purpose later, it has to be
stored permanently. Hence, reading from and writing to files is an essential requirement of
programming.
We know that the programs stored in the hard disk are brought into main memory to
execute them. These programs generally communicate with CPU using conditional
execution, iteration, functions etc. But, the content of main memory will be erased when we
turn-off our computer. We have discussed these concepts in Module1 with the help of
Figure 1.1. Here we will discuss about working with secondary memory or files. The files
stored on the secondary memory are permanent and can be transferred to other machines
using pen-drives/CD.
In a call of the form fhand = open(filename, mode), the three names have the following meaning:
filename   It is the name of the file to be opened. This string may be just the name of the
           file, or it may include the pathname also. The pathname is optional
           when the file is stored in the current working directory.
mode       This string indicates the purpose of opening a file. It takes a
           pre-defined set of values as given in Table 2.1.
fhand      It is a reference to an object of file class, which acts as a handler or
           tool for all further operations on files.
When our Python program makes a request to open a specific file in a particular mode,
then OS will try to serve the request. When a file gets opened successfully, then a file
object is returned. This is known as file handle and is as shown in Figure 2.1. It will help to
perform various operations on a file through our program. If the file cannot be opened due
to some reason, then error message (traceback) will be displayed.
A file opening may cause an error due to some of the reasons as listed below –
o File may not exist in the specified path (when we try to read a file)
o File may exist, but we may not have a permission to read/write a file
o File might have got corrupted and may not be in an opening state
Since, there is no guarantee about getting a file handle from OS when we try to open a file,
it is always better to write the code for file opening using try-except block. This will help us
to manage error situation.
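A minimal sketch of opening a file inside a try-except block is given here. The file name is hypothetical, and it is placed in a fresh temporary directory only to guarantee that the file does not exist when this sketch is run.

```python
import os
import tempfile

# Hypothetical path inside a new, empty temporary directory: the file cannot exist.
fname = os.path.join(tempfile.mkdtemp(), "missing.txt")

try:
    fhand = open(fname)             # default mode is 'r'
except OSError as err:              # FileNotFoundError, PermissionError are kinds of OSError
    fhand = None
    print("File cannot be opened:", err)
else:
    print(fhand.read())             # runs only when the file was opened successfully
    fhand.close()
```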
Mode   Meaning
r      Opens a file for reading purpose. If the specified file does not exist in the
       specified path, or if you don't have permission, error message will be
       displayed. This is the default mode of open() function in Python.
w      Opens a file for writing purpose. If the file does not exist, then a new file
       with the given name will be created and opened for writing. If the file
       already exists, then its content will be over-written.
a      Opens a file for appending the data. If the file exists, the new content will
       be appended at the end of existing content. If no such file exists, it will be
       created and new content will be written into it.
r+     Opens a file for reading and writing.
w+     Opens a file for both writing and reading. Overwrites the existing file if the
       file exists. If the file does not exist, creates a new file for reading and
       writing.
a+     Opens a file for both appending and reading. The file pointer is at
       the end of the file if the file exists. The file opens in the append
       mode. If the file does not exist, it creates a new file for reading and
       writing.
rb     Opens a file for reading only in binary format
wb     Opens a file for writing only in binary format
ab     Opens a file for appending only in binary format
NOTE: There is one more type of file called binary file, which contains the data in the form
of bits. These files are capable of storing text, image, video, audio etc. All these data will be
stored in the form of a group of bytes whose formatting will be known. The supporting
program can interpret these files properly, whereas when opened using normal text editor,
they look like messy, unreadable set of characters.
NOTE: Before executing the below given program, create a text file (using Notepad or
similar editor) myfile.txt in the current working directory (the directory where you are going
to store your Python program). Open this text file and add a few random lines to it and then
close. Now, open a Python script file, say countLines.py and save it in the same directory
as that of your text file myfile.txt. Then, type the following code in Python script
countLines.py and execute the program. (You can store text file and Python script file in
different directories. But, if you do so, you have to mention complete path of text file in the
open() function.)
print("Total lines=",count)
fhand.close()
Output:
Line Number 1 : hello how are you?
Line Number 2 : I am doing fine
In the above program, initially, we try to open the file 'myfile.txt'. As we have
already created that file, the file handler will be returned and the object reference to this file
will be stored in fhand. Then, in the for-loop, we are using fhand as if it is a sequence of
lines. For each line in the file, we are counting it and printing the line. In fact, a line is
identified internally with the help of the new-line character present at the end of each line.
Though we have not typed \n anywhere in the file myfile.txt, after each line, we would
have pressed the enter-key. This act inserts a \n, which is invisible when we view the file
through Notepad. Once all lines are over, fhand reaches end-of-file and hence the loop
terminates. Note that, when end of file is reached (that is, no more characters are present in
the file), an attempt to read returns the empty string '' (two quotes without
space in between).
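Only the last two lines of the program survive in these notes. A self-contained sketch of the whole program is given below. So that it can be run anywhere, it first creates myfile.txt in a temporary directory with the two lines seen in the output; that setup step is an assumption and is not part of the original program.

```python
import os
import tempfile

# Setup (assumption): create the text file with two sample lines.
path = os.path.join(tempfile.mkdtemp(), "myfile.txt")
with open(path, "w") as f:
    f.write("hello how are you?\nI am doing fine\n")

fhand = open(path)
count = 0
for line in fhand:                  # fhand behaves like a sequence of lines
    count += 1
    print("Line Number", count, ":", line)
print("Total lines=", count)
fhand.close()
```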
Once the operations on a file are completed, it is good practice to close the file using the function
close(). Closing a file ensures that no unwanted operations are done through the file handler.
Moreover, when a file was opened for writing or appending, closing it ensures that
the last bit of data has been written properly into the file and the end-of-file is maintained
properly. If the file handler variable (in the above example, fhand) is assigned some
other file object (using open() function), then Python usually closes the previous file automatically
once it is no longer referenced.
If you run the above program and check the output, there will be a blank line between
each of the output lines. This is because the new-line character \n is also a part of the
variable line in the loop, and the print() function has the default behavior of adding a new line at
the end (due to the default setting of the end parameter of print()). To avoid this double-line
spacing, we can remove the new-line character attached at the end of the variable line by
using the built-in string function rstrip() as below –
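A self-contained sketch of this idea follows. As an assumption made only to keep the sketch runnable, the text file is created in a temporary directory first; the list lines is used just to make the result easy to check.

```python
import os
import tempfile

# Setup (assumption): create the text file with two sample lines.
path = os.path.join(tempfile.mkdtemp(), "myfile.txt")
with open(path, "w") as f:
    f.write("hello how are you?\nI am doing fine\n")

fhand = open(path)
lines = []
for line in fhand:
    line = line.rstrip()            # removes the trailing \n (and other trailing white space)
    lines.append(line)
    print(line)                     # no blank line between the outputs now
fhand.close()
```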
It is obvious from the logic of above program that from a file, each line is read one at a time,
processed and discarded. Hence, there will not be a shortage of main memory even though
we are reading a very large file. But, when we are sure that the size of our file is quite
small, then we can use read() function to read the file contents. This function will read
entire file content as a single string. Then, required operations can be done on this string
using built-in string functions. Consider the below given example –
fhand=open('myfile.txt')
s=fhand.read()
print("Total number of characters:",len(s))
print("String up to 20 characters:", s[:20])
After executing the above program using the previously created file myfile.txt, the output
would be –
Total number of characters: 50
String up to 20 characters: hello how are you?
I
>>> fhand=open("mynewfile.txt","w")
>>> print(fhand)
<_io.TextIOWrapper name='mynewfile.txt' mode='w' encoding='cp1252'>
If the specified file already exists, then its old contents will be erased and it will be ready
for new data to be written into it. If the file does not exist, then a new file with the given
name will be created.
The write() method is used to write data into a file. This method returns number of
characters successfully written into a file. For example,
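The following is a small sketch of write() and its return value. The temporary directory and the sample string are assumptions made for illustration, used only so that no real file is overwritten.

```python
import os
import tempfile

# A temporary directory keeps the sketch from touching any existing file;
# the file name mynewfile.txt follows the example in the text.
path = os.path.join(tempfile.mkdtemp(), "mynewfile.txt")

fhand = open(path, "w")
s = "Introduction to Python\n"
n = fhand.write(s)      # write() returns the number of characters written
print(n)                # same as len(s)
fhand.close()
```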
Now, the file object keeps track of its position in a file. Hence, if we write one more line into
the file, it will be added at the end of previous line. Here is a complete program to write few
lines into a file –
fhand=open('f1.txt','w')
for i in range(5):
    line=input("Enter a line: ")
    fhand.write(line+"\n")
fhand.close()
The above program will ask the user to enter 5 lines in a loop. After every line has been
entered, it will be written into the file. Note that, as the write() method doesn't add a
new-line character on its own, we need to write it explicitly at the end of every line. Once
the loop gets over, the program terminates. Now, we need to check the file f1.txt on the disk
(in the same directory where the above Python code is stored) to find our input lines that
have been written into it.
Now, if we run the above program, we will get the lines which start with h –
hello how are you?
how about you?
fname=input("Enter a file name: ")
fhand=open(fname)
count =0
for line in fhand:
    count+=1
    print("Line Number ",count, ":", line)
print("Total lines=",count)
fhand.close()
In this program, the user input filename is received through variable fname, and the same
has been used as an argument to open() method. Now, if the user input is myfile.txt
(discussed before), then the result would be
Total lines=3
Everything goes well if the user gives a proper file name as input. But what if the input
file cannot be opened (due to some reason like the file does not exist, file permission is
denied etc.)? In that case, Python raises an error. The programmer needs to handle such
run-time errors, as discussed in the next section.
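fname=input("Enter a file name: ")
try:
    fhand=open(fname)
except:
    print("File cannot be opened")
    exit()
count =0
for line in fhand:
    count+=1
    print("Line Number ",count, ":", line)
print("Total lines=",count)
fhand.close()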
In the above program, the command to open a file is kept within a try block. If the specified
file cannot be opened due to any reason, then an error message is displayed saying File
cannot be opened, and the program is terminated. If the file could be opened
successfully, then we proceed further to perform the required task using that file.
2.3.9 Debugging
While performing operations on files, we may need to extract required set of lines or words
or characters. For that purpose, we may use string functions with appropriate delimiters
that may exist between the words/lines of a file. But, usually, the invisible characters like
white-space, tabs and new-line characters are confusing and it is hard to identify them
properly. For example,
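A minimal sketch of such a case is given here. The string s is an assumption, chosen to agree with the repr() output discussed in this section.

```python
s = '1 2\t 3\n 4'
print(s)    # the tab and the new-line show up only as gaps on the screen
```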
Here, by looking at the output, it may be difficult to make out where there is a space, where
there is a tab etc. Python provides a built-in function called repr() to solve this problem.
This function takes any object as an argument and returns a string representation of that
object. For example, the print() in the above code snippet can be modified as –
>>> print(repr(s))
'1 2\t 3\n 4'
Note that some systems use \n as the new-line character, while others use \r\n (carriage
return followed by new-line) or, on some older systems, \r alone. The repr() function helps
in identifying that too.
MODULE – 3
3.1 LISTS
A list is an ordered sequence of values. It is a data structure in Python. The values inside
the lists can be of any type (like integer, float, strings, lists, tuples, dictionaries etc) and are
called as elements or items. The elements of lists are enclosed within square brackets. For
example,
ls1=[10,-4, 25, 13]
ls2=["Tiger", "Lion", "Cheetah"]
Here, ls1 is a list containing four integers, and ls2 is a list containing three strings. A list
need not contain data of same type. We can have mixed type of elements in list. For
example,
ls3=[3.5, 'Tiger', 10, [3,4]]
Here, ls3 contains a float, a string, an integer and a list. This illustrates that a list can be
nested as well.
In fact, list() is the name of a method (special type of method called as constructor –
which will be discussed in Module 4) of the class list. Hence, a new list can be created
using this function by passing arguments to it as shown below –
>>> ls2=list([3,4,1])
>>> print(ls2)
[3, 4, 1]
Observe here that, the inner list is treated as a single element by outer list. If we would like
to access the elements within inner list, we need to use double-indexing as shown below –
>>> print(ls[2][0])
2
>>> print(ls[2][1])
3
Note that the indexing for the inner list again starts from 0. Thus, when we are using
double-indexing, the first index indicates the position of the inner list inside the outer
list, and the second index gives the position of a particular value within the inner list.
Unlike strings, lists are mutable. That is, using indexing, we can modify any value within list.
In the following example, the 3rd element (i.e. index is 2) is being modified –
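A small sketch of such a modification follows. The list name nums and its values are assumptions made for illustration.

```python
nums = [10, 20, 30, 40]    # sample values, assumed for illustration
nums[2] = 35               # the 3rd element (index 2) is replaced in place
print(nums)                # [10, 20, 35, 40]
```

The same kind of assignment on a string, such as s[2]='x', raises a TypeError because strings are immutable.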
The list can be thought of as a relationship between indices and elements. This relationship
is called as a mapping. That is, each index maps to one of the elements in a list.
34
hi
[2,3]
-5
List elements can be accessed with the combination of range() and len() functions as well –
ls=[1,2,3,4]
for i in range(len(ls)):
    ls[i]=ls[i]**2
>>> ls1=[1,2,3]
>>> ls2=[5,6,7]
>>> print(ls1+ls2) #concatenation using +
[1, 2, 3, 5, 6, 7]
>>> ls1=[1,2,3]
>>> print(ls1*3) #repetition using *
[1, 2, 3, 1, 2, 3, 1, 2, 3]
t=['a','b','c','d','e']        #general form of a slice is t[i:j:k]
Extracting full list without using any index, but only a slicing operator –
>>> print(t[:])
['a', 'b', 'c', 'd', 'e']
append(): This method is used to add a new element at the end of a list.
>>> ls=[1,2,3]
>>> ls.append('hi')
>>> ls.append(10)
>>> print(ls)
[1, 2, 3, 'hi', 10]
extend(): This method takes a list as an argument and all the elements in this list
are added at the end of the invoking list.
>>> ls1=[1,2,3]
>>> ls2=[5,6]
>>> ls2.extend(ls1)
>>> print(ls2)
[5, 6, 1, 2, 3]
Similarly, calling ls1.extend(ls2) on the original lists would make ls1 equal to [1, 2, 3, 5, 6].
sort(): This method is used to sort the contents of the list. By default, the function
will sort the items in ascending order.
When we want a list to be sorted in descending order, we need to set the argument
as shown –
>>> ls.sort(reverse=True)
>>> print(ls)
[16, 10, 5, 3, -2]
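The two orderings can be seen side by side in the sketch below. The starting values of ls are an assumption, chosen to agree with the descending output shown above.

```python
ls = [3, 10, 5, 16, -2]     # sample values (assumed)
ls.sort()                   # ascending order, the list itself is changed
ascending = list(ls)        # a copy is kept for display
ls.sort(reverse=True)       # descending order
descending = list(ls)
print(ascending)            # [-2, 3, 5, 10, 16]
print(descending)           # [16, 10, 5, 3, -2]
```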
clear(): This method removes all the elements in the list and makes the list empty.
>>> ls=[1,2,3]
>>> ls.clear()
>>> print(ls)
[]
index(): This method is used to get the index position of a particular value in the list.
>>> ls=[4, 2, 10, 5, 3, 2, 6]
>>> ls.index(2)
1
Here, the number 2 is found at the index position 1. Note that, this function will give
index of only the first occurrence of a specified value. The same function can be
used with two more arguments start and end to specify a range within which the
search should take place.
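A short sketch of index() with the start and end arguments is given here. The variable names first, later and found are only illustrative.

```python
ls = [4, 2, 10, 5, 3, 2, 6]
first = ls.index(2)          # searches the whole list, gives 1
later = ls.index(2, 3, 7)    # searches positions 3 to 6 only, gives 5
print(first, later)

# index() raises ValueError when the value is not present in the given range
try:
    ls.index(2, 2, 5)        # positions 2 to 4 hold 10, 5 and 3
    found = True
except ValueError:
    found = False
print(found)
```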
Here, the argument ls1 for the append() function is treated as one item, and made as
an inner list to ls2. On the other hand, if we replace append() by extend() then the
result would be –
>>> ls1=[1,2,3]
>>> ls2=[5,6]
>>> ls2.extend(ls1)
>>> print(ls2)
[5, 6, 1, 2, 3]
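The difference between append() and extend() can be sketched as below. The names a and b are illustrative copies of the list [5, 6].

```python
ls1 = [1, 2, 3]

a = [5, 6]
a.append(ls1)     # ls1 is added as ONE element (an inner list)

b = [5, 6]
b.extend(ls1)     # each element of ls1 is added separately

print(a)                  # [5, 6, [1, 2, 3]]
print(b)                  # [5, 6, 1, 2, 3]
print(len(a), len(b))     # 3 5
```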
2. The sort() function can be applied only when the list contains elements of compatible
types. But, if a list is a mix of non-compatible types like integers and strings, the
comparison cannot be done. Hence, Python will raise a TypeError. For example,
>>> ls=[34,[2,3],5]
>>> ls.sort()
TypeError: '<' not supported between instances of 'list' and 'int'
Integers and floats are compatible and relational operations can be performed on them.
Hence, we can sort a list containing such items.
3. The sort() function accepts one important optional argument, key. It is useful when a
list contains tuples. We will discuss tuples later in this Module.
4. Most of the list methods like append(), extend(), sort(), reverse() etc. modify the list
object internally and return None.
>>> ls=[2,3]
>>> ls1=ls.append(5)
>>> print(ls)
[2, 3, 5]
>>> print(ls1)
None
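As a hedged illustration of the key argument mentioned in point 3: key takes a function that is called on every element, and the list is sorted by the values that function returns. The sample data and the helper function second() below are assumptions made for illustration.

```python
words = ['forest', 'to', 'Seeta', 'went']
words.sort(key=len)            # sort by length of each word
print(words)                   # ['to', 'went', 'Seeta', 'forest']

def second(pair):
    # hypothetical helper: returns the second item of a pair
    return pair[1]

pairs = [('Tom', 3491), ('Jerry', 8135), ('Donald', 4793)]
pairs.sort(key=second)         # sort by telephone number, not by name
print(pairs)
```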
When an element at a particular index position has to be deleted, then we can give
that position as argument to pop() function.
>>> t = ['a', 'b', 'c']
>>> x = t.pop(1) #item at index 1 is popped
>>> print(t)
['a', 'c']
>>> print(x)
b
remove(): When we don't know the index, but know the value to be removed, then
this function can be used.
Note that, this function will remove only the first occurrence of the specified value,
but not all occurrences.
>>> ls=[5,8, -12, 34, 2, 6, 34]
>>> ls.remove(34)
>>> print(ls)
[5, 8, -12, 2, 6, 34]
Unlike pop() function, the remove() function will not return the value that has been
deleted.
del: This is a statement used to delete the item at a given index, or more than one item
at a time when a slice is given. Here also, we will not get the items deleted.
>>> ls=[3,6,-2,8,1]
>>> del ls[2] #item at index 2 is deleted
>>> print(ls)
[3, 6, 8, 1]
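Deleting more than one item at a time is done by giving a slice to del, as in this small sketch.

```python
ls = [3, 6, -2, 8, 1]
del ls[1:3]        # items at index 1 and 2 are deleted together
print(ls)          # [3, 8, 1]
```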
>>> avg=sum(ls)/len(ls)
>>> print(avg)
11.857142857142858
When we need to read the data from the user and to compute sum and average of those
numbers, we can write the code as below –
ls= list()
while (True):
    x= input('Enter a number: ')
    if x== 'done':
        break
    x= float(x)
    ls.append(x)
average=sum(ls)/len(ls)
print('Average:', average)
Sample input, with the list after each entry shown on the right:
1        ls=[1.0]
2        ls=[1.0, 2.0]
3        ls=[1.0, 2.0, 3.0]
done
In the above program, we initially create an empty list. Then, we start an infinite
while-loop. As every input from the keyboard is in the form of a string, we need to convert x
into float type and then append it to the list. When the keyboard input is the string 'done',
the loop is terminated. After the loop, we find the average of those numbers with the help
of the built-in functions sum() and len().
The method list() breaks a string into individual letters and constructs a list. If we want a list
of words from a sentence, we can use the following code –
>>> s="Hello how are you?"
>>> ls=s.split()
>>> print(ls)
['Hello', 'how', 'are', 'you?']
Note that, when no argument is provided, the split() function takes the delimiter as white
space. If we need a specific delimiter for splitting the lines, we can use as shown in
following example –
>>> dt="20/03/2018"
>>> ls=dt.split('/')
>>> print(ls)
['20', '03', '2018']
There is a method join() which behaves opposite to split() function. It takes a list of strings
as argument, and joins all the strings into a single string based on the delimiter provided.
For example –
>>> ls=["Hello", "how", "are", "you"]
>>> d=' '
>>> d.join(ls)
'Hello how are you'
Here, we have taken delimiter d as white space. Apart from space, anything can be taken
as delimiter. When we don't need any delimiter, use empty string as delimiter.
Consider a situation –
a= "hi"
b= "hi"
Now, the question is whether both a and b refer to the same string. There are two
possible states –
State 1:  a ---> 'hi'          State 2:  a ---> 'hi' <--- b
          b ---> 'hi'
In the first situation, a and b are two different objects containing the same value. A
modification of one object has nothing to do with the other. In the second case,
both a and b refer to the same object. That is, a is an alias name for b and vice-versa.
In other words, these two refer to the same memory location.
To check whether two variables are referring to same object or not, we can use is operator.
>>> a= "hi"
>>> b= "hi"
>>> a is b #result is True
>>> a==b #result is True
When two variables refer to the same object, they are called identical objects.
When two variables refer to different objects that contain the same value, they are
known as equivalent objects. For example,
>>> s1=input("Enter a string:") #assume you entered hello
>>> s2=input("Enter a string:") #assume you entered hello
>>> s1 is s2 #output is False
>>> s1==s2 #output is True
If two objects are identical, they are also equivalent, but if they are equivalent, they are not
necessarily identical.
In the standard Python implementation (CPython), identical string literals are usually
interned. That is, when two string literals are created in the program with the same value,
they typically refer to the same object. This is an implementation detail and should not be
relied upon. Strings read from the keyboard do not have this behavior, because their values
depend on the user's choice.
>>> ls1=[1,2,3]
>>> ls2=[1,2,3]
>>> ls1 is ls2 #output is False
>>> ls1 == ls2 #output is True
3.1.10 Aliasing
When an object is assigned to other using assignment operator, both of them will refer to
same object in the memory. The association of a variable with an object is called as
reference.
>>> ls1=[1,2,3]
>>> ls2= ls1
>>> ls1 is ls2 #output is True
Now, ls2 is said to be reference of ls1. In other words, there are two references to the
same object in the memory.
An object with more than one reference has more than one name, hence we say that object
is aliased. If the aliased object is mutable, changes made in one alias will reflect the other.
>>> ls2[1]= 34
>>> print(ls1) #output is [1, 34, 3]
def del_front(t):
    del t[0]
Here, the argument ls and the parameter t both are aliases to same object.
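The aliasing between argument and parameter can be seen in the following sketch. The values in ls are assumed for illustration.

```python
def del_front(t):
    del t[0]          # modifies the very list object that the caller passed

ls = ['a', 'b', 'c']  # sample values (assumed)
del_front(ls)         # inside the function, t is an alias for ls
print(ls)             # ['b', 'c']
```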
One should understand the operations that will modify the list and the operations that
create a new list. For example, the append() function modifies the list, whereas the +
operator creates a new list.
>>> t1 = [1, 2]
>>> t2 = t1.append(3)
>>> print(t1) #output is [1, 2, 3]
>>> print(t2) #prints None
>>> t3 = t1 + [5]
>>> print(t3) #output is [1, 2, 3, 5]
>>> t2 is t3 #output is False
Here, after applying append() on t1 object, the t1 itself has been modified and t2 is not
going to get anything. But, when + operator is applied, t1 remains same but t3 will get the
updated result.
The programmer should understand such differences when he/she creates a function
intending to modify a list. For example, the following function has no effect on the original
list –
def test(t):
    t=t[1:]
ls=[1,2,3]
test(ls)
print(ls) #prints [1, 2, 3]
def test(t):
    return t[1:]
ls=[1,2,3]
ls1=test(ls)
print(ls1) #prints [2, 3]
print(ls) #prints [1, 2, 3]
In the above example also, the original list is not modified, because the slice t[1:] creates
a new list object. The function returns that new object, and it is assigned to the LHS
variable at the position of the function call.
3.2 DICTIONARIES
A dictionary is a collection of key:value pairs, with the requirement that the keys are
unique within one dictionary. (Since Python 3.7, a dictionary preserves the order in which
keys were inserted; older versions gave no ordering guarantee.) Unlike lists and strings,
where elements are accessed using index values (which are integers), the values in a
dictionary are accessed using keys. A key in a dictionary can be any immutable (hashable)
type like strings, numbers and tuples. (A tuple can be used as a dictionary key only if
all of its elements are themselves hashable, such as strings, numbers or sub-tuples.)
As lists are mutable – that is, can be modified using index assignments, slicing, or using
methods like append(), extend() etc. – they cannot be a key for a dictionary.
One can think of a dictionary as a mapping between set of indices (which are actually keys)
and a set of values. Each key maps to a value.
To initialize a dictionary at the time of creation itself, one can use the code like –
>>> tel_dir={'Tom': 3491, 'Jerry':8135}
>>> print(tel_dir)
{'Tom': 3491, 'Jerry': 8135}
>>> tel_dir['Donald']=4793
>>> print(tel_dir)
{'Tom': 3491, 'Jerry': 8135, 'Donald': 4793}
NOTE that items in a dictionary are located by key, not by position. In versions of Python
before 3.7 the order of elements was unpredictable; from Python 3.7 onwards, insertion order
is preserved, so 'Tom': 3491 stays before 'Jerry': 8135 in the above example. Even so,
dictionary members are not indexed over integers, so tel_dir[0] does not mean the first item.
Using a key, we can extract its associated value as shown below –
>>> print(tel_dir['Jerry'])
8135
Here, the key 'Jerry' maps with the value 8135, hence it doesn't matter where exactly it
is inside the dictionary.
If a particular key is not there in the dictionary and if we try to access such key, then the
KeyError is generated.
>>> print(tel_dir['Mickey'])
KeyError: 'Mickey'
The len() function on dictionary object gives the number of key-value pairs in that object.
>>> print(tel_dir)
{'Tom': 3491, 'Jerry': 8135, 'Donald': 4793}
>>> len(tel_dir)
3
The in operator can be used to check whether any key (not value) appears in the dictionary
object.
>>> 'Mickey' in tel_dir #output is False
>>> 'Jerry' in tel_dir #output is True
>>> 3491 in tel_dir #output is False
We observe from above example that the value 3491 is associated with the key 'Tom' in
tel_dir. But, the in operator returns False.
The dictionary object has a method values() which returns a view object containing all the
values associated with the keys of a dictionary. If we would like to check whether a
particular value exists in a dictionary, we can make use of it as shown below –
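A minimal sketch of such a check, using the tel_dir dictionary from the earlier examples (the names found and missing are illustrative):

```python
tel_dir = {'Tom': 3491, 'Jerry': 8135, 'Donald': 4793}

found = 3491 in tel_dir.values()      # the value 3491 is present
missing = 1253 in tel_dir.values()    # the value 1253 is not present
print(found, missing)                 # True False
```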
Each of the above methods will perform same task, but the logic of implementation will be
different. Here, we will see the implementation using dictionary.
It can be observed from the output that, a dictionary is created here with characters as keys
and frequencies as values. Note that, here we have computed histogram of counters.
Dictionary in Python has a method called get(), which takes a key and a default value as
two arguments. If the key is found in the dictionary, then get() returns the
corresponding value, otherwise it returns the default value. For example, d.get('h',0) returns
the count stored for 'h', or 0 if 'h' is not yet a key. The histogram program can be written as –
s=input("Enter a string:")
d=dict()
for ch in s:
    d[ch]=d.get(ch,0)+1
print(d)
In the above program, for every character ch in a given string, we try to retrieve a
value. When ch is found in d, its value is retrieved, 1 is added to it, and the result is
stored back. If ch is not found, 0 is taken as the default and then 1 is added to it.
Output would be –
Tom 3491
Jerry 8135
Mickey 1253
Note that, while accessing items from a dictionary, the keys come in insertion order (or, in
older versions of Python, in an unpredictable order), which need not be alphabetical. If we
want to print the keys in alphabetical order, then we need to make a list of the keys, and
then sort that list. We can do so using the keys() method of dictionary and the sort() method
of lists. Consider the following code –
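A minimal sketch of this idea, using the telephone directory entries from the output above:

```python
tel_dir = {'Tom': 3491, 'Jerry': 8135, 'Mickey': 1253}

ls = list(tel_dir.keys())    # make a list of the keys
ls.sort()                    # sort the keys alphabetically
for k in ls:
    print(k, tel_dir[k])     # Jerry first, then Mickey, then Tom
```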
Note: The key-value pair from dictionary can be together accessed with the help of a
method items() as shown –
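A short sketch of items() in a loop follows. The list pairs is an assumption, added only so that the result can be inspected afterwards.

```python
tel_dir = {'Tom': 3491, 'Jerry': 8135, 'Mickey': 1253}

pairs = []
for k, v in tel_dir.items():   # each item is a (key, value) pair
    print(k, v)
    pairs.append((k, v))
```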
The usage of comma-separated list k,v here is internally a tuple (another data structure in
Python, which will be discussed later).
Now, we need to count the frequency of each of the word in this file. So, we need to take
an outer loop for iterating over entire file, and an inner loop for traversing each line in a file.
Then in every line, we count the occurrence of a word, as we did before for a character.
The program is given as below –
d=dict()
print(d)
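The complete loop structure described above can be sketched as follows. To keep the sketch self-contained, io.StringIO stands in for the file handle, and its text is the assumed contents of myfile.txt; both are assumptions made for illustration.

```python
import io

# io.StringIO stands in for fhand = open('myfile.txt') (assumption)
fhand = io.StringIO("hello how are you?\nI am doing fine.\nHow about you?\n")

d = dict()
for line in fhand:               # outer loop: one line at a time
    for word in line.split():    # inner loop: one word at a time
        d[word] = d.get(word, 0) + 1
print(d)
```

In this sketch 'how' and 'How' are counted as different words, and 'you?' keeps its question mark, because the text is split only on white space.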
The output of this program when the input file is myfile.txt would be –
While solving problems on text analysis, machine learning, data analysis etc. such kinds of
treatment of words lead to unexpected results. So, we need to be careful in parsing the text
and we should try to eliminate punctuation marks, ignoring the case etc. The procedure is
discussed in the next section.
The str class has a method maketrans() which returns a translation table usable for another
method translate(). Consider the following syntax to understand it more clearly –
The above statement replaces the characters in fromstr with the character in the same
position in tostr and delete all characters that are in deletestr. The fromstr and
tostr can be empty strings and the deletestr parameter can be omitted.
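A small sketch of maketrans() and translate() is given below. The sample strings and variable names are assumptions made for illustration; string.punctuation is the standard collection of punctuation characters in the string module.

```python
import string

line = "Hello, how are you?"

# empty fromstr and tostr: nothing is replaced, all punctuation is deleted
table = str.maketrans('', '', string.punctuation)
clean = line.translate(table).lower()
print(clean)       # hello how are you

# 'a' is replaced by 'x', 'b' by 'y', and 'c' is deleted
swapped = "abcab".translate(str.maketrans('ab', 'xy', 'c'))
print(swapped)     # xyxy
```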
Using these functions, we will re-write the program for finding frequency of words in a file.
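import string
fname=input("Enter file name: ")
try:
    fhand=open(fname)
except:
    print("File cannot be opened")
    exit()
d=dict()
for line in fhand:
    line=line.rstrip()
    line=line.translate(line.maketrans('','',string.punctuation))
    line=line.lower()
    for word in line.split():
        d[word]=d.get(word,0)+1
print(d)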
Comparing the output of this modified program with the previous one, we can make out that
the punctuation marks are no longer considered while parsing and that the case of the
letters is ignored.
3.2.5 Debugging
When we are working with big datasets (like file containing thousands of pages), it is
difficult to debug by printing and checking the data by hand. So, we can follow any of the
following procedures for easy debugging of the large datasets –
Scale down the input: If possible, reduce the size of the dataset. For example if the
program reads a text file, start with just first 10 lines or with the smallest example you
can find. You can either edit the files themselves, or modify the program so it reads only
the first n lines. If there is an error, you can reduce n to the smallest value that
manifests the error, and then increase it gradually as you correct the errors.
Check summaries and types: Instead of printing and checking the entire dataset,
consider printing summaries of the data: for example, the number of items in a
dictionary or the total of a list of numbers. A common cause of runtime errors is a value
that is not the right type. For debugging this kind of error, it is often enough to print the
type of a value.
Write self-checks: Sometimes you can write code to check for errors automatically. For
example, if you are computing the average of a list of numbers, you could check that the
result is not greater than the largest element in the list or less than the smallest. This is
called a sanity check because it detects results that are “completely illogical”. Another
kind of check compares the results of two different computations to see if they are
consistent. This is called a consistency check.
Pretty print the output: Formatting debugging output can make it easier to spot an
error.
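The ideas of checking summaries, types and sanity checks can be sketched as below. The function name average(), the tolerance eps and the sample data are assumptions made for illustration.

```python
def average(nums):
    avg = sum(nums) / len(nums)
    # sanity check: an average can never lie outside the range of the data
    # (eps is a small tolerance for floating-point rounding)
    eps = 1e-9
    assert min(nums) - eps <= avg <= max(nums) + eps, "average is out of range"
    return avg

data = [3, 6, -2, 8, 1]
print(type(data), len(data))   # summary: type and number of items
result = average(data)
print(result)                  # 3.2
```

If the assertion fails, Python stops with an AssertionError at that point, which makes the faulty computation easy to locate.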
Here we will consider the implementation of a dictionary of n records with keys k1, k2 …kn.
Hashing is based on the idea of distributing keys among a one-dimensional array
H[0…m-1], called hash table.
For each key, a value is computed using a predefined function called hash function. This
function assigns an integer, called hash address, between 0 and m-1 to each key. Based
on the hash address, the keys will be distributed in a hash table.
For example, if the keys k1, k2, …., kn are integers, then a hash function can be
h(K) = K mod m.
Let us take keys as 65, 78, 22, 30, 47, 89. And let hash function be,
h(k) = k%10.
Then the hash addresses may be any value from 0 to 9. For each key, hash address will
be computed as –
h(65) = 65 %10 = 5
h(78) = 78%10 = 8
h(22)= 22 % 10 =2
h(30)= 30 %10 =0
h(47) = 47 %10 = 7
h(89)=89 % 10 = 9
Hash Collisions: Let us have n keys and the hash table is of size m such that m<n. As
each key will have an address with any value between 0 to m-1, it is obvious that more than
one key will have same hash address. That is, two or more keys need to be hashed into the
same cell of hash table. This situation is called as hash collision.
In the worst case, all the keys may be hashed into same cell of hash table. But, we can
avoid this by choosing proper size of hash table and hash function. Anyway, every hashing
scheme must have a mechanism for resolving hash collision. There are two methods for
hash collision resolution, viz.
Open hashing
closed hashing
Open Hashing (or Separate Chaining): In open hashing, keys are stored in linked lists
attached to cells of a hash table. Each list contains all the keys hashed to its cell. For
example, consider the elements
65, 78, 22, 30, 47, 89, 55, 42, 18, 29, 37.
If we take the hash function as h(k)= k%10, then the hash addresses will be –
h(65) = 65 %10 = 5 h(78) = 78%10 = 8
h(22)= 22 % 10 =2 h(30)= 30 %10 =0
h(47) = 47 %10 = 7 h(89)=89 % 10 = 9
h(55)=55%10 =5 h(42)=42%10 =2
h(18)=18%10 =8 h(29)=29%10=9
h(37)=37%10 =7
Cell:   0    1    2    3    4    5    6    7    8    9
        30        22             65        47   78   89
                  42             55        37   18   29
Operations on Hashing:
Searching: Now, if we want to search for the key element in a hash table, we need
to find the hash address of that key using same hash function. Using the obtained
hash address, we need to search the linked list by tracing it, till either the key is
found or list gets exhausted.
Insertion: Insertion of a new element into the hash table is done in a similar manner.
The hash address is obtained for the new element, and the element is inserted at the end
of the list for that particular cell.
Deletion: Deletion of element is done by searching that element and then deleting it
from a linked list.
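The three operations can be sketched as below. This is only an illustration: Python lists stand in for the linked lists, and the names M, h(), table, insert(), search() and delete() are assumptions, not part of any library.

```python
M = 10                                # size of the hash table

def h(k):
    return k % M                      # hash function h(k) = k % 10

table = [[] for _ in range(M)]        # one (initially empty) chain per cell

def insert(k):
    table[h(k)].append(k)             # add at the end of the chain

def search(k):
    return k in table[h(k)]           # trace only the chain of that cell

def delete(k):
    if search(k):
        table[h(k)].remove(k)

for k in [65, 78, 22, 30, 47, 89, 55, 42, 18, 29, 37]:
    insert(k)

print(table[5])      # [65, 55]
print(search(42))    # True
delete(42)
print(search(42))    # False
```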
Closed Hashing (or Open Addressing): In this technique, all keys are stored in the
hash table itself without using linked lists. Different methods can be used to resolve hash
collisions. The simplest technique is linear probing.
This method suggests to check the next cell from where the collision occurs. If that cell is
empty, the key is hashed there. Otherwise, we will continue checking for the empty cell in a
circular manner. Thus, in this technique, the hash table size must be at least as large as
the total number of keys. That is, if we have n elements to be hashed, then the size of hash
table should be greater or equal to n.
Example: Consider the elements 65, 78, 18, 22, 30, 89, 37, 55, 42
Let us take the hash function as h(k)= k%10, then the hash addresses will be –
h(65) = 65 %10 = 5 h(78) = 78%10 = 8
h(18)=18%10 =8 h(22)= 22 % 10 =2
h(30)= 30 %10 =0 h(89)=89 % 10 = 9
h(37)=37%10 =7 h(55)=55%10 =5
h(42)=42%10 =2
Since there are 9 elements in the list, our hash table should at least be of size 9. Here we
are taking the size as 10.
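Linear probing for this example can be sketched as follows. The names M, table, insert() and search() are assumptions made for illustration, and None marks an empty cell.

```python
M = 10
table = [None] * M                # None marks an empty cell

def insert(k):
    i = k % M                     # hash address
    for _ in range(M):            # at most M cells are probed
        if table[i] is None:
            table[i] = k
            return i
        i = (i + 1) % M           # next cell, in a circular manner
    raise OverflowError("hash table is full")

def search(k):
    i = k % M
    for _ in range(M):
        if table[i] is None:      # an empty cell ends the search
            return -1
        if table[i] == k:
            return i
        i = (i + 1) % M
    return -1

for k in [65, 78, 18, 22, 30, 89, 37, 55, 42]:
    insert(k)

print(table)         # [30, 89, 22, 42, None, 65, 55, 37, 78, 18]
print(search(89))    # 1
```

Here 18 collides with 78 at cell 8 and moves to cell 9; 89 then finds cells 9 and 0 occupied and settles in cell 1.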
Drawbacks:
Searching may become like a linear search and hence not efficient.
3.3 TUPLES
A tuple is a sequence of items, similar to lists. The values stored in the tuple can be of any
type and they are indexed using integers. Unlike lists, tuples are immutable. That is, values
within tuples cannot be modified/reassigned. Tuples are comparable and hashable objects.
Hence, they can be made as keys in dictionaries.
A tuple can be created in Python as a comma separated list of items – may or may not be
enclosed within parentheses.
If we would like to create a tuple with single value, then just a parenthesis will not suffice.
For example,
Thus, to have a tuple with single item, we must include a comma after the item. That is,
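A minimal sketch showing the difference the comma makes (the names x and t are illustrative):

```python
x = (3)       # only parentheses: this is just the integer 3
t = (3,)      # trailing comma: a tuple with a single item

print(type(x))    # <class 'int'>
print(type(t))    # <class 'tuple'>
print(len(t))     # 1
```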
>>> t2=tuple()
>>> type(t2)
<class 'tuple'>
If we provide an argument of type sequence (a list, a string or tuple) to the method tuple(),
then a tuple with the elements in a given sequence will be created –
>>> t=tuple('Hello')
>>> print(t)
('H', 'e', 'l', 'l', 'o')
>>> t=tuple([3,[12,5],'Hi'])
>>> print(t)
(3, [12, 5], 'Hi')
Note that, in the above example, both t and t1 objects are referring to same memory
location. That is, t1 is a reference to t.
Elements in the tuple can be extracted using square-brackets with the help of indices.
Similarly, slicing also can be applied to extract required number of items from tuple.
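A short sketch of indexing and slicing on a tuple. The tuple of fruit names is an assumption, chosen to agree with the examples that follow.

```python
t = ('Mango', 'Banana', 'Apple')

first = t[0]         # indexing starts from 0
last_two = t[1:]     # slicing a tuple gives a new tuple
print(first)         # Mango
print(last_two)      # ('Banana', 'Apple')
```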
Modifying the value in a tuple generates error, because tuples are immutable –
>>> t[0]='Kiwi'
TypeError: 'tuple' object does not support item assignment
We wanted to replace 'Mango' by 'Kiwi', which did not work using assignment. But, a tuple
can be replaced with another tuple involving required modifications –
>>> t=('Kiwi',)+t[1:]
>>> print(t)
('Kiwi', 'Banana', 'Apple')
>>> (1,2,3)==(1,2,5)
False
>>> (3,4)==(3,4)
True
The meaning of < and > in tuples is not exactly less than and greater than, instead, it
means comes before and comes after. Hence in such cases, we will get results different
from checking equality (==).
>>> (1,2,3)<(1,2,5)
True
>>> (3,4)<(5,2)
True
When we use relational operator on tuples containing non-comparable types, then
TypeError will be thrown.
>>> (1,'hi')<('hello','world')
TypeError: '<' not supported between instances of 'int' and 'str'
The sort() function internally works on similar pattern – it sorts primarily by first element, in
case of tie, it sorts on second element and so on. This pattern is known as DSU –
Decorate a sequence by building a list of tuples with one or more sort keys
preceding the elements from the sequence,
Sort the list of tuples using the Python built-in sort(), and
Undecorate by extracting the sorted elements of the sequence.
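The three DSU steps can be sketched as a program that sorts the words of a sentence from longest to shortest. The sentence and the variable names are assumptions, chosen to agree with the list of tuples shown below.

```python
txt = 'Ram and Seeta went to forest with Lakshman'   # assumed sentence
words = txt.split()

t = list()
for word in words:
    t.append((len(word), word))     # Decorate: (length, word)
print('The list is:', t)

t.sort(reverse=True)                # Sort: longest word first

res = list()
for length, word in t:
    res.append(word)                # Undecorate: keep only the word
print('The sorted list:', res)
```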
The list is: [(3, 'Ram'), (3, 'and'), (5, 'Seeta'), (4, 'went'),
(2, 'to'), (6, 'forest'), (4, 'with'), (8, 'Lakshman')]
In the above program, we have split the sentence into a list of words. Then, a tuple
containing length of the word and the word itself are created and are appended to a list.
Observe the output of this list – it is a list of tuples. Then we are sorting this list in
descending order. Now for sorting, length of the word is considered, because it is a first
element in the tuple. At the end, we extract length and word in the list, and create another
list containing only the words and print it.
>>> x,y=10,20
>>> print(x) #prints 10
>>> print(y) #prints 20
When we have list of items, they can be extracted and stored into multiple variables as
below –
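A minimal sketch of extracting list items into variables (the list contents are assumed for illustration):

```python
ls = ['hello', 'world']
x, y = ls          # the number of variables must match the number of items
print(x)           # hello
print(y)           # world
```

If the number of variables on the left differs from the number of items in the list, Python raises a ValueError.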
The best known example of assignment of tuples is swapping two values as below –
>>> a=10
>>> b=20
>>> a, b = b, a
>>> print(a, b) #prints 20 10
While doing assignment of multiple variables, the RHS can be any type of sequence like
list, string or tuple. Following example extracts user name and domain from an email ID.
>>> email='chetanahegde@ieee.org'
>>> usrName, domain = email.split('@')
>>> print(usrName) #prints chetanahegde
>>> print(domain) #prints ieee.org
As a dictionary keeps its contents in insertion order (from Python 3.7 onwards), which need
not be sorted order, we can use sort() on a list of its items and then print in the
required order as below –
>>> d = {'b':1, 'a':10, 'c':22}
>>> t = list(d.items())
>>> print(t)
[('b', 1), ('a', 10), ('c', 22)]
>>> t.sort()
>>> print(t)
[('a', 10), ('b', 1), ('c', 22)]
This loop has two iteration variables because items() returns a sequence of tuples. And key,
val is a tuple assignment that successively iterates through each of the key-value pairs in
the dictionary. For each iteration through the loop, both key and val are advanced to the
next key-value pair in the dictionary's iteration order (insertion order from Python 3.7
onwards).
Once we get a key-value pair, we can create a list of tuples and sort them –
ls=list()
for key, val in d.items():     #d is the dictionary being processed
    ls.append((val,key))
print("List of tuples:",ls)
ls.sort(reverse=True)
print("List of sorted tuples:",ls)
In the above program, we are extracting key, val pair from the dictionary and appending
it to the list ls. While appending, we are putting inner parentheses to make sure that each
pair is treated as a tuple. Then, we are sorting the list in the descending order. The sorting
would happen based on the telephone number (val), but not on name (key), as first
element in tuple is telephone number (val).
import string
fhand = open('test.txt')
counts = dict()
for line in fhand:
    line = line.translate(str.maketrans('', '',string.punctuation))
    line = line.lower()
    for word in line.split():
        counts[word] = counts.get(word, 0) + 1    #count each word
lst = list()
for key, val in list(counts.items()):
    lst.append((val, key))
lst.sort(reverse=True)
for key, val in lst[:10]:
    print(key, val)
Run the above program on any text file of your choice and observe the output.
1. Strings are more limited compared to other sequences like lists and Tuples.
Because, the elements in strings must be characters only. Moreover, strings are
immutable. Hence, if we need to modify the characters in a sequence, it is better to
go for a list of characters than a string.
2. As lists are mutable, they are more commonly used than tuples. But, in some
situations as given below, tuples are preferable.
a. When we have a return statement from a function, it is better to use tuples
rather than lists.
b. When a dictionary key must be a sequence of elements, then we must use
immutable type like strings and tuples
c. When a sequence of elements is being passed to a function as arguments,
usage of tuples reduces unexpected behavior due to aliasing.
3. As tuples are immutable, methods like sort() and reverse() cannot be applied on
them. But, Python provides built-in functions sorted() and reversed() which take
a sequence as an argument: sorted() returns a new sorted list, and reversed() returns
an iterator over the items in reverse order. The original sequence is left unchanged.
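A small sketch of sorted() and reversed() applied to a tuple (the values are assumed for illustration):

```python
t = (3, 10, 5, 16, -2)

s = sorted(t)              # a new list in ascending order
r = tuple(reversed(t))     # reversed() gives an iterator; tuple() collects it

print(s)    # [-2, 3, 5, 10, 16]
print(r)    # (-2, 16, 5, 10, 3)
print(t)    # (3, 10, 5, 16, -2), the original tuple is unchanged
```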
3.3.8 Debugging
Lists, dictionaries and tuples are basic data structures. In real-world programming, we
may require compound data structures like lists of tuples, dictionaries containing tuples and
lists etc. But these compound data structures are prone to shape errors, that is, errors
caused when a data structure has the wrong type, size or composition. For example,
if your code expects a list containing a single integer but is given a plain
integer, there will be an error.
When debugging a program to fix the bugs, following are the few things a programmer can
try –
Reading: Examine your code, read it again and check that it says what you meant
to say.
Running: Experiment by making changes and running different versions. Often if
you display the right thing at the right place in the program, the problem becomes
obvious, but sometimes you have to spend some time to build scaffolding.
Ruminating: Take some time to think! What kind of error is it: syntax, runtime,
semantic? What information can you get from the error messages, or from the output
of the program? What kind of error could cause the problem you're seeing? What did
you change last, before the problem appeared?
Retreating: At some point, the best thing to do is back off, undoing recent changes,
until you get back to a program that works and that you understand. Then you can
start rebuilding.
Regular expressions are themselves little programs to search and parse strings. To use
them in our program, the module re must be imported. This module has a search() function,
which is used to find a particular pattern within a string. Consider the
following example, which uses the file myfile.txt with the contents –
hello how are you
I am doing fine.
How about you?
import re
fhand = open('myfile.txt')
for line in fhand:
    line = line.rstrip()
    if re.search('how', line):
        print(line)
For the file myfile.txt shown above (and discussed in previous chapters), the output would
be –
hello how are you
In the above program, the search() function is used to find the lines containing the string
how. The line How about you? is not printed, because the search is case-sensitive.
One can observe that the above program is not much different from a program that uses
find() function of strings. But, regular expressions make use of special characters with
specific meaning. In the following example, we make use of caret (^) symbol, which
indicates beginning of the line.
import re
hand = open('myfile.txt')
for line in hand:
    line = line.rstrip()
    if re.search('^how', line):
        print(line)
Here, we have searched for lines which start with the string how. Again, this program does
not make full use of regular expressions, because the same program could have been
written using the string method startswith(). Hence, in the next section, we will understand
the true usage of regular expressions.
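The equivalence mentioned above can be checked with a short sketch. It works on an in-memory list instead of a file. The first three strings are the lines of myfile.txt, and the fourth is an assumed extra line added so that at least one line starts with lowercase how.

```python
import re

# First three lines are from myfile.txt; the last one is an assumed addition.
lines = ['hello how are you', 'I am doing fine.', 'How about you?', 'how do you do']

# Searching anywhere in the line: find() versus re.search()
with_find = [ln for ln in lines if ln.find('how') != -1]
with_search = [ln for ln in lines if re.search('how', ln)]

# Searching at the beginning of the line: startswith() versus the caret
with_startswith = [ln for ln in lines if ln.startswith('how')]
with_caret = [ln for ln in lines if re.search('^how', ln)]

print(with_find == with_search)
print(with_startswith == with_caret)
```

Both comparisons give True, which shows that these simple patterns do nothing that string methods cannot already do.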
Character   Meaning
^ (caret)   Matches beginning of the line
$           Matches end of the line
. (dot)     Matches any single character except newline. With the re.DOTALL flag, newline can also be matched
[…]         Matches any single character in brackets
[^…]        Matches any single character NOT in brackets
re*         Matches 0 or more occurrences of preceding expression.
re+         Matches 1 or more occurrences of preceding expression.
re?         Matches 0 or 1 occurrence of preceding expression.
re{n}       Matches exactly n occurrences of preceding expression.
re{n,}      Matches n or more occurrences of preceding expression.
re{n,m}     Matches at least n and at most m occurrences of preceding expression.
a|b         Matches either a or b.
(re)        Groups regular expressions and remembers matched text.
\d          Matches digits. For ASCII text, equivalent to [0-9].
\D          Matches non-digits.
\w          Matches word characters.
\W          Matches non-word characters.
\s          Matches whitespace. For ASCII text, equivalent to [ \t\n\r\f\v].
\S          Matches non-whitespace.
\A          Matches beginning of string.
\Z          Matches end of string.
\z          Matches end of string (Perl-style notation; depending on the version, Python's re module may not accept it, so \Z is the safer choice).
\b          Matches the empty string, but only at the start or end of a word.
\B          Matches the empty string, but not at the start or end of a word.
( )         When parentheses are added to a regular expression, they are ignored for the purpose of matching, but allow you to extract a particular subset of the matched string rather than the whole string when using findall()
The most commonly used metacharacter is the dot, which matches any single character. Consider the
following example, where the regular expression searches for lines which start with I,
followed by any two characters (each dot represents one character), followed by the
character m.
hello how are you
I am doing fine.
How about you?
import re
fhand = open('myfile.txt')
for line in fhand:
    line = line.rstrip()
    if re.search('^I..m', line):
        print(line)
Note that the regular expression ^I..m not only matches 'I am', but can also match 'Isam',
'I*3m' and so on. That is, between I and m, there can be any two characters.
In the previous program, we knew that there are exactly two characters between I and m.
Hence, we were able to give two dots. But, when we don't know the exact number of
characters between two characters (or strings), we can make use of the dot and + symbols
together. Consider the program given below –
import re
hand = open('myfile.txt')
for line in hand:
    line = line.rstrip()
    if re.search('^h.+u', line):
        print(line)
Observe the regular expression ^h.+u here. It indicates that the line should start
with h, followed by one or more characters of any kind (dot and +), followed by a u.
The u need not be the last character of the line, because the pattern does not end with $.
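The two patterns discussed above can be tried on a few in-memory strings, as in the following sketch. The test strings are assumptions made for illustration.

```python
import re

# '^I..m' : I, then exactly two characters of any kind, then m
two_dots = '^I..m'
for text in ['I am doing fine.', 'Isam', 'I*3m', 'Im']:
    print(text, bool(re.search(two_dots, text)))

# '^h.+u' : h, then one or more characters, then u
dot_plus = '^h.+u'
m = re.search(dot_plus, 'hello how are you today')
matched = m.group()                      # the portion of the line that matched
no_match = re.search(dot_plus, 'hu')     # fails: no character between h and u

print(matched)
print(no_match)
```

Since .+ is greedy, the match extends up to the last u in the line, and the text after that u is simply not part of the match.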
Pattern to extract lines starting with the word From (or from) and ending with edu:
import re
fhand = open('mbox-short.txt')
for line in fhand:
    line = line.rstrip()
    pattern = '^[Ff]rom.*edu$'
    if re.search(pattern, line):
        print(line)
Here the pattern given for regular expression indicates that the line should start with
either From or from. Then there may be 0 or more characters, and later the line should
end with edu.
Using Not:
pattern = '^[^a-z0-9]+'
Here, the first ^ indicates that we want something to match at the beginning of a line. Then,
the ^ inside the square brackets indicates: do not match any of the characters listed within
the brackets. Hence, the whole meaning would be – the line must start with something other than a
lowercase alphabet or a digit. In other words, the line should not start with a
lowercase alphabet or a digit.
Similarly, with a pattern such as ^[A-Z].*[0-9]$, the line should start with a capital letter,
followed by 0 or more characters, and must end with a digit.
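The following sketch tries both descriptions on a few assumed sample strings. The second pattern, ^[A-Z].*[0-9]$, is an assumption written to fit the description "starts with a capital letter and ends with a digit".

```python
import re

not_lower_or_digit = '^[^a-z0-9]+'     # pattern from the text
capital_to_digit = '^[A-Z].*[0-9]$'    # assumed pattern for the second description

samples = ['Hello 42', '#tag', 'hello 42', '9 lives']

# lines that do NOT start with a lowercase letter or a digit
starts_ok = [s for s in samples if re.search(not_lower_or_digit, s)]
# lines that start with a capital letter and end with a digit
shape_ok = [s for s in samples if re.search(capital_to_digit, s)]

print(starts_ok)
print(shape_ok)
```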
import re
s = 'A message from csev@umich.edu to cwen@iupui.edu about meeting @2PM'
lst = re.findall(r'\S+@\S+', s)
print(lst)    # ['csev@umich.edu', 'cwen@iupui.edu']
Here, the pattern indicates at least one non-whitespace character (\S) before @ and at
least one non-whitespace character after @. Hence, it will not match @2PM, because there is a
whitespace before @.
Now, we can write a complete program to extract all email-ids from the file.
import re
fhand = open('mbox-short.txt')
for line in fhand:
    line = line.rstrip()
    x = re.findall(r'\S+@\S+', line)
    if len(x) > 0:
        print(x)
Here, the condition len(x) > 0 is checked because we want to print only the lines which
contain an email-ID. If a line has no match for the given pattern, the findall()
function returns an empty list. The length of an empty list is zero, and hence we
print only the lists with length greater than 0. A part of the output is –
['stephen.marquard@uct.ac.za']
['<postmaster@collab.sakaiproject.org>']
['<200801051412.m05ECIaH010327@nakamura.uits.iupui.edu>']
['<source@collab.sakaiproject.org>;']
['<source@collab.sakaiproject.org>;']
['<source@collab.sakaiproject.org>;']
['apache@localhost)']
……………………………….
………………………………..
Note that, apart from just the email-IDs, the output contains additional characters (<, >, ; etc.)
attached to the extracted pattern. To remove them, refine the pattern. That is, we want the
email-ID to start with an alphabet or a digit, and to end with an alphabet. Hence,
the statement would be –
x = re.findall(r'[a-zA-Z0-9]\S*@\S*[a-zA-Z]', line)
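The effect of the refinement can be seen by applying both patterns to a few of the strings from the output shown above. This is only a sketch on in-memory strings; the variable names are assumptions.

```python
import re

# Strings taken from the output shown earlier
samples = ['<source@collab.sakaiproject.org>;',
           'apache@localhost)',
           'stephen.marquard@uct.ac.za']

loose = [re.findall(r'\S+@\S+', s) for s in samples]
strict = [re.findall(r'[a-zA-Z0-9]\S*@\S*[a-zA-Z]', s) for s in samples]

for before, after in zip(loose, strict):
    print(before, '->', after)
```

The refined pattern drops the leading < and the trailing >, ; and ) characters, while an address that was already clean is left unchanged.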
Suppose that we want to extract the numbers from lines of the following form in the file mbox-short.txt –
X-DSPAM-Confidence: 0.8475
X-DSPAM-Probability: 0.0000
The line should start with X-, followed by 0 or more characters. Then, we need a colon and
a whitespace, which are written as they are. Then there must be a number containing one or more
digits, with or without a decimal point. Note that we want the dot as a part of our pattern string
and not as a metacharacter here; inside square brackets, the dot is treated as an ordinary
character. The pattern for the regular expression would be –
^X-.*: [0-9.]+
When we add parentheses to a regular expression, they are ignored when matching the
string. But when we are using findall(), parentheses indicate that while we want the whole
expression to match, we are interested in extracting only a portion of the substring that
matches the regular expression.
import re
hand = open('mbox-short.txt')
for line in hand:
    line = line.rstrip()
    x = re.findall('^X-.*: ([0-9.]+)', line)
    if len(x) > 0:
        print(x)
Because of the parentheses enclosing [0-9.]+ in the pattern above, findall() matches the lines starting
with X- but extracts only the number portion. Now, the output would be –
['0.8475']
['0.0000']
['0.6178']
['0.0000']
['0.6961']
…………………
………………..
Another example of similar form: The file mbox-short.txt contains lines like –
Details: http://source.sakaiproject.org/viewsvn/?view=rev&rev=39772
We may be interested in extracting only the revision numbers mentioned at the end of
these lines. Then, we can write the statement –
x = re.findall('^Details:.*rev=([0-9.]+)', line)
The regex here indicates that the line must start with Details:, followed by any characters, then
rev= and then digits. As we want only those digits, we put parentheses around that portion
of the expression. Note that the expression [0-9.]+ is greedy: it keeps grabbing digits (and dots)
until it finds any other character, so it can match a very large number. The
output of the above regular expression is a set of revision numbers as given below –
['39772']
['39771']
['39770']
['39769']
………………………
………………………
Consider another example – we may be interested in knowing time of a day of each email.
The file mbox-short.txt has lines like –
From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008
Here, we would like to extract only the hour 09. That is, we would like only two digits
representing hour. Hence, we need to modify our expression as –
x = re.findall('^From .* ([0-9][0-9]):', line)
Here, [0-9][0-9] indicates that a digit should appear exactly two times. The alternative way
of writing this would be –
x = re.findall('^From .* ([0-9]{2}):', line)
The number 2 within flower-brackets (braces) indicates that the preceding match should appear
exactly two times. Hence [0-9]{2} indicates that there must be exactly two digits. Now, the
output would be –
['09']
['18']
['16']
['15']
…………………
…………………
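The role of parentheses in findall() can be verified on single sample lines without opening the file. The following sketch uses one line of each kind discussed above; the variable names are assumptions.

```python
import re

conf_line = 'X-DSPAM-Confidence: 0.8475'
rev_line = 'Details: http://source.sakaiproject.org/viewsvn/?view=rev&rev=39772'
from_line = 'From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008'

# Without parentheses, findall() returns the whole matched text
whole = re.findall('^X-.*: [0-9.]+', conf_line)
# With parentheses, only the parenthesized portion is returned
number = re.findall('^X-.*: ([0-9.]+)', conf_line)
revision = re.findall('^Details:.*rev=([0-9.]+)', rev_line)
hour = re.findall('^From .* ([0-9]{2}):', from_line)

print(whole)
print(number)
print(revision)
print(hour)
```

Note that findall() returns strings. If the extracted values are needed as numbers, they must be converted using float() or int().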
import re
x = 'We just received $10.00 for cookies.'
y = re.findall(r'\$[0-9.]+', x)
print(y)
Output:
['$10.00']
Here, we want to extract only the price $10.00. As the $ symbol is a metacharacter, we need
to use \ before it, so that $ is treated as a part of the string to be matched and not as a
metacharacter.
MODULE IV
Programmer-defined Types
A class in Python can be created using the keyword class.
Here, we are creating an empty class without any members by just using the keyword pass within it.
class Point:
    pass
print(Point)
The output would be –
<class '__main__.Point'>
The term __main__ indicates that the class Point is in the main scope of the current module.
In other words, this class is at the top level while executing the program.
Now, a user-defined data type Point got created, and this can be used to create any number of
objects of this class.
Observe the following statements:
p=Point()
Now, a Point object is created and a reference to it (for easy understanding, treat a reference as a
pointer) is returned. This returned reference is assigned to the variable p.
The process of creating a new object is called instantiation, and the object is an instance of the
class.
When we print an object, Python tells which class it belongs to and where it is stored in the
memory.
print(p)
The output has the form <__main__.Point object at 0x01581BF0>, which displays the address (in
hexadecimal format) of the object in the memory. The actual address varies from run to run.
It is now clear that every instance is a separate object with its own space in memory, whereas the
class serves as the template from which the instances are created.
Attributes
An object can contain named elements known as attributes.
One can assign values to these attributes using dot operator.
For example, keeping coordinate points in mind, we can assign two attributes x and y for the
object of a class Point as below
p.x = 10.0
p.y = 20.0
A state diagram that shows an object and its attributes is called an object diagram.
For the object p, the object diagram is as follows.
[Figure: object diagram in which p refers to a Point object with attributes x → 10.0 and y → 20.0]
The diagram indicates that a variable (i.e. object) p refers to a Point object, which contains two
attributes.
Each attribute refers to a floating point number.
One can access attributes of an object as shown –
>>> print(p.x)
10.0
>>> print(p.y)
20.0
Here, p.x means “Go to the object p refers to and get the value of x”.
Attributes of an object can be assigned to other variables
>>> x= p.x
>>> print(x)
10.0
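The steps discussed so far (defining a class, instantiating it and attaching attributes with the dot operator) can be put together in one small sketch. The second object q is an assumption added to show that attributes belong to individual objects.

```python
class Point:
    """Represents a point in a 2-D coordinate system."""
    pass

p = Point()        # instantiation: p refers to a new Point object
p.x = 10.0         # attributes are created by assignment with the dot operator
p.y = 20.0

q = Point()        # a second, independent object without any attributes

print(p.x, p.y)
print(hasattr(q, 'x'))   # q has no attribute x
```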
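The following program illustrates these concepts. It defines a class Point and three functions:
A function read_point() to read the attributes x and y of a Point object from the keyboard.
A function distance() which takes two objects of Point class as arguments and computes the
Euclidean distance between them.
A function print_point() to display one point in the form of an ordered pair.
Program:
import math
class Point:
    """ This is a class Point representing a coordinate point"""
def read_point(p):
    p.x = float(input("x coordinate:"))
    p.y = float(input("y coordinate:"))
def print_point(p):
    print("(%g,%g)" % (p.x, p.y))
def distance(p1, p2):
    d = math.sqrt((p1.x-p2.x)**2 + (p1.y-p2.y)**2)
    return d
print(Point.__doc__)    # display the docstring of the class
p1 = Point()
read_point(p1)
p2 = Point()
read_point(p2)
print("Distance is: %g" % distance(p1, p2))
The docstring of a class is available as the attribute __doc__. Note that you need to type two
underscores, then the word doc and again two underscores. In the above program, there is no need
of a docstring, and we could have just used pass to indicate an empty class. But it is better to
understand the professional way of writing user-defined types, and hence the docstring is introduced.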
The function read_point() takes one argument of type Point. When we use a
statement like
read_point(p1)
the parameter p of this function acts as an alias for the argument p1. Hence, any modification
done through the alias p is reflected in the original object p1. With the help of this function, we are
instructing Python that the object p1 has two attributes x and y.
The function print_point() also takes one argument and, with the help of format strings,
prints the attributes x and y of the Point object as an ordered pair (x,y).
As we know, the Euclidean distance between two points (x1,y1) and (x2,y2) is
$\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$
In this program, we have Point objects as (p1.x, p1.y) and (p2.x, p2.y). We apply the formula on
these points by passing the objects p1 and p2 as arguments to the function distance(), and then
return the result.
Thus, the above program gives an idea of defining a class, instantiating objects, creating attributes,
defining functions that take objects as arguments and finally, calling (or invoking) such functions
whenever and wherever necessary.
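To try the distance computation without typing input at the keyboard, a non-interactive sketch is given below. The helper make_point() is a hypothetical replacement for read_point(), introduced only for this illustration.

```python
import math

class Point:
    """Represents a coordinate point."""

def make_point(x, y):
    # hypothetical helper: sets the attributes directly instead of reading input
    p = Point()
    p.x = x
    p.y = y
    return p

def distance(p1, p2):
    return math.sqrt((p1.x - p2.x)**2 + (p1.y - p2.y)**2)

a = make_point(0, 0)
b = make_point(3, 4)
d = distance(a, b)     # sides 3 and 4 give a hypotenuse of 5
print("Distance is: %g" % d)
```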
NOTE: User-defined classes in Python have two types of attributes viz. class attributes and instance
attributes. Class attributes are defined inside the class (usually, immediately after the class header). They
are common to all the objects of that class. That is, they are shared by all the objects created from that
class. But instance attributes are defined for individual objects. They are available only for that instance
(or object). Attributes of one instance are not available for another instance of the same class.
For example, consider the class Point as discussed earlier –
class Point:
    pass
If the attributes x and y are created for an object p1 of this class with statements like p1.x=10 and
p1.y=20, they are available only for the object p1, but not for another object p2. Thus, x and y are
instance attributes but not class attributes.
We will discuss class attributes later in detail. But, for the purpose of understanding, observe the
following example –
class Point:
    x=2
    y=3
Here, the attributes x and y are defined inside the definition of the class Point itself. Hence, they are
available to all the objects of that class.
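The difference between the two kinds of attributes can be observed with the following sketch. The objects p1 and p2 and the value 10 are assumptions for illustration.

```python
class Point:
    x = 2      # class attributes, shared by all objects
    y = 3

p1 = Point()
p2 = Point()

p1.x = 10      # creates an instance attribute x on p1 only

print(p1.x, p2.x, Point.x)
print(vars(p1), vars(p2))   # instance attributes of each object
```

Assigning to p1.x does not change the class attribute. It creates a new instance attribute that hides the class attribute for p1 alone, while p2 continues to see the shared value.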
Rectangles
It is possible to make an object of one class as an attribute to other class.
To illustrate this, consider an example of creating a class called as Rectangle.
A rectangle can be created using any of the following data –
By knowing width and height of a rectangle and one corner point (ideally, the bottom-left
corner) in a coordinate system
By knowing two opposite corner points
Let us consider the first technique and implement the task: Write a class Rectangle containing
class Point:
    """ This is a class Point representing coordinate point"""
class Rectangle:
    """ This is a class Rectangle. Attributes: width, height and Corner Point """
def find_center(rect):
    p = Point()
    p.x = rect.corner.x + rect.width/2
    p.y = rect.corner.y + rect.height/2
    return p
def resize(rect, w, h):
    # add w and h to the existing width and height
    rect.width += w
    rect.height += h
def print_point(p):
    print("(%g,%g)" % (p.x, p.y))
box = Rectangle()
box.corner = Point()
box.width = 100      # same values as used for box1 in the section on copying
box.height = 200
box.corner.x = 0
box.corner.y = 0
center = find_center(box)
print("The center of rectangle is:")
print_point(center)
resize(box, 50, 70)
print("Rectangle after resize:")
print("width=%g, height=%g" % (box.width, box.height))
center = find_center(box)
print("The center of resized rectangle is:")
print_point(center)
The statement
box.corner=Point()
indicates that corner is an attribute of the object box and this attribute itself is an object of
the class Point. The object box has two more attributes, width and height.
In this program, we are treating the corner point as the origin of the coordinate system, and
hence the following assignments –
box.corner.x=0
box.corner.y=0
(Note that, instead of the origin, any other location in the coordinate system can be given as the
corner point.) Based on all the above statements, an object diagram can be drawn.
[Figure: object diagram in which box refers to a Rectangle object with attributes width, height and corner; corner refers to a Point object with attributes x and y]
The expression box.corner.x means, “Go to the object box refers to and select the attribute
named corner; then go to that object and select the attribute named x.”
The function find_center() takes an object rect as an argument. So, when a call is made using
the statement –
center=find_center(box)
the parameter rect refers to the object box. A local object p of type Point is created inside this
function. The attributes of p are x and y, which take the coordinates of the center point of the
rectangle as their values. The center is obtained by adding half of the width to the x coordinate of
the corner and half of the height to the y coordinate of the corner.
[Figure: rectangle with corner point (x,y) and half of the width marked]
The function find_center() returns the computed center point. Note that, the return value of a
function here is an instance of some class. That is, one can have an instance as return values
from a function.
The function resize() takes three arguments: rect, an instance of the Rectangle class, and two
numeric values w and h. The values w and h are added to the existing attributes width and
height. This clearly shows that objects are mutable. The state of an object can be changed by
modifying any of its attributes. When this function is called with the statement –
resize(box,50,70)
rect acts as an alias for box. Hence, the width and height modified within the function are
reflected in the original object box.
Thus, the above program illustrates the concepts: Object of one class is made as attribute for object of
another class, returning objects from functions and objects are mutable.
Copying
An object is aliased whenever it is assigned to another variable, so that both names refer to the same object.
This may happen in the following situations –
Direct object assignment (like p2=p1)
When an object is passed as an argument to a function
When an object is returned from a function
The last two cases have been understood from the two programs in previous sections.
Let us understand the concept of aliasing more in detail using the following program
>>> class Point:
...     pass
>>> p1=Point()
>>> p1.x=10
>>> p1.y=20
>>> p2=p1
>>> print(p1)
<__main__.Point object at 0x01581BF0>
>>> print(p2)
<__main__.Point object at 0x01581BF0>
Observe that both p1 and p2 refer to the same physical memory. It is clear now that p2
is an alias for p1.
So, we can draw the object diagram as below –
[Figure: object diagram in which both p1 and p2 refer to the same Point object with x → 10 and y → 20]
Hence, if we check for equality and identity of these two objects, we will get following result.
>>> p1 is p2
True
>>> p1==p2
True
But aliasing is not always desirable. For example, we may need to create a new object from an
existing object such that the new object has a different physical memory, but has the
same attributes (and their values) as the existing object. Diagrammatically, we need something
as below –
[Figure: p1 and p2 refer to two separate Point objects, each with x → 10 and y → 20]
This can be done using the copy() function of the copy module:
>>> import copy
>>> p1=Point()
>>> p1.x=10
>>> p1.y=20
>>> p3=copy.copy(p1)
If we now print p1 and p3, we can observe that the physical addresses of the objects p1 and p3 are different.
But the values of the attributes x and y are the same. Now, use the following statements –
>>> p1 is p3
False
>>> p1 == p3
False
Here, the is operator gives the result False for the obvious reason that p1 and p3 are two
different entities in memory.
But why does the == operator give False, though the contents of the two objects are the
same? The reason is that p1 and p3 are objects of a user-defined type.
Python does not know what equality should mean for the new data type. The default
behavior of equality (==) is identity (the is operator) itself. Hence, Python applies this default
behavior to p1 == p3 and the result is False.
NOTE: If we need to define the meaning of the equality (==) operator explicitly on user-defined data
types (i.e. on class objects), then we need to override the method __eq__() inside the class. This will be
discussed later in detail.
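As a preview, a minimal sketch of overriding __eq__() is given below. The rule that two points are equal when their x and y values are equal is an assumption chosen for this illustration; a class may define equality in any way that suits it.

```python
import copy

class Point:
    def __eq__(self, other):
        # assumed rule: two points are equal if both coordinates are equal
        if not isinstance(other, Point):
            return NotImplemented
        return self.x == other.x and self.y == other.y

p1 = Point()
p1.x = 10
p1.y = 20
p3 = copy.copy(p1)

print(p1 is p3)    # identity: two different objects
print(p1 == p3)    # equality: now decided by __eq__()
```

With this method in place, the is operator still reports that p1 and p3 are different objects, but == compares their contents.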
The copy() method of copy module duplicates the object.
The content (i.e. attributes) of one object is copied into another object as we have discussed till
now.
But, when an object itself is an attribute inside another object, the duplication will result in a
strange manner.
To understand this concept, try to copy Rectangle object (created in previous section) as given
below
import copy
class Point:
    """ This is a class Point representing coordinate point"""
class Rectangle:
    """ This is a class Rectangle. Attributes: width, height and Corner Point """
box1=Rectangle()
box1.corner=Point()
box1.width=100
box1.height=200
box1.corner.x=0
box1.corner.y=0
box2=copy.copy(box1)
print(box1 is box2)                  #prints False
print(box1.corner is box2.corner)    #prints True
Now, the question is – why are box1.corner and box2.corner the same object, when box1 and box2
are different? Whenever the statement
box2=copy.copy(box1)
is executed, the contents of all the attributes of the box1 object are copied into the respective
attributes of the box2 object.
That is, box1.width is copied into box2.width, box1.height is copied into box2.height.
Similarly, box1.corner is copied into box2.corner.
Now, recollect the fact that corner is not exactly the object itself, but a reference to an object
of type Point (read the discussion on the object diagram at the beginning of this chapter).
Hence, the value of the reference (that is, the physical address) stored in box1.corner is copied into
box2.corner.
Thus, box1.corner and box2.corner point to one and the same physical object.
This type of copying of objects is known as shallow copy.
To understand this behavior, observe the following diagram
[Figure: object diagram in which box1 and box2 refer to two separate Rectangle objects, each with its own width (100) and height, while the corner attribute of both refers to one shared Point object]
Now, the attributes width and height of the two objects box1 and box2 are independent.
Whereas, the attribute corner is shared by both the objects.
Thus, any modification done to box1.corner will be reflected in box2.corner as well.
Obviously, we don’t want this to happen, whenever we create duplicate objects. That is, we want
two independent physical objects.
Python provides a method deepcopy() for doing this task.
This method copies not only the object but also the objects it refers to, and the objects they refer
to, and so on.
box3=copy.deepcopy(box1)
print(box1 is box3) #prints False
print(box1.corner is box3.corner) #prints False
Thus, the objects box1 and box3 are now completely independent.
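The practical difference between the two kinds of copy shows up when the original is modified after copying. The following sketch makes one change to the shared corner and one change to a plain attribute; the new values 5 and 150 are assumptions.

```python
import copy

class Point:
    pass

class Rectangle:
    pass

box1 = Rectangle()
box1.width = 100
box1.height = 200
box1.corner = Point()
box1.corner.x = 0
box1.corner.y = 0

box2 = copy.copy(box1)        # shallow copy: corner is shared
box3 = copy.deepcopy(box1)    # deep copy: corner is duplicated as well

box1.corner.x = 5             # modify the embedded Point object
box1.width = 150              # modify a plain attribute

print(box2.corner.x, box2.width)
print(box3.corner.x, box3.width)
```

The shallow copy box2 sees the change made to the corner, because it shares that Point object with box1. The deep copy box3 is unaffected. The change to width affects neither copy.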
Debugging
While dealing with classes and objects, we may encounter different types of errors.
For example, if we try to access an attribute which is not there for the object, we will get
AttributeError. For example –
>>> p= Point()
>>> p.x = 10
>>> p.y = 20
>>> print(p.z)
AttributeError: 'Point' object has no attribute 'z'
To avoid such an error, it is better to enclose such code within try/except as given below –
try:
    z = p.z
except AttributeError:
    z = 0
When we are not sure which type of object it is, we can use type() as –
>>> type(box1)
<class '__main__.Rectangle'>
Another method isinstance() helps to check whether an object is an instance of a particular class
>>> isinstance(box1,Rectangle)
True
When we are not sure whether an object has a particular attribute or not, use a function hasattr() –
>>> hasattr(box1, 'width')
True
Observe the string notation for second argument of the function hasattr(). Though the attribute
width is basically numeric, while giving it as an argument to function hasattr(), it must be
enclosed within quotes.
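The checks described above can be combined as in the following sketch. Apart from hasattr(), it also uses the built-in function getattr() with a third argument, which supplies a default value when the attribute is missing. The default value 0 is an assumption.

```python
class Point:
    pass

p = Point()
p.x = 10
p.y = 20

# Check before access, instead of catching AttributeError
if hasattr(p, 'z'):
    z = p.z
else:
    z = 0

# getattr() with a default value does the same in one step
z_default = getattr(p, 'z', 0)

print(z, z_default)
print(type(p) is Point, isinstance(p, Point))
```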
Pure Functions
To understand the concept of pure functions, let us consider an example of creating a class called
Time. An object of class Time contains hour, minutes and seconds as attributes.
Write a function to print time in HH:MM:SS format and another function to add two time objects.
Note that, adding two time objects should yield proper result and hence we need to check whether
number of seconds exceeds 60, minutes exceeds 60 etc, and take appropriate action.
class Time:
    """Represents the time of a day Attributes: hour, minute, second """
def printTime(t):
    print("%.2d:%.2d:%.2d" % (t.hour, t.minute, t.second))
def add_time(t1, t2):
    total = Time()
    total.hour = t1.hour + t2.hour
    total.minute = t1.minute + t2.minute
    total.second = t1.second + t2.second
    if total.second >= 60:
        total.second -= 60
        total.minute += 1
    if total.minute >= 60:
        total.minute -= 60
        total.hour += 1
    return total
t1 = Time()
t1.hour = 10
t1.minute = 34
t1.second = 25
print("Time1 is:")
printTime(t1)
t2 = Time()
t2.hour = 2
t2.minute = 12
t2.second = 41
print("Time2 is :")
printTime(t2)
t3 = add_time(t1, t2)
print("After adding two time objects:")
printTime(t3)
Here, the function add_time() takes two arguments of type Time, and returns a Time object,
whereas, it is not modifying contents of its arguments t1 and t2.
Such functions are called as pure functions.
Modifiers
Sometimes, it is necessary for a function to modify the objects it receives as arguments.
That is, arguments have to be modified inside a function and these modifications should be
available to the caller.
The functions that perform such modifications are known as modifiers.
Assume that we need to add a few seconds to a time object, and get a new time.
Then, we can write a function as below
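A minimal sketch of such a modifier is given here. The name increment() and the use of divmod() for carrying extra seconds and minutes are assumptions made for this illustration, and the function assumes that the number of seconds is not negative.

```python
class Time:
    """Represents the time of a day. Attributes: hour, minute, second"""

def increment(t, seconds):
    """Modifier: adds seconds to the Time object t in place (assumes seconds >= 0)."""
    t.second += seconds
    # divmod() gives the quotient (carry) and the remainder in one step
    carry, t.second = divmod(t.second, 60)
    t.minute += carry
    carry, t.minute = divmod(t.minute, 60)
    t.hour += carry

t = Time()
t.hour, t.minute, t.second = 10, 34, 25
increment(t, 100)      # the caller's object t itself is modified
print("%.2d:%.2d:%.2d" % (t.hour, t.minute, t.second))
```

The function returns nothing. Its entire effect is the change made to the object passed by the caller, which is what distinguishes a modifier from a pure function.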
If we treat the arithmetic on hours, minutes, seconds etc. as a problem involving numbers with base 60
(as every hour is 60 minutes and every minute is 60 seconds), then our code can be improved.
Such improved versions are discussed later in this chapter.
Debugging
In the programs written above, we have treated time objects as valid values.
But what if the attributes (second, minute, hour) of a time object are given wrong values like
a negative number, or hours with a value more than 24, minutes/seconds with more than 60 etc.? So, it
is better to write error checks in such situations to verify the input.
We can write a function similar to the one given below –
def valid_time(time):
    if time.hour < 0 or time.minute < 0 or time.second < 0:
        return False
    if time.minute >= 60 or time.second >= 60:
        return False
    return True
Such checks can also be written using the assert statement. The assert statement clearly distinguishes
the code that checks for errors from the normal conditional statements that are a part of the logic
of the program.
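The following sketch shows one way of using assert together with valid_time(). The helper time_to_seconds() and the sample objects good and bad are hypothetical names introduced only for this illustration.

```python
class Time:
    pass

def valid_time(t):
    if t.hour < 0 or t.minute < 0 or t.second < 0:
        return False
    if t.minute >= 60 or t.second >= 60:
        return False
    return True

def time_to_seconds(t):
    # hypothetical helper: the assert guards the computation that follows
    assert valid_time(t), 'invalid Time object'
    return (t.hour * 60 + t.minute) * 60 + t.second

good = Time()
good.hour, good.minute, good.second = 1, 2, 3
bad = Time()
bad.hour, bad.minute, bad.second = 1, 75, 0     # 75 minutes is not valid

total = time_to_seconds(good)
try:
    time_to_seconds(bad)
    caught = False
except AssertionError:
    caught = True

print(total, caught)
```

Keep in mind that assert statements are skipped when Python is run with the -O option, so they are meant for catching programming errors during development and not for validating user input in a finished program.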
There is a tight relationship between the objects of a class and the functions that operate on
those objects. Hence, the object oriented nature of Python classes is discussed here.
Object-Oriented Features
As an object oriented programming language, Python possesses the following characteristics:
Programs include class and method definitions.
Most of the computation is expressed in terms of operations on objects.
Objects often represent things in the real world, and methods often correspond to the ways
objects in the real world interact.
To establish a relationship between the objects of a class and a function, we must define the function
as a member of the class.
A function which is associated with a particular class is known as a method.
Methods are semantically the same as functions, but there are two syntactic differences:
Methods are defined inside a class definition in order to make the relationship
between the class and the method explicit.
The syntax for invoking a method is different from the syntax for calling a function.
Now onwards, we will discuss about classes and methods.
import math
class Point:
    def __init__(self, a, b):
        self.x = a
        self.y = b
    def dist(self, p2):
        d = math.sqrt((self.x-p2.x)**2 + (self.y-p2.y)**2)
        return d
Let us understand the working of this program and the concepts involved:
Keep in mind that every method of a class must have, as its first parameter, a reference to the
current object. By convention, this parameter is named self. That is, it is a reference to the object which
invoked the method. (Those who know C++ can relate self to the this pointer.) The object
which invokes a method is also known as the subject.
The method __init__() inside the class is an initialization method, which is invoked
automatically when an object gets created. When a statement like –
p1=Point(10,20)
is used, the __init__() method is called automatically. The internal meaning of the above
line is –
p1.__init__(10,20)
Here, p1 is the object which is invoking the method. Hence, a reference to this object is created
and passed to __init__() as self. The values 10 and 20 are passed to the formal parameters a and b of
the __init__() method. Now, inside the __init__() method, the statements have the effect of
self.x=10
self.y=20
This indicates that x and y are instance attributes. The value of x for the object p1 is 10 and the
value of y for the object p1 is 20.
When we create another object p2, it will have its own set of x and y. That is, the memory locations
of instance attributes are different for every object.
In the statement –
d=p1.dist(p2)
a reference to the object p1 is passed as self to the dist() method and p2 is passed explicitly as the
second argument. Now, inside the dist() method, we are calculating the distance between the two
Point objects (the Euclidean distance formula is used). Note that, in this method, we cannot use
the name p1; instead we use self, which is a reference (alias) to p1.
The next method inside the class is __str__(). It is a special method used for the string
representation of a user-defined object. Usually, print() is used for printing basic types in
Python. But user-defined types (class objects) have their own meaning and a way of
representation. To display such types, we can write functions or methods like print_point() as
we did in the previous section. But a more polymorphic way is to use __str__() so that, when we
write just print() in the main part of the program, the __str__() method is invoked
automatically. Thus, when we use a statement like –
print("P1 is:",p1)
the ordinary print() function prints the portion “P1 is:” and the remaining portion is taken
care of by the __str__() method. In fact, the __str__() method returns the string in the format we have given
inside it, and that string is printed by print().
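A sketch of the Point class with all three methods is given below. The "(x,y)" format returned by __str__() is an assumption chosen for illustration. The last statement shows the same method call written with the class name, which makes the passing of self visible.

```python
import math

class Point:
    def __init__(self, a, b):
        self.x = a
        self.y = b

    def dist(self, p2):
        return math.sqrt((self.x - p2.x)**2 + (self.y - p2.y)**2)

    def __str__(self):
        # assumed display format: an ordered pair
        return "(%g,%g)" % (self.x, self.y)

p1 = Point(10, 20)
p2 = Point(4, 5)
print("P1 is:", p1)          # print() calls p1.__str__() automatically
print("P2 is:", p2)

d = p1.dist(p2)              # p1 is passed as self
same = Point.dist(p1, p2)    # equivalent call with self passed explicitly
print("Distance is:", d)
```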
Operator Overloading
Ability of an existing operator to work on user-defined data type (class) is known as operator
overloading.
It is a polymorphic nature of any object oriented programming.
Basic operators like +, -, * etc. can be overloaded.
To overload an operator, one needs to write a method within user-defined class.
Python provides a special set of methods which have to be used for overloading required
operator.
The method should consist of the code what the programmer is willing to do with the operator.
The following table gives a list of operators and their respective Python methods for
overloading.
Operator  Method          Operator  Method
+         __add__()       <=        __le__()
-         __sub__()       >=        __ge__()
*         __mul__()       ==        __eq__()
/         __truediv__()   !=        __ne__()
%         __mod__()       in        __contains__()
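class Point:
    def __init__(self, a=0, b=0):
        self.x = a
        self.y = b
    def __add__(self, p2):
        p3 = Point()
        p3.x = self.x + p2.x
        p3.y = self.y + p2.y
        return p3
    def __str__(self):
        return "(%d,%d)" % (self.x, self.y)    # display format chosen for illustration
p1 = Point(10,20)
p2 = Point(4,5)
print("P1 is:", p1)
print("P2 is:", p2)
p4 = p1 + p2          # call for __add__() method
print("Sum is:", p4)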
In the above program, when the statement p4 = p1+p2 is used, it invokes the special method __add__()
written inside the class, because the internal meaning of this statement is –
p4 = p1.__add__(p2)
Here, p1 is the object invoking the method. Hence, self inside __add__() is the reference (alias) of p1.
And p2 is passed as an argument explicitly.
In the definition of __add__(), we are creating an object p3 with the statement –
p3=Point()
The object p3 is created without initialization. Whenever we need to create objects with and without
initialization in the same program, we must give default values to the arguments of __init__(). Hence, in
the above program, the arguments a and b of __init__() are made default arguments with the value zero.
Thus, the x and y attributes of p3 will now be zero. In the __add__() method, we are adding the respective attributes
of self and p2 and storing them in p3.x and p3.y. Then the object p3 is returned. This returned object is received
as p4 and is printed.
NOTE that, in a program containing operator overloading, the overloaded operator behaves in a
normal way when basic types are given. That is, in the above program, if we use the statements
m= 3+4
print(m)
it will be usual addition and gives the result as 7. But, when user-defined types are used as operands,
then the overloaded method is invoked.
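The same pattern extends to the other operators listed in the table. The following is a minimal sketch, assuming a hypothetical class Vector2 (it is not part of these notes), that overloads -, * and == and shows that plain integers are unaffected.

```python
class Vector2:
    """Hypothetical 2-D vector used only to illustrate operator overloading."""
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

    def __sub__(self, other):          # invoked for v1 - v2
        return Vector2(self.x - other.x, self.y - other.y)

    def __mul__(self, k):              # invoked for v1 * k, where k is a number
        return Vector2(self.x * k, self.y * k)

    def __eq__(self, other):           # invoked for v1 == v2
        return self.x == other.x and self.y == other.y

    def __str__(self):                 # invoked by print()
        return "(%d,%d)" % (self.x, self.y)

v1 = Vector2(10, 20)
v2 = Vector2(4, 5)
print("Difference is:", v1 - v2)       # (6,15)
print("Scaled is:", v1 * 2)            # (20,40)
print("Equal?", v1 == v2)              # False
print("Plain integers:", 10 - 4)       # built-in subtraction still gives 6
```

Each operator simply maps to one method; the method decides what the operator means for objects of that class.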
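class Time:
    def __init__(self, h=0,m=0,s=0):
        self.hour=h
        self.min=m
        self.sec=s
    def time_to_int(self):
        minute=self.hour*60+self.min
        seconds=minute*60+self.sec
        return seconds
    def int_to_time(self,seconds):   #reconstructed: inverse of time_to_int()
        t=Time()
        minutes,t.sec=divmod(seconds,60)
        t.hour,t.min=divmod(minutes,60)
        return t
    def __str__(self):               #reconstructed: output format assumed
        return "%.2d:%.2d:%.2d"%(self.hour,self.min,self.sec)
    def __eq__(self,t):
        return self.hour==t.hour and self.min==t.min and self.sec==t.sec
    def __add__(self,t):             #reconstructed around the surviving last line
        if isinstance(t,Time):
            seconds=self.time_to_int()+t.time_to_int()
        else:                        #t is a number of seconds
            seconds=self.time_to_int()+t
        return self.int_to_time(seconds)
    def __radd__(self,t):            #needed by sum(), which starts from 0
        return self.__add__(t)
T1=Time(3,40)
T2=Time(5,45)
print("T1 is:",T1)
print("T2 is:",T2)
print("Whether T1 is same as T2?",T1==T2) #call for __eq__()
T3=T1+T2 #call for __add__()
print("T1+T2 is:",T3)
T4=Time(1,15) #assumed value; T4 is not defined in the surviving text
T6=sum([T1,T2,T3,T4])
print("Using sum([T1,T2,T3,T4]):",T6)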
Debugging
We have seen earlier that the built-in function hasattr() can be used to check whether an object has
a particular attribute.
There is one more way of doing it, using the built-in function vars(). This function returns the
attribute names of an object and their values as a dictionary.
For example, for the Point class defined earlier, use the statements
>>> p = Point(3, 4)
>>> vars(p) #output is {'x': 3, 'y': 4}
For purposes of debugging, you might find it useful to keep this function handy:
def print_attributes(obj):
    for attr in vars(obj):
        print(attr, getattr(obj, attr))
Here, print_attributes() traverses the dictionary and prints each attribute name and its
corresponding value.
The built-in function getattr() takes an object and an attribute name (as a string) and returns the
value of that attribute.
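The three functions can be tried together on a small class. This is a minimal sketch that repeats a cut-down Point class (without the operator methods) so that it runs on its own.

```python
class Point:
    def __init__(self, a=0, b=0):
        self.x = a
        self.y = b

def print_attributes(obj):
    # vars(obj) is the attribute dictionary; iterating over it yields the names
    for attr in vars(obj):
        print(attr, getattr(obj, attr))

p = Point(3, 4)
print(hasattr(p, 'x'))      # True: the attribute exists
print(hasattr(p, 'z'))      # False: no such attribute
print(vars(p))              # {'x': 3, 'y': 4}
print_attributes(p)
```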
MODULE V
NOTE: To test all the programs in this section, you must be connected to the internet.
AF_INET is an address family (IPv4) that designates the type of addresses that your
socket can communicate with. When you create a socket, you have to specify its address
family, and then you can use only addresses of that type with the socket.
SOCK_STREAM is a constant indicating the type of socket (TCP). It behaves like a stream of
bytes, similar to a file stream, and provides reliable, ordered delivery over the network.
Port is a logical end-point. Port 80 is the port commonly used by web servers (HTTP) over the
Transmission Control Protocol (TCP).
The command to retrieve the data must use CRLF (Carriage Return Line Feed) line endings, and
it must end in \r\n\r\n (a blank line, as required by the protocol specification).
The encode() method applied on a string returns the bytes representation of the string. For a
string literal, one can instead attach the character b at the beginning of the literal for the same effect.
The decode() method returns a string decoded from the given bytes.
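The points about CRLF, encode() and decode() can be checked without opening a socket. The following is a small sketch that only builds the request; the host and file names are taken from the later examples in this section, and nothing is sent over the network.

```python
# Host and file used only to build an example request (nothing is sent)
host = 'data.pr4e.org'
path = '/romeo.txt'

# An HTTP/1.0 GET request: CRLF line endings, terminated by a blank line
cmd = 'GET ' + path + ' HTTP/1.0\r\nHost: ' + host + '\r\n\r\n'
request = cmd.encode()                  # str -> bytes, ready for send()

print(type(cmd), type(request))         # <class 'str'> <class 'bytes'>
print('abc'.encode() == b'abc')         # True: the b prefix gives the same bytes
print(request.endswith(b'\r\n\r\n'))    # True: the request ends with a blank line
print(request.decode() == cmd)          # True: decode() reverses encode()
```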
A socket connection between the user program and the webpage is shown in the figure below.
Figure : A Socket Connection
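import socket
#The opening lines are reconstructed; host and file are assumed from the later examples
mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
mysock.connect(('data.pr4e.org', 80))
cmd = 'GET http://data.pr4e.org/romeo.txt HTTP/1.0\r\n\r\n'.encode()
mysock.send(cmd)
while True:
    data = mysock.recv(512)
    if (len(data) < 1):
        break
    print(data.decode(),end='')
mysock.close()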
When we run the above program, we first get some information (the headers) sent by the web server
of the website which we are trying to scrape.
Then, we get the data written in that web page. In this program, we are extracting 512 bytes
of data at a time. (One can use any convenient number here.) The extracted data is decoded and
printed. When the length of data becomes less than one (that is, no more data is left on the web
page), the loop is terminated.
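import socket
mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM) #reconstructed
mysock.connect(('data.pr4e.org', 80)) #host assumed from the later examples
mysock.sendall(b'GET http://data.pr4e.org/cover3.jpg HTTP/1.0\r\n\r\n')
count = 0
picture = b"" #empty bytes object
while True:
    data = mysock.recv(5120) #retrieve up to 5120 bytes at a time
    if (len(data) < 1):
        break
    count = count + len(data)
    print(len(data), count) #bytes in this block, cumulative total
    picture = picture + data
mysock.close()
pos = picture.find(b"\r\n\r\n") #end of the headers
with open('stuff.jpg', 'wb') as fhand:
    fhand.write(picture[pos+4:]) #skip the headers and save the image data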
When we run the above program, the amount of data (in bytes) retrieved from the internet is
displayed in a cumulative format.
At the end, the image file 'stuff.jpg' will be stored in the current working directory. (One has to
verify it by looking at the current working directory of the program.)
import urllib.request
fhand = urllib.request.urlopen('http://data.pr4e.org/romeo.txt')
for line in fhand:
    print(line.decode().strip())
Once the web page has been opened with urllib.request.urlopen(), we can treat it like a file and
read through it using a for-loop.
When the program runs, we only see the output of the contents of the file.
The headers are still sent, but the urllib code consumes the headers and only returns the data to us.
Following is the program to retrieve the data from the file romeo.txt which is residing at
data.pr4e.org, and then to count the number of words in it.
import urllib.request
fhand = urllib.request.urlopen('http://data.pr4e.org/romeo.txt')
counts = dict()
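The counting step can be sketched as follows. To keep the sketch runnable without a network connection, it works on a hard-coded list of byte strings standing in for the lines returned by urlopen(); the two sample lines are an assumption made for illustration.

```python
# Stand-in for the lines returned by urlopen(): each line arrives as bytes
lines = [b'But soft what light through yonder window breaks\n',
         b'It is the east and Juliet is the sun\n']

counts = dict()
for line in lines:
    words = line.decode().split()      # bytes -> str -> list of words
    for word in words:
        # get() returns 0 the first time a word is seen
        counts[word] = counts.get(word, 0) + 1

print(counts)
print('Total words:', sum(counts.values()))
```

With a real page, the list lines is simply replaced by the handle returned by urlopen().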
A binary file such as an image can be retrieved in a similar way: the program calls urlopen() on
the address of cover3.jpg, reads the whole response with read() and writes it to a file opened in
'wb' mode. Once we execute such a program, we can see a file cover3.jpg in the current working
directory of our computer.
The program reads all of the data in at once across the network and stores it in the variable img
in the main memory of your computer, then opens the file cover3.jpg and writes the data out to
your disk.
This will work if the size of the file is less than the size of the memory (RAM) of your
computer.
However, if this is a large audio or video file, this program may crash or at least run extremely
slowly when your computer runs out of memory.
In order to avoid running out of memory, we retrieve the data in blocks (or buffers) and then write
each block to the disk before retrieving the next block.
This way the program can read a file of any size without using up all of the memory you have in
your computer.
Following is another version of above program, where data is read in chunks and then stored
onto the disk.
import urllib.request
img=urllib.request.urlopen('http://data.pr4e.org/cover3.jpg')
fhand = open('cover3.jpg', 'wb')
size = 0
while True:
    info = img.read(100000) #read up to 100000 bytes at a time
    if len(info) < 1:
        break
    size = size + len(info)
    fhand.write(info)
print(size, 'bytes copied')
fhand.close()
Once we run the above program, an image file cover3.jpg will be stored on to the current
working directory.
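The block-by-block idea does not depend on the network. The following is a minimal sketch, assuming a hypothetical helper copy_in_blocks() and using in-memory io.BytesIO objects in place of the web response and the output file, so that it can be run anywhere.

```python
import io

def copy_in_blocks(source, target, block_size=100000):
    """Copy source to target one block at a time; return the bytes copied."""
    size = 0
    while True:
        info = source.read(block_size)   # at most block_size bytes
        if len(info) < 1:                # empty result: nothing left to read
            break
        size = size + len(info)
        target.write(info)
    return size

# In-memory stand-ins for the network response and the output file
source = io.BytesIO(b'x' * 250000)
target = io.BytesIO()
copied = copy_in_blocks(source, target)
print(copied, 'bytes copied')            # 250000 bytes copied
```

Only one block is held in memory at any moment, which is why the size of the file no longer matters.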
Parsing HTML and Scraping the Web
One of the common uses of the urllib capability in Python is to scrape the web.
Web scraping is when we write a program that pretends to be a web browser and retrieves pages,
then examines the data in those pages looking for patterns.
Example: a search engine such as Google will look at the source of one web page and extract the
links to other pages and retrieve those pages, extracting links, and so on.
Using this technique, Google spiders its way through nearly all of the pages on the web.
Google also uses the frequency of links from pages it finds to a particular page as one measure of
how “important” a page is and how high the page should appear in its search results.
</p>
Here,
<h1> and </h1> are the beginning and end of header tags
<p> and </p> are the beginning and end of paragraph tags
<a> and </a> are the beginning and end of the anchor tag, which is used for giving links
href is the attribute of the anchor tag, which takes the link to another page as its value.
The above information clearly indicates that if we want to extract all the hyperlinks in a webpage,
we need a regular expression which matches the href attribute. Thus, we can create a regular
expression as –
href="http://.+?"
Here, the question mark in .+? indicates that the match is non-greedy, that is, it should find the
smallest possible matching string.
Now, consider a Python program that uses the above regular expression to extract all hyperlinks
from a web page.
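A minimal sketch of such a program is given here. It applies the regular expression to a hard-coded HTML string (an assumption, so that the sketch runs offline; the addresses are placeholders), and it adds a pair of parentheses to the expression so that findall() returns only the address and not the surrounding href="...".

```python
import re

# Hard-coded page with placeholder links (an assumption for illustration)
html = '''<h1>The First Page</h1>
<p>If you like, you can switch to the
<a href="http://www.example.com/page2.htm">Second Page</a> or the
<a href="http://www.example.com/page3.htm">Third Page</a>.</p>'''

# The parentheses capture only the address between the double quotes
links = re.findall('href="(http://.+?)"', html)
for link in links:
    print(link)
```

To scrape a real page, the string html would instead be obtained by reading and decoding the data returned by urlopen().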
<person>
<name>Chuck</name>
<phone type="intl"> +1 734 303 4456
</phone>
<email hide="yes"/>
</person>
Often it is helpful to think of an XML document as a tree structure where there is a top tag person
and other tags such as phone are drawn as children of their parent nodes.
The figure below shows the tree structure for the XML code given above.
Figure : Tree Representation of XML
Parsing XML
Python provides the library xml.etree.ElementTree to parse the data from XML files.
One has to provide the XML code as a string to the function fromstring() of the ElementTree module.
ElementTree acts as a parser and provides a set of relevant methods to extract the data.
Hence, the programmer need not know the detailed rules of XML document syntax.
The fromstring() function converts the XML code into a tree structure of XML nodes.
Once the XML is in a tree format, several methods are available to extract data from it.
Consider the following program.
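import xml.etree.ElementTree as ET
data = '''<person><name>Chuck</name>
<phone type="intl">+1 734 303 4456</phone>
<email hide="yes"/></person>''' #the XML shown earlier, given as a string
tree = ET.fromstring(data)
print('Attribute for tag email:', tree.find('email').get('hide'))
print('Attribute for tag phone:', tree.find('phone').get('type'))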
http://www.crummy.com/software/
Consider the following program which uses urllib to read the page and uses BeautifulSoup to
extract href attribute from the anchor tag.
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE
tags = soup('a')
The above program prompts for a web address, then opens the web page, reads the data and
passes the data to the BeautifulSoup parser, and then retrieves all of the anchor tags and prints out
the href attribute for each tag.
The BeautifulSoup can be used to extract various parts of each tag as shown below –
ctx = ssl.create_default_context()
In the above example, fromstring() is used to convert XML code into a tree.
The find() method searches XML tree and retrieves a node that matches the specified tag.
The get() method retrieves the value associated with the specified attribute of that tag. Each node
can have some text, some attributes (like hide), and some “child” nodes. Each node can be the parent
for a tree of nodes.
</users>
</stuff>'''
stuff = ET.fromstring(input)
lst = stuff.findall('users/user')
print('User count:', len(lst))
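The use of findall() on several nodes can be sketched as follows. The XML text here is an assumption: it holds two user nodes with the same id, x and name values as the JSON example in the next section.

```python
import xml.etree.ElementTree as ET

# Sample data (an assumption): two user nodes inside a users node
xml_text = '''<stuff>
  <users>
    <user x="2"><id>001</id><name>Chuck</name></user>
    <user x="7"><id>009</id><name>Chuck</name></user>
  </users>
</stuff>'''

stuff = ET.fromstring(xml_text)
lst = stuff.findall('users/user')      # list of all matching nodes
print('User count:', len(lst))
for item in lst:
    print('Name', item.find('name').text)   # text inside a child tag
    print('Id', item.find('id').text)
    print('Attribute', item.get('x'))       # attribute of the user tag
```

find() returns only the first matching node, whereas findall() returns a list of every node that matches the given path.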
{
  "name" : "Chuck",
  "phone" : {"type" : "intl", "number" : "+1 734 303 4456"},
  "email" : {"hide" : "yes"}
}
Observe the differences between XML code and JSON code:
In XML, we can add attributes like "intl" to the "phone" tag. In JSON, we simply have
key-value pairs.
XML uses the tag "person", which is replaced by a set of outer curly braces in JSON.
In general, JSON structures are simpler than XML because JSON has fewer capabilities than
XML.
But JSON has the advantage that it maps directly to some combination of dictionaries and lists.
And since nearly all programming languages have something equivalent to Python's dictionaries
and lists, JSON is a very natural format for two cooperating programs to exchange data. JSON is
quickly becoming the format of choice for nearly all data exchange between applications because
of its relative simplicity compared to XML.
Parsing JSON
Python provides a module json to parse the data in JSON pages.
Consider the following program which uses JSON equivalent of XML string written in previous
Section.
Note that, the JSON string has to embed a list of dictionaries.
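import json
data = ''' [
  { "id" : "001",
    "x" : "2",
    "name" : "Chuck" } ,
  { "id" : "009",
    "x" : "7",
    "name" : "Chuck"
  }
]'''
info = json.loads(data) #list of dictionaries
for item in info:
    print('Name', item['name'])
    print('Id', item['id'])
    print('Attribute', item['x'])
The output would be –
Name Chuck
Id 001
Attribute 2
Name Chuck
Id 009
Attribute 7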
Here, the string data contains a list of users, where each user is a set of key-value pairs. The
function loads() in the json module converts the string into a list of dictionaries.
From now onwards, we don't need anything from json, because the parsed data is available in
Python's native structures.
Using a for-loop, we can iterate through the list of dictionaries and treat every element as a
dictionary object. That is, we use the index operator (a pair of square brackets) to extract the
value for a particular key.
NOTE: The current IT industry trend is to use JSON for web services rather than XML, because JSON
is simpler than XML and maps directly to the native data structures we already have in programming
languages. This makes parsing and data extraction simpler compared to XML. But XML is more
self-descriptive than JSON, and so there are some applications where XML retains an advantage. For
example, most word processors store documents internally using XML rather than JSON.
With these advantages, an SOA system must be carefully designed to have good performance and
meet the user's needs. When an application makes a set of services in its API available over the
web, these are called web services.
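import urllib.request, urllib.parse
import json
#Note: the Google Geocoding API now requires an API key, so this request may be refused
serviceurl = 'http://maps.googleapis.com/maps/api/geocode/json?'
address = input('Enter location: ')
if len(address) < 1:
    exit()
url = serviceurl + urllib.parse.urlencode({'address': address}) #encode the search string
data = urllib.request.urlopen(url).read().decode()
try:
    js = json.loads(data)
except ValueError:
    js = None
if not js or js.get('status') != 'OK':
    print('Failure to Retrieve')
    exit()
print(json.dumps(js, indent=4))
lat = js["results"][0]["geometry"]["location"]["lat"]
lng = js["results"][0]["geometry"]["location"]["lng"]
print('lat', lat, 'lng', lng)
location = js['results'][0]['formatted_address']
print(location)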
(Students are advised to run the above program and check the output, which will contain
several lines of Google geographical data.)
The above program reads the search string and then encodes it. This encoded string, along with the
Google API link, is treated as a URL to fetch the data from the internet. The data retrieved from
the internet is then passed to json.loads(), which converts it into Python dictionaries and lists.
If the input string (which must be an existing geographical location like Channasandra,
Malleshwaram etc.) cannot be located by the Google API, either due to a bad internet connection or
due to an unknown location, we just display the message 'Failure to Retrieve'.
If Google successfully identifies the location, then we print that data in an indented form using
json.dumps().
Then, using indexing on the parsed data (which is in the form of a dictionary), we can retrieve the
location address, longitude, latitude etc.
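The indexing step can be tried offline on a hard-coded reply. The sample below is an assumption: it contains only the keys that the program reads, with made-up values, and is not an actual response from the service.

```python
import json

# Hard-coded sample reply (an assumption): only the keys used by the program,
# with made-up coordinates and address for illustration
data = '''{
  "results": [
    { "formatted_address": "Sample Town, Sample State, India",
      "geometry": { "location": { "lat": 12.5, "lng": 77.5 } } }
  ]
}'''

js = json.loads(data)                        # dictionary containing a list
first = js["results"][0]                     # first (here, the only) match
lat = first["geometry"]["location"]["lat"]
lng = first["geometry"]["location"]["lng"]
print('lat', lat, 'lng', lng)
print(first["formatted_address"])
print(json.dumps(first["geometry"], indent=4))   # indented view of one part
```

Each pair of square brackets steps one level deeper: a string key selects from a dictionary, and an integer index selects from a list.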
Database Concepts
At first look, a database seems like a spreadsheet consisting of multiple sheets.
The primary data structures in a database are tables, rows and columns.
In relational database terminology, tables, rows and columns are referred to as relation, tuple and
attribute respectively.
Consider the problem of storing details of students in a database table. The format may look like –
Thus, the table columns indicate the type of information to be stored, and each table row gives the
record pertaining to one student.
We can create one more table, say addressTable, consisting of attributes like DoorNo,
StreetName, Locality, City, PinCode. To relate this table with the respective student stored in
studentTable, we need to store RollNo also in addressTable (Note that RollNo will be unique for
every student, and hence there won't be any confusion).
Thus, there is a relationship between two tables in a single database. Software that can
maintain proper relationships between multiple tables in a single database is known as a
Relational Database Management System (RDBMS).
As mentioned earlier, every RDBMS has its own way of storing the data in tables. Each RDBMS
uses its own set of data types for the attribute values. SQLite uses the data types mentioned in
the following table –
Data type   Description
NULL        The value is a NULL value
INTEGER     The value is a signed integer
REAL        The value is a floating point value
TEXT        The value is a text string, stored using the database encoding (UTF-8,
            UTF-16BE or UTF-16LE)
BLOB        The value is a blob (Binary Large Object) of data, stored exactly as it was
            input
Note that SQL keywords are case-insensitive. But it is common practice to write commands
and clauses in uppercase letters, just to differentiate them from table names and attribute names.
Now, let us see some examples to understand the usage of SQL statements –
CREATE TABLE Tracks (title TEXT, plays INTEGER)
This command creates a table called Tracks with the attributes title and plays,
where title can store data of type TEXT and plays can store data of type INTEGER.
Using this browser, one can easily create tables, insert data, edit data, or run simple SQL queries
on the data in the database.
This database browser is similar to a text editor when working with text files.
When you want to do one or very few operations on a text file, you can just open it in a text editor
and make the changes you want.
When you have many changes that you need to do to a text file, often you will write a simple
Python program.
You will find the same pattern when working with databases. You will do simple operations in the
database manager and more complex operations will be most conveniently done in Python.
Ex1.
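import sqlite3
conn = sqlite3.connect('music.sqlite')
cur = conn.cursor()
cur.execute('DROP TABLE IF EXISTS Tracks')
cur.execute('CREATE TABLE Tracks (title TEXT, plays INTEGER)')
conn.close()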
The connect() function of sqlite3 makes a "connection" to the database stored in the file
music.sqlite in the current directory.
If the file does not exist, it will be created.
Sometimes, the database is stored on a different database server from the server on which we are
running our program.
But all the examples that we consider here use a local file in the current working directory of
the Python code.
A cursor is like a file handle that we can use to perform operations on the data stored in the
database. Calling cursor() is very similar conceptually to calling open() when dealing with text
files.
Hence, once we get a cursor, we can execute commands on the contents of the database using the
execute() method.
In the above program, we first remove the table Tracks from the database, if it already exists.
The DROP TABLE command deletes the table along with all its columns and rows.
This helps to avoid the error that would occur on trying to create a table with a name that
already exists.
Then, we create a table with the name Tracks which has two columns, viz. title, which can take
TEXT type data, and plays, which can take INTEGER type data.
Once our job with the database is over, we need to close the connection using the close() method.
In the previous example, we have just created a table, but not inserted any records into it.
So, consider the program given below, which creates a table, inserts two rows and finally
deletes records based on some condition.
Ex2.
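import sqlite3
conn = sqlite3.connect('music.sqlite')
cur = conn.cursor()
cur.execute('DROP TABLE IF EXISTS Tracks')
cur.execute('CREATE TABLE Tracks (title TEXT, plays INTEGER)')
cur.execute("INSERT INTO Tracks (title, plays) VALUES ('Thunderstruck', 20)")
cur.execute("INSERT INTO Tracks (title, plays) VALUES (?, ?)", ('My Way', 15))
conn.commit()
print('Tracks:')
cur.execute('SELECT title, plays FROM Tracks')
for row in cur:
    print(row)
cur.execute('DELETE FROM Tracks WHERE plays < 100')
conn.commit()
conn.close()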
In the above program, we are inserting the first record with the SQL command –
"INSERT INTO Tracks (title, plays) VALUES ('Thunderstruck', 20)"
Note that execute() requires the SQL command to be in string format. But if the value to be stored
in the table is also a string (TEXT type), then there may be a conflict between the quotes of the
two strings.
Hence, in this example, the entire SQL statement is written within double quotes and the value to
be inserted within single quotes. If we would like to use either single quotes or double quotes
everywhere, then we need to use escape sequences like \' or \".
While inserting the second row into the table, the SQL statement is used with a slightly different
syntax –
"INSERT INTO Tracks (title, plays) VALUES (?, ?)", ('My Way', 15)
Here, each question mark acts as a placeholder for a particular value.
This type of syntax is useful when we would like to pass user-input values into the database table.
After inserting the two rows, we must use the commit() method to store the inserted records
permanently in the database.
If this method is not applied, then the insertion (or any other change) will be temporary and will
affect only the current run of the program.
Later, we use the SELECT command to retrieve the data from the table and then use a for-loop to
display all records.
When data is retrieved from the database using the SELECT command, the cursor object makes those
data available as a sequence of records.
Hence, we can use a for-loop on the cursor object. Finally, we have used a DELETE command to
delete all the records WHERE plays is less than 100.
Ex3.
import sqlite3
from sqlite3 import Error
def create_connection():
    """ create a database connection to a database that resides in the memory"""
    conn = None
    try:
        conn = sqlite3.connect(':memory:')
        print("SQLite Version:",sqlite3.sqlite_version)
    except Error as e:
        print(e)
    finally:
        if conn: #close only if the connection was actually opened
            conn.close()
create_connection()
A few points about the above program:
Whenever we try to establish a connection with a database, there is a possibility of error due to
a non-existing database, authentication issues etc. So, it is always better to put the code for
the connection inside a try-except block.
While developing real-world projects, we may need to create a database connection and close it
every now and then. Instead of writing the code for it repeatedly, it is better to write a
separate function for establishing the connection and call that function whenever and wherever
required.
If we give the term :memory: as an argument to the connect() method, then the further operations
(like table creation, insertion into tables etc.) will be performed in the memory (RAM) of the
computer, and not on the hard disk.
Ex4. Write a program to create a Student database with a table consisting of student name and age.
Read n records from the user and insert them into database. Write queries to display all records and
to display the students whose age is 20.
conn.commit()
c.execute("select * from tblStudent ")
print(c.fetchall())
conn.close()
In the above program, we use a for-loop to get user input for each student's name and age. These data
are inserted into the table. Observe the question marks acting as placeholders for the user-input
variables. Later, we use the method fetchall(), which returns all the records from the table in the
form of a list of tuples. Here, each tuple is one record from the table.
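The whole exercise can be sketched as follows. Two assumptions are made so that the sketch runs unattended: the records come from a hard-coded list instead of input(), and the database is created in memory instead of in a file. The student names are made up.

```python
import sqlite3

# Stand-in for the n records that would be read with input() (an assumption)
records = [('Asha', 20), ('Ravi', 21), ('Kiran', 20)]

conn = sqlite3.connect(':memory:')          # in-memory database, nothing on disk
c = conn.cursor()
c.execute("CREATE TABLE tblStudent (name TEXT, age INTEGER)")

for name, age in records:
    # The ? placeholders are filled, in order, from the tuple
    c.execute("INSERT INTO tblStudent VALUES (?, ?)", (name, age))
conn.commit()

c.execute("SELECT * FROM tblStudent")
all_rows = c.fetchall()                      # list of tuples, one per record
print(all_rows)

c.execute("SELECT * FROM tblStudent WHERE age = ?", (20,))
age_20 = c.fetchall()                        # only the students whose age is 20
print(age_20)
conn.close()
```

Note the trailing comma in (20,): the second argument of execute() must be a sequence even when there is only one placeholder.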
Consider a table consisting of student details like RollNo, name, age, semester and address as
shown below –
In this table, RollNo can be considered as a primary key because it is unique for every student in
that table. Consider another table that is used for storing marks of students in all the three tests as
below
RollNo   Sem   M1     M2   M3
1        6     34     45   42.5
2        6     42.3   44   25
3        4     38     44   41.5
4        6     39.4   43   40
2        8     37     42   41
To save memory, this table can have just RollNo and the marks in all the tests. There is no need to
store information like name, age etc. of the students, as this information can be retrieved from
the first table. Now, RollNo is treated as a foreign key in the second table.
Here, RollNo is a primary key and by default it will be unique in one table. Now, another table can
be created as –
Now, in tblMarks consisting of marks of 3 tests of all the students, RollNo and sem are
together unique. Because, in one semester, only one student can have a particular RollNo,
whereas in another semester, the same RollNo may appear.
Such types of relationships are established between various tables in an RDBMS, and that helps in
better management of time and space.
Consider the following program which creates two tables tblStudent and tblMarks as discussed in
the previous section.
Few records are inserted into both the tables. Then we extract the marks of students who are
studying in 6th semester.
import sqlite3
conn=sqlite3.connect('StudentDB.db')
c=conn.cursor()
#... statements that create and fill tblStudent and tblMarks, and define query ...
conn.commit()
c.execute(query)
for row in c:
    print(row)
conn.close()
The query joins the two tables and extracts the records where RollNo and sem match in both the
tables, and sem is 6.
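A join of this kind can be sketched as follows. The column lists, the student names and the exact wording of the query are assumptions made for illustration; the marks are taken from the table shown earlier, and an in-memory database is used so that no file is created.

```python
import sqlite3

conn = sqlite3.connect(':memory:')      # in-memory database for the sketch
c = conn.cursor()
c.execute("""CREATE TABLE tblStudent
             (RollNo INTEGER PRIMARY KEY, name TEXT, sem INTEGER)""")
c.execute("""CREATE TABLE tblMarks
             (RollNo INTEGER, sem INTEGER, m1 REAL, m2 REAL, m3 REAL,
              UNIQUE (RollNo, sem))""")

# Student names are made up; the marks come from the table shown earlier
students = [(1, 'Asha', 6), (2, 'Ravi', 6), (3, 'Kiran', 4)]
marks = [(1, 6, 34, 45, 42.5), (2, 6, 42.3, 44, 25), (3, 4, 38, 44, 41.5)]
c.executemany("INSERT INTO tblStudent VALUES (?, ?, ?)", students)
c.executemany("INSERT INTO tblMarks VALUES (?, ?, ?, ?, ?)", marks)
conn.commit()

# Match rows on both RollNo and sem, then keep only the 6th semester
query = """SELECT tblStudent.RollNo, name, m1, m2, m3
           FROM tblStudent JOIN tblMarks
             ON tblStudent.RollNo = tblMarks.RollNo
            AND tblStudent.sem = tblMarks.sem
           WHERE tblStudent.sem = 6
           ORDER BY tblStudent.RollNo"""
c.execute(query)
rows = c.fetchall()
for row in rows:
    print(row)
conn.close()
```

Only the students with roll numbers 1 and 2 are printed, because the third student is in the 4th semester.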