Module 5
Python modules: their rationale, function, how to import them in different ways, and the
content of some standard modules provided by Python;
the concept of an exception and Python's implementation of it, including the try-except
instruction, with its applications, and the raise instruction.
strings and their specific methods, together with their similarities and differences compared to
lists.
Modules
What is a module?
Computer code has a tendency to grow. We can say that code that doesn't grow is probably completely
unusable or abandoned. A real, wanted, and widely used code develops continuously, as both users'
demands and users' expectations develop in their own rhythms.
A code which is not able to respond to users' needs will be forgotten quickly, and instantly replaced with
a new, better, and more flexible code. Be prepared for this, and never think that any of your programs is
eventually completed. The completion is a transition state and usually passes quickly, after the first bug
report. Python itself is a good example of how the rule acts.
Growing code is in fact a growing problem. A larger code always means tougher maintenance. Searching
for bugs is always easier where the code is smaller (just as finding a mechanical breakage is simpler
when the machinery is simpler and smaller).
Moreover, when the code being created is expected to be really big (you can use a total number of
source lines as a useful, but not very accurate, measure of a code's size) you may want (or rather, you
will be forced) to divide it into many parts, implemented in parallel by a few, a dozen, several dozen, or
even several hundred individual developers.
Of course, this cannot be done using one large source file, which is edited by all programmers at the
same time. This will surely lead to a spectacular disaster.
If you want such a software project to be completed successfully, you have to have the means allowing
you to:
For example, a certain project can be divided into two main parts:
the user interface (the part that communicates with the user using widgets and a graphical
screen)
Each of these parts can be (most likely) divided into smaller ones, and so on. Such a process is often
called decomposition.
For example, if you were asked to arrange a wedding, you wouldn't do everything yourself - you would
find a number of professionals and split the task between them all.
How do you divide a piece of software into separate but cooperating parts? This is the
question. Modules are the answer.
Using Modules
the first (probably the most common) happens when you want to use an already existing
module, written by someone else, or created by yourself during your work on some complex
project - in this case you are the module's user;
the second occurs when you want to create a brand new module, either for your own use, or to
make other programmers' lives easier - you are the module's supplier.
First of all, a module is identified by its name. If you want to use any module, you need to know the
name. A (rather large) number of modules is delivered together with Python itself. You can think of
them as a kind of "Python extra equipment".
All these modules, along with the built-in functions, form the Python standard library - a special sort of
library where modules play the roles of books (we can even say that folders play the roles of shelves). If
you want to take a look at the full list of all "volumes" collected in that library, you can find it here:
https://docs.python.org/3/library/index.html.
Each module consists of entities (like a book consists of chapters). These entities can be functions,
variables, constants, classes, and objects. If you know how to access a particular module, you can make
use of any of the entities it stores.
Let's start the discussion with one of the most frequently used modules, named math. Its name speaks
for itself - the module contains a rich collection of entities (not only functions) which enable a
programmer to effectively implement calculations demanding the use of mathematical functions,
like sin() or log().
Importing a module
To make a module usable, you must import it (think of it like taking a book off the shelf). Importing a
module is done by an instruction named import. Note: import is also a keyword (with all the
consequences of this fact).
Let's assume that you want to use two entities provided by the math module:
a symbol (constant) representing a precise (as precise as possible using double floating-point
arithmetic) value of π (although using a Greek letter to name a variable is fully possible in
Python, the symbol is named pi - it's a more convenient solution, especially for that part of the
world which neither has nor is going to use a Greek keyboard)
a function named sin() (the computer equivalent of the mathematical sine function)
Both these entities are available through the math module, but the way in which you can use them
strongly depends on how the import has been done.
The simplest way to import a particular module is to use the import instruction as follows:
import math
The instruction may be located anywhere in your code, but it must be placed before the first use of any
of the module's entities.
If you want to (or have to) import more than one module, you can do it by repeating the import clause,
or by listing the modules after the import keyword, like here:
import math, sys
The instruction imports two modules, first the one named math and then the second named sys.
Don't worry, we won't go into great detail - this explanation is going to be as short as possible.
A namespace is a space (understood in a non-physical context) in which some names exist and the
names don't conflict with each other (i.e., there are not two different objects of the same name). We
can say that each social group is a namespace - the group tends to name each of its members in a
unique way (e.g., parents won't give their children the same first names).
This uniqueness may be achieved in many ways, e.g., by using nicknames along with the first names (it
will work inside a small group like a class in a school) or by assigning special identifiers to all members of
the group (the US Social Security Number is a good example of such practice).
Inside a certain namespace, each name must remain unique. This may mean that some names may
disappear when any other entity of an already known name enters the namespace. We'll show you how
it works and how to control it, but first, let's return to imports.
If the module of a specified name exists and is accessible (a module is in fact a Python source file),
Python imports its contents, i.e., all the names defined in the module become known, but they don't
enter your code's namespace.
This means that you can have your own entities named sin or pi and they won't be affected by the
import in any way.
At this point, you may be wondering how to access the pi coming from the math module.
To do this, you have to qualify the pi with the name of its original module.
Importing a module | math
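The qualified name consists of the module's name (math here), a dot, and the entity's name (pi here),
e.g., math.pi.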
Such a form clearly indicates the namespace in which the name exists.
Note: using this qualification is compulsory if a module has been imported by the import module
instruction. It doesn't matter if any of the names from your code and from the module's namespace are
in conflict or not.
This first example won't be very advanced - we just want to print the value of sin(π/2).
Note: removing either of the two qualifications will make the code erroneous. There is no other way to
enter math's namespace if you did the following:
import math
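A minimal sketch of the qualified form described above; it uses only the names mentioned in the text (math, pi, and sin()), and the variable result is introduced here just for illustration:

```python
import math

# Both names have to be qualified with the name of their module.
result = math.sin(math.pi / 2)
print(result)
```

Dropping either math. prefix makes Python look for the name in your own namespace, where it doesn't exist.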
Importing a module: continued
Now we're going to show you how the two namespaces (yours and the module's one) can coexist.
Run the program. The code should produce the following output:
0.99999999 1.0
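One possible program consistent with that output is sketched below. The local sin() and pi are deliberately crude, hypothetical definitions - their only purpose is to show that they coexist with the entities from the math namespace:

```python
import math

# Our own sin() - it "knows" just one value (an illustrative assumption).
def sin(x):
    if 2 * x == pi:
        return 0.99999999
    return None

# Our own, less precise pi.
pi = 3.14

own = sin(pi / 2)                    # uses the local names
qualified = math.sin(math.pi / 2)    # uses the module's names

print(own)
print(qualified)
```

The unqualified names refer to your definitions, while the qualified ones always reach into the module.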
Importing a module: continued
In the second method, the import's syntax precisely points out which module's entity (or entities) are
acceptable in the code:
from math import pi
the name or list of names of the entity/entities which are being imported into the namespace.
the listed entities (and only those ones) are imported from the indicated module;
Note: no other entities are imported. Moreover, you cannot import additional entities using a
qualification - a line like this one:
print(math.e)
will cause an error, because the name math itself has not been imported.
Here it is:
from math import sin, pi
print(sin(pi/2))
The output should be the same as previously, as in fact we've used the same entities as before: 1.0.
Copy the code, paste it in the editor, and run the program.
Does the code look simpler? Maybe, but the look is not the only effect of this kind of import. Let's show
you that.
line 03: make use of the imported entities and get the expected result (1.0)
lines 05 through 11: redefine the meaning of pi and sin - in effect, they supersede the original
(imported) definitions within the code's namespace;
print(sin(pi/2)) # line 14
line 12: carry out the import - the imported symbols supersede their previous definitions within
the namespace;
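The superseding effect can be sketched in a single short program. This is an illustration only: its line numbers don't correspond to the ones quoted above, and the local definitions are hypothetical:

```python
from math import sin, pi

first = sin(pi / 2)      # the imported entities are in use

# Local redefinitions supersede the imported names.
pi = 3.14

def sin(x):
    if 2 * x == pi:
        return 0.99999999
    return None

second = sin(pi / 2)     # our own definitions are in use now

# Repeating the import supersedes the local definitions again.
from math import sin, pi

third = sin(pi / 2)

print(first, second, third)
```

Whichever definition entered the namespace last is the one that an unqualified name refers to.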
Importing a module | * and as
Importing a module: *
In the third method, the import's syntax is a more aggressive form of the previously presented one:
from module import *
As you can see, the name of an entity (or the list of entities' names) is replaced with a single asterisk (*).
Is it convenient? Yes, it is, as it relieves you of the duty of enumerating all the names you need.
Is it unsafe? Yes, it is - unless you know all the names provided by the module, you may not be able to
avoid name conflicts. Treat this as a temporary solution, and try not to use it in regular code.
Aliasing causes the module to be identified under a different name than the original. This may shorten
the qualified names, too.
Creating an alias is done together with importing the module, and demands the following form of the
import instruction:
import module as alias
The "module" identifies the original module's name while the "alias" is the name you wish to use instead
of the original.
Note: as is a keyword.
import math as m
print(m.sin(m.pi/2))
Note: after successful execution of an aliased import, the original module name becomes
inaccessible and must not be used.
In turn, when you use the from module import name variant and you need to change the entity's name,
you make an alias for the entity. This will cause the name to be replaced by the alias you choose.
The phrase name as alias can be repeated - use commas to separate the multiplied phrases, like this:
from module import n as a, m as b, o as c
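A short sketch of both kinds of aliasing, assuming the aliases m, PI, and sine (any valid identifiers would do):

```python
import math as m                        # alias for the whole module
from math import pi as PI, sin as sine  # aliases for single entities

via_module_alias = m.sin(m.pi / 2)
via_entity_alias = sine(PI / 2)

print(via_module_alias)
print(via_entity_alias)

# Neither import binds the name math in this namespace,
# so using it here is expected to raise NameError.
try:
    math.sin(0)
except NameError as error:
    print("NameError:", error)
```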
Now you're familiar with the basics of using modules. Let us show you some modules and some of their
useful entities.
Useful Modules
The dir() function can reveal all the names provided by a particular module. There is one condition:
the module has to have been previously imported as a whole (i.e., using
the import module instruction - from module is not enough).
The function returns an alphabetically sorted list containing all entities' names available in the module
identified by a name passed to the function as an argument:
dir(module)
Note: if the module's name has been aliased, you must use the alias, not the original name.
Using the function inside a regular script doesn't make much sense, but it is still possible.
For example, you can run the following code to print the names of all entities within the math module:
import math
for name in dir(math):
    print(name, end="\t")
Have you noticed these strange names beginning with __ at the top of the list? We'll tell you more about
them when we talk about the issues related to writing your own modules.
Some of the names might bring back memories from math lessons, and you probably won't have any
problems guessing their meanings.
Using the dir() function inside a code may not seem very useful - usually you want to know a particular
module's contents before you write and run the code.
Fortunately, you can execute the function directly in the Python console (IDLE), without needing to
write and run a separate script.
import math
dir(math)
Useful modules | math
We've chosen them arbitrarily, but that doesn't mean that the functions we haven't mentioned here are
any less significant. Dive into the modules' depths yourself - we don't have the space or the time to talk
about everything in detail here.
The first group of math's functions is connected with trigonometry (sin(), cos(), and tan() among them):
All these functions take one argument (an angle measurement expressed in radians) and return the
appropriate result (be careful with tan() - not all arguments are accepted).
The inverse functions (asin(), acos(), and atan()) take one argument (mind the domains) and return a
measure of an angle in radians.
To effectively operate on angle measurements, the math module provides you with the following
entities:
Now look at the code in the editor. The example program isn't very sophisticated, but can you predict its
results?
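As an illustration (not necessarily the program from the editor), here is a minimal sketch that combines the trigonometric functions with pi, radians(), and degrees(). The comparisons use math.isclose() because floating-point results are rarely exactly equal; the 60-degree angle is an arbitrary choice:

```python
from math import pi, radians, degrees, sin, cos, tan, asin, isclose

angle_deg = 60
angle_rad = radians(angle_deg)      # degrees -> radians
back_to_deg = degrees(angle_rad)    # radians -> degrees

print(isclose(back_to_deg, 60.0))
print(isclose(angle_rad, pi / 3))
print(isclose(sin(angle_rad) / cos(angle_rad), tan(angle_rad)))
print(isclose(asin(sin(angle_rad)), angle_rad))
```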
Apart from the circular functions (listed above) the math module also contains a set of their hyperbolic
analogues:
Selected functions from the math module: continued
Another group of the math's functions is formed by functions which are connected with exponentiation:
Look at the code in the editor. Can you predict its output?
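A possible sketch of the exponentiation-related entities, assuming that e, exp(), log(), log2(), and log10() are the ones of interest (pow() used below is the built-in function):

```python
from math import e, exp, log, log2, log10, isclose

checks = [
    isclose(exp(1), e),           # e raised to the power of 1
    isclose(exp(log(5)), 5),      # exp() and log() undo each other
    isclose(log(8, 2), 3),        # logarithm of 8 to base 2
    isclose(log2(8), 3),          # the same, using a dedicated function
    isclose(log10(1000), 3),      # decimal logarithm
    isclose(pow(2, 10), 1024),    # 2 raised to the power of 10
]

for check in checks:
    print(check)
```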
Selected functions from the math module: continued
The last group consists of some general-purpose functions like:
trunc(x) → the value of x truncated to an integer (be careful - it's not an equivalent either of ceil
or floor)
hypot(x, y) → returns the length of the hypotenuse of a right-angle triangle with the leg lengths
equal to x and y (the same as sqrt(pow(x, 2) + pow(y, 2)) but more precise)
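The difference between trunc() and its relatives ceil() and floor() shows up with negative arguments. A minimal sketch, with arbitrarily chosen sample values:

```python
from math import ceil, floor, trunc, hypot

x = 1.4
y = 2.6

floors = [floor(x), floor(y), floor(-x), floor(-y)]   # round down
ceils = [ceil(x), ceil(y), ceil(-x), ceil(-y)]        # round up
truncs = [trunc(x), trunc(y), trunc(-x), trunc(-y)]   # drop the fraction

print(floors)
print(ceils)
print(truncs)
print(hypot(3, 4))   # hypotenuse of a 3-4-5 triangle
```

For positive arguments trunc() agrees with floor(), and for negative ones with ceil().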
Useful Modules | random
Note the prefix pseudo - the numbers generated by the module may look random in the sense that you
cannot predict their subsequent values, but don't forget that they all are calculated using very refined
algorithms.
The algorithms aren't random - they are deterministic and predictable. Only those physical processes
which run completely out of our control (like the intensity of cosmic radiation) may be used as a source
of actual random data. Data produced by deterministic computers cannot be random in any way.
A random number generator takes a value called a seed, treats it as an input value, calculates a
"random" number based on it (the method depends on a chosen algorithm) and produces a new seed
value.
The length of a cycle in which all seed values are unique may be very long, but it isn't infinite - sooner or
later the seed values will start repeating, and the generated values will repeat, too. This is normal. It's a
feature, not a mistake, or a bug.
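The seed-to-seed mechanism and the finite cycle can be seen in a toy generator. Note the assumptions: this is a tiny linear congruential formula chosen for illustration, with hypothetical constants, and it is not the algorithm used by Python's random module:

```python
# A toy generator - NOT the algorithm used by the random module.
MODULUS = 16

def next_seed(seed):
    # Each new seed is calculated from the previous one.
    return (5 * seed + 3) % MODULUS

seed = 0
values = []
for _ in range(MODULUS + 4):
    seed = next_seed(seed)
    values.append(seed)

print(values)
```

With these constants, the first 16 values are all different, and the sequence then starts over from the beginning.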
The initial seed value, set during the program start, determines the order in which the generated values
will appear.
The random factor of the process may be augmented by setting the seed with a number taken from
the current time - this may ensure that each program launch will start from a different seed value (ergo,
it will use different random numbers).
The example program in the editor will produce five pseudorandom values - as their values are
determined by the current (rather unpredictable) seed value, you can't guess them. Run the program.
The seed() function is able to directly set the generator's seed. We'll show you two of its variants:
We've modified the previous program - in effect, we've removed any trace of randomness from the
code:
from random import random, seed
seed(0)
for i in range(5):
    print(random())
Due to the fact that the seed is always set with the same value, the sequence of generated values
always looks the same.
And you?
Note: your values may be slightly different than ours if your system uses more precise or less precise
floating-point arithmetic, but the difference will be seen quite far from the decimal point.
Selected functions from the random module: continued
If you want integer random values, one of the following functions would fit better:
randrange(end)
randrange(beg, end)
randrange(beg, end, step)
randint(left, right)
The first three invocations will generate an integer taken (pseudorandomly) from the range
(respectively):
range(end)
range(beg, end)
range(beg, end, step)
The last function is an equivalent of randrange(left, right+1) - it generates the integer value i, which falls
in the range [left, right] (no exclusion on the right side).
Look at the code in the editor. This sample program will consistently output a line consisting of three
zeros and either a zero or one at the fourth place.
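A sketch of a program matching that description (the list named draws is introduced here for illustration only):

```python
from random import randrange, randint

draws = [
    randrange(1),         # taken from range(1), i.e., always 0
    randrange(0, 1),      # taken from range(0, 1), i.e., always 0
    randrange(0, 1, 1),   # taken from range(0, 1, 1), i.e., always 0
    randint(0, 1),        # 0 or 1 - the right end is included
]

print(*draws)
```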
Look at the code in the editor. The program very likely outputs a set of numbers in which some elements
are not unique.
As you can see, this is not a good tool for generating numbers in a lottery. Fortunately, there is a better
solution than writing your own code to check the uniqueness of the "drawn" numbers.
choice(sequence)
sample(sequence, elements_to_choose)
The first variant chooses a "random" element from the input sequence and returns it.
The second one builds a list (a sample) consisting of elements_to_choose elements "drawn" from the
input sequence (the number of elements is a required argument).
In other words, the function chooses some of the input elements, returning a list with the choice. The
elements in the sample are placed in random order. Note: the elements_to_choose must not be greater
than the length of the input sequence.
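A minimal sketch of both functions, assuming a list of the integers from 1 to 10 as the input sequence:

```python
from random import choice, sample

my_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

one_element = choice(my_list)       # a single element
five_elements = sample(my_list, 5)  # five different elements
all_elements = sample(my_list, 10)  # the whole list in random order

print(one_element)
print(five_elements)
print(all_elements)
```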
Again, the output of the program is not predictable. Our results looked like this:
4
[3, 1, 8, 9, 10]
[10, 8, 5, 1, 6, 4, 3, 9, 7, 2]
Sometimes, it may be necessary to find out information unrelated to Python. For example, you may
need to know the location of your program within the greater environment of the computer.
Python (more precisely - its runtime environment) lies directly below it;
the next layer of the pyramid is filled with the OS (operating system) - Python's environment
provides some of its functionalities using the operating system's services; Python, although very
powerful, isn't omnipotent - it's forced to use many helpers if it's going to process files or
communicate with physical devices;
the bottom-most layer is hardware - the processor (or processors), network interfaces, human
interface devices (mice, keyboards, etc.) and all other machinery needed to make the computer
run; the OS knows how to drive it, and uses lots of tricks to conduct all parts in a consistent
rhythm.
This means that some of your (or rather your program's) actions have to travel a long way to be
successfully performed - imagine that:
Python accepts the order, rearranges it to meet local OS requirements (it's like putting the
stamp "approved" on your request) and sends it down (this may remind you of a chain of
command)
the OS checks if the request is reasonable and valid (e.g., whether the file name conforms to
some syntax rules) and tries to create the file; such an operation, seemingly very simple, isn't
atomic - it consists of many minor steps taken by...
the hardware, which is responsible for activating storage devices (hard disk, solid state devices,
etc.) to satisfy the OS's needs.
Usually, you're not aware of all that fuss - you want the file to be created and that's that.
But sometimes you want to know more - for example, the name of the OS which hosts Python, and
some characteristics describing the hardware that hosts the OS.
There is a module providing some means to allow you to know where you are and what components
work for you. The module is named platform. We'll show you some of the functions it provides to you.
There is a function that can show you all the underlying layers in one glance, named platform, too. It just
returns a string describing the environment; thus, its output is addressed to humans rather than to
automated processing (you'll see it soon).
This is how you can invoke it: platform(aliased = False, terse = False)
And now:
aliased → when set to True (or any non-zero value) it may cause the function to present the
alternative underlying layer names instead of the common ones;
terse → when set to True (or any non-zero value) it may convince the function to present a
briefer form of the result (if possible)
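A minimal sketch invoking the function in its three variants; the exact strings depend entirely on the machine running the code, so no particular output is assumed:

```python
from platform import platform

default_form = platform()
aliased_form = platform(aliased=True)
terse_form = platform(terse=True)

print(default_form)
print(aliased_form)
print(terse_form)
```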
We ran our sample program using three different platforms - this is what we got:
Linux-3.18.62-g6-x86_64-Intel-R-_Core-TM-_i3-2330M_CPU_@_2.20GHz-with-glibc2.3.4
You can also run the sample program in IDLE on your local machine to check what output you will have.
x86
x86_64
armv7l
x86
Intel(R) Core(TM) i3-2330M CPU @ 2.20GHz
armv7l
Windows
Linux
Linux
Run the code and check its output. This is what we got:
6.0.6002
Selected functions from the platform module: continued
If you need to know what version of Python is running your code, you can check it using a number of
dedicated functions - here are two of them:
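Two such functions available in the platform module are python_implementation() and python_version_tuple(); whether these are the exact two meant here is an assumption. A minimal sketch:

```python
from platform import python_implementation, python_version_tuple

implementation = python_implementation()   # a string, e.g., 'CPython'
version_parts = python_version_tuple()     # major, minor, patch level (strings)

print(implementation)
for part in version_parts:
    print(part)
```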
Useful Modules
Moreover, the Python community all over the world creates and maintains hundreds of additional
modules used in very niche applications like genetics, psychology, or even astrology.
These modules aren't (and won't be) distributed along with Python, or through official channels, which
makes the Python universe broader - almost infinite.
You can read about all standard Python modules here: https://docs.python.org/3/py-modindex.html.
Don't worry - you won't need all these modules. Many of them are very specific.
All you need to do is find the modules you want, and teach yourself how to use them. It's easy.
In the next section we'll take a look at something else. We're going to show you how to write your own
module.
Modules and Packages
What is a package?
Writing your own modules doesn't differ much from writing ordinary scripts.
There are some specific aspects you must be aware of, but it definitely isn't rocket science. You'll see
this soon enough.
a module is a kind of container filled with functions - you can pack as many functions as you
want into one module and distribute it across the world;
of course, it's generally a good idea not to mix functions with different application areas within
one module (just like in a library - nobody expects scientific works to be put among comic
books), so group your functions carefully and name the module containing them in a clear and
intuitive way (e.g., don't give the name arcade_games to a module containing functions
intended to partition and format hard disks)
making many modules may cause a little mess - sooner or later you'll want to group your
modules exactly in the same way as you've previously grouped functions - is there a more
general container than a module?
yes, there is - it's a package; in the world of modules, a package plays a similar role to a
folder/directory in the world of files.
In this section you're going to be working locally on your machine. Let's start from scratch, just like this:
You need two files to repeat these experiments. One of them will be the module itself. It's empty now.
Don't worry, you're going to fill it with actual code.
We've named the file module.py. Not very creative, but simple and clear.
The second file contains the code using the new module. Its name is main.py.
Note: both files have to be located in the same folder. We strongly encourage you to create an empty,
new folder for both files. Some things will be easier then.
Launch IDLE and run the main.py file. What do you see?
You should see nothing. This means that Python has successfully imported the contents of
the module.py file. It doesn't matter that the module is empty for now. The very first step has been
done, but before you take the next step, we want you to take a look into the folder in which both files
exist.
A new subfolder has appeared - can you see it? Its name is __pycache__. Take a look inside. What do
you see?
There is a file named (more or less) module.cpython-xy.pyc where x and y are digits derived from your
version of Python (e.g., they will be 3 and 4 if you use Python 3.4).
The name of the file is the same as your module's name (module here). The part after the first dot says
which Python implementation has created the file (CPython here) and its version number. The last part
(pyc) comes from the words Python and compiled.
You can look inside the file - the content is completely unreadable to humans. It has to be like that, as
the file is intended for Python's use only.
When Python imports a module for the first time, it translates its contents into a somewhat compiled
shape. The file doesn't contain machine code - it's internal Python semi-compiled code, ready to be
executed by Python's interpreter. As such a file doesn't require lots of the checks needed for a pure
source file, the module loads faster (once loaded, the code itself doesn't run any faster).
Thanks to that, every subsequent import will go quicker than interpreting the source text from scratch.
Python is able to check if the module's source file has been modified (in this case, the pyc file will be
rebuilt) or not (when the pyc file may be run at once). As this process is fully automatic and transparent,
you don't have to keep it in mind.
Can you notice any differences between a module and an ordinary script? There are none so far.
It's possible to run this file like any other script. Try it for yourself.
What happens? You should see the following line inside your console:
I like to be a module.
Run it. What do you see? Hopefully, you see something like this:
I like to be a module.
When a module is imported, its content is implicitly executed by Python. It gives the module the chance
to initialize some of its internal aspects (e.g., it may assign some variables with useful values). Note:
the initialization takes place only once, when the first import occurs, so the assignments done by the
module aren't repeated unnecessarily.
Imagine the following situation:
there is a module named mod1;
there is a module named mod2 which contains the import mod1 instruction;
there is a main file containing the import mod1 and import mod2 instructions.
At first glance, you may think that mod1 will be imported twice - fortunately, only the first import
occurs. Python remembers the imported modules and silently omits all subsequent imports.
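One way to observe this behavior is to look at sys.modules, the dictionary in which Python keeps the modules it has already loaded. The sketch below uses the math module as a stand-in for mod1:

```python
import sys
import math
import math             # no second load - the module is already known

remembered = sys.modules["math"]   # the object Python has stored

import math as again    # yet another import of the same module

print(remembered is math)
print(again is math)
```

Every import of the same name ends up referring to one and the same module object.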
Python also creates a variable called __name__. Moreover, each source file uses its own, separate
version of the variable - it isn't shared between modules.
We'll show you how to use it. Modify the module a bit:
Now run the module.py file. You should see the following lines:
I like to be a module
__main__
Now run the main.py file. And? Do you see the same as us?
I like to be a module
module
when you run a file directly, its __name__ variable is set to __main__;
when a file is imported as a module, its __name__ variable is set to the file's name
(excluding .py)
This is how you can make use of the __name__ variable in order to detect the context in which your code
has been activated:
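A minimal sketch of such a check. The helper function activation_context() and its messages are hypothetical; wrapping the test in a function just makes both outcomes easy to try:

```python
def activation_context(name):
    # name is expected to hold the value of a file's __name__ variable
    if name == "__main__":
        return "I prefer to be a script."
    return "I like to be a module."

# Prints the first message when the file is run directly,
# and the second one when the file is imported.
print(activation_context(__name__))
```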
There's a cleverer way to utilize the variable, however. If you write a module filled with a number of
complex functions, you can use it to place a series of tests to check if the functions work properly.
Each time you modify any of these functions, you can simply run the module to make sure that your
amendments didn't spoil the code. These tests will be omitted when the code is imported as a module.
Introducing such a variable is absolutely correct, but may cause important side effects that you must be
aware of.
As you can see, the main file tries to access the module's counter variable. Is this legal? Yes, it is. Is it
usable? It may be very usable. Is it safe? That depends - if you trust your module's users, there's no
problem; however, you may not want the rest of the world to see your personal/private variable.
Unlike many other programming languages, Python has no means of allowing you to hide such variables
from the eyes of the module's users. You can only inform your users that this is your variable, that they
may read it, but that they should not modify it under any circumstances.
This is done by preceding the variable's name with _ (one underscore) or __ (two underscores), but
remember, it's only a convention. Your module's users may obey it or they may not.
Of course, we'll follow the convention. Now let's put two functions into the module - they'll evaluate the
sum and product of the numbers collected in a list.
In addition, let's add some ornaments there and remove any superfluous remnants.
A few elements need some explanation, we think:
a string (maybe a multiline) placed before any module instructions (including imports) is called
the doc-string, and should briefly explain the purpose and contents of the module;
the functions defined inside the module (suml() and prodl()) are available for import;
we've used the __name__ variable to detect when the file is run stand-alone, and seized this
opportunity to perform some simple tests.
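A module built along these lines might look like the sketch below. The function names suml() and prodl() come from the text; the counter's name (_counter), the wording of the doc-string, and the tests are assumptions made for illustration:

```python
"""module.py - a sketch of a small Python module"""

_counter = 0   # one leading underscore: private by convention only


def suml(the_list):
    """Return the sum of the numbers collected in the_list."""
    global _counter
    _counter += 1
    the_sum = 0
    for element in the_list:
        the_sum += element
    return the_sum


def prodl(the_list):
    """Return the product of the numbers collected in the_list."""
    global _counter
    _counter += 1
    prod = 1
    for element in the_list:
        prod *= element
    return prod


if __name__ == "__main__":
    # Simple tests, performed only when the file is run stand-alone.
    my_list = [i + 1 for i in range(5)]
    print(suml(my_list) == 15)
    print(prodl(my_list) == 120)
```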
Now it's possible to use the new module - this is one way:
Let's give up this assumption and conduct the following thought experiment:
we are using Windows ® OS (this assumption is important, as the file name's shape depends on
it)
How to deal with it?
To answer this question, we have to talk about how Python searches for modules. There's a special
variable (actually a list) storing all locations (folders/directories) that are searched in order to find a
module which has been requested by the import instruction.
Python browses these folders in the order in which they are listed in the list - if the module cannot be
found in any of these directories, the import fails.
Otherwise, the first folder containing a module with the desired name will be taken into consideration (if
any of the remaining folders contains a module of that name, it will be ignored).
The variable is named path, and it's accessible through the module named sys. This is how you can check
its regular value:
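A minimal sketch that prints every location from the list, one per line (the loop variable's name is arbitrary):

```python
from sys import path

# Each element is a location searched by the import instruction.
for location in path:
    print(location)
```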
We've launched the code inside the C:\Users\user folder, and we've got:
C:\Users\user
C:\Users\user\AppData\Local\Programs\Python\Python36-32\python36.zip
C:\Users\user\AppData\Local\Programs\Python\Python36-32\DLLs
C:\Users\user\AppData\Local\Programs\Python\Python36-32\lib
C:\Users\user\AppData\Local\Programs\Python\Python36-32
C:\Users\user\AppData\Local\Programs\Python\Python36-32\lib\site-packages
Note: the folder in which the execution starts is listed in the first path's element.
Note once again: there is a zip file listed as one of the path's elements - it's not an error. Python is able
to treat zip files as ordinary folders - this can save lots of storage.
You can solve it by adding a folder containing the module to the path variable (it's fully modifiable).
Note:
we've doubled the \ inside the folder name, because a backslash is used to escape other characters - if
you want to get just a backslash, you have to escape it;
we've used the relative name of the folder - this will work if you start the main.py file directly
from its home folder, and won't work if the current directory doesn't fit the relative path; you
can always use an absolute path, like this:
path.append('C:\\Users\\user\\py\\modules')
we've used the append() method - in effect, the new path will occupy the last element in the
path list; if you don't like the idea, you can use insert() instead.
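A sketch of the insert() variant, reusing the absolute folder name shown above (the folder itself is hypothetical and doesn't have to exist for the list to be modified):

```python
from sys import path

modules_dir = 'C:\\Users\\user\\py\\modules'

# A raw string is another way of writing the same text
# without doubling the backslashes.
same_dir = r'C:\Users\user\py\modules'
print(modules_dir == same_dir)

# insert(0, ...) puts the folder at the front of the list,
# so it is searched before all the other locations.
path.insert(0, modules_dir)
print(path[0])
```

Placing the folder first gives your modules priority over identically named modules found later in the list, so use it with care.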
Imagine that in the not-so-distant future you and your associates write a large number of Python
functions.
Your team decides to group the functions in separate modules, and this is the final result of the
ordering:
Note: we've presented the whole content for the omega module only - assume that all the modules look
similar (they contain one function named funX, where X is the first letter of the module's name).
Suddenly, somebody notices that these modules form their own hierarchy, so putting them all in a flat
structure won't be a good idea.
After some discussion, the team comes to the conclusion that the modules have to be grouped. All
participants agree that the following tree structure perfectly reflects the mutual relationships between
the modules:
the good group contains two modules (alpha and beta) and one subgroup (best)
the extra group contains two subgroups (good and bad) and one module (iota)
Does it look bad? Not at all - analyze the structure carefully. It resembles something, doesn't it?
Your first package: continued
This is how the tree currently looks:
Such a structure is almost a package (in the Python sense). It lacks the fine detail to be both functional
and operative. We'll complete it in a moment.
If you assume that extra is the name of a newly created package (think of it as the package's root), it
will impose a naming rule which allows you to clearly name every entity from the tree.
For example:
the location of a function named funT() from the tau module may be described as:
extra.good.best.tau.funT()
how do you transform such a tree (actually, a subtree) into a real Python package (in other
words, how do you convince Python that such a tree is not just a bunch of junk files, but a set of
modules)?
The first question has a surprising answer: packages, like modules, may require initialization.
The initialization of a module is done by unbound code (code that is not a part of any function) located
inside the module's file. As a package is not a file, this technique is useless for initializing packages.
You need to use a different trick instead - Python expects that there is a file with a unique name
inside the package's folder: __init__.py.
The content of the file is executed when any of the package's modules is imported. If you don't want any
special initializations, you can leave the file empty, but you mustn't omit it.
Note: it's not only the root folder that can contain the __init__.py file - you can put it inside any of its
subfolders (subpackages) too. It may be useful if some of the subpackages require individual treatment
and special kinds of initialization.
Now it's time to answer the second question - the answer is simple: anywhere. You only have to ensure
that Python is aware of the package's location. You already know how to do that.
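The whole procedure can be sketched in one self-contained script. This is only an illustration under stated assumptions: the tree is created in a temporary folder, it contains just one branch (extra/good/best/tau.py), and funT() returns a text instead of doing any real work.

```python
import os
import sys
import tempfile

# Build a throwaway package tree: <tmp>/extra/good/best/tau.py
root = tempfile.mkdtemp()
best_dir = os.path.join(root, 'extra', 'good', 'best')
os.makedirs(best_dir)

# Every package level gets its own (here: empty) __init__.py file.
for folder in ('extra',
               os.path.join('extra', 'good'),
               os.path.join('extra', 'good', 'best')):
    open(os.path.join(root, folder, '__init__.py'), 'w').close()

# The module itself contains one function (its body is made up).
with open(os.path.join(best_dir, 'tau.py'), 'w') as module_file:
    module_file.write('def funT():\n    return "Tau"\n')

# Make Python aware of the package's location, then import by full name.
sys.path.append(root)
import extra.good.best.tau

print(extra.good.best.tau.funT())
```

The import uses the fully qualified name, starting from the package's root (extra).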
We've prepared a zip file containing all the files from the packages branch. You can download it and use
it for your own experiments, but remember to unpack it in the folder presented in the scheme,
otherwise, it won't be accessible to the code from the main file.
Note:
the import doesn't point directly to the module, but specifies the fully qualified path from the
top of the package;
You can make your life easier by using aliasing:
Let's assume that we've zipped the whole subdirectory, starting from the extra folder (including it), and
let's get a file named extrapack.zip. Next, we put the file inside the packages folder.
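Importing from a zip file can be tried out with the sketch below. To stay self-contained it does not use the course's package: it builds a tiny archive with a hypothetical package named demo_pkg (one module, iota, with a made-up function funI()) and stores it in a temporary folder.

```python
import os
import sys
import tempfile
import zipfile

# Create a zip archive holding a tiny package:
#   demo_pkg/__init__.py  and  demo_pkg/iota.py
zip_path = os.path.join(tempfile.mkdtemp(), 'extrapack.zip')
with zipfile.ZipFile(zip_path, 'w') as archive:
    archive.writestr('demo_pkg/__init__.py', '')
    archive.writestr('demo_pkg/iota.py', 'def funI():\n    return "Iota"\n')

# The zip file itself (not a folder) goes onto the path.
sys.path.append(zip_path)

# Aliasing keeps the later calls short.
import demo_pkg.iota as iota

print(iota.funI())
```

From the importing code's point of view, nothing changes: the zip file plays the role of the folder that would otherwise contain the package.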
If you want to conduct your own experiments with the package we've created, you can download it
below. We encourage you to do so.
Now you can create modules and combine them into packages. It's time to start a completely different
discussion - about errors, failures and crashes.
Errors - the programmer
This is Murphy's law, and it works everywhere and always. Your code's execution can go wrong, too. If
it can, it will.
Look at the code in the editor. There are at least two possible ways it can "go wrong". Can you see them?
As a user is able to enter a completely arbitrary string of characters, there is no guarantee that
the string can be converted into a float value - this is the first vulnerability of the code;
the second is that the sqrt() function fails if it gets a negative argument.
Can you protect yourself from such surprises? Of course you can. Moreover, you have to do it in order to
be considered a good programmer.
Exceptions
Each time your code tries to do something wrong/foolish/irresponsible/crazy/unenforceable, Python
does two things:
Both of these activities are called raising an exception. We can say that Python always raises an
exception (or that an exception has been raised) when it has no idea what to do with your code.
the raised exception expects somebody or something to notice it and take care of it;
if nothing happens to take care of the raised exception, the program will be forcibly terminated,
and you will see an error message sent to the console by Python;
otherwise, if the exception is taken care of and handled properly, the suspended program can
be resumed and its execution can continue.
Python provides effective tools that allow you to observe exceptions, identify them and handle them
efficiently. This is possible due to the fact that all potential exceptions have their unambiguous names,
so you can categorize them and react appropriately.
You know some exception names already.
The word in red is just the exception name. Let's get familiar with some other exceptions.
Exceptions: continued
Look at the code in the editor. Run the (obviously incorrect) program.
You will see the following message in reply:
Traceback (most recent call last):
  File "div.py", line 2, in <module>
    value /= 0
ZeroDivisionError: division by zero
Exceptions: continued
Look at the code in the editor. What will happen when you run it? Check.
Exceptions: continued
How do you handle exceptions? The word try is key to the solution.
But wouldn't it be better to check all circumstances first and then do something only if it's safe?
Admittedly, this way may seem to be the most natural and understandable, but in reality, this method
doesn't make programming any easier. All these checks can make your code bloated and illegible.
Exceptions: continued
Look at the code in the editor. This is the favorite Python approach.
Note:
the try keyword begins a block of the code which may or may not be performing correctly;
next, Python tries to perform the risky action; if it fails, an exception is raised and Python starts
to look for a solution;
the except keyword starts a piece of code which will be executed if anything inside the try block
goes wrong - if an exception is raised inside the preceding try block, the execution lands here, so the
code located after the except keyword should provide an adequate reaction to the raised exception;
returning to the previous nesting level ends the try-except section.
in the first step, Python tries to perform all instructions placed between
the try: and except: statements;
if nothing is wrong with the execution and all instructions are performed successfully, the
execution jumps to the point after the last line of the except: block, and the block's execution is
considered complete;
if anything goes wrong between try: and except:, the execution immediately jumps out of
the block and into the first instruction located after the except: keyword; this means that some
of the instructions from the block may be silently omitted.
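The flow described above can be traced with a short sketch. The function name reciprocal() and the messages are made up for this illustration.

```python
def reciprocal(text):
    try:
        value = int(text)        # may raise ValueError
        result = 1 / value       # may raise ZeroDivisionError
        # The next line is silently omitted if anything above fails.
        print('Computed without problems.')
    except:
        # Execution jumps here as soon as something goes wrong.
        result = None
    # The try-except section has ended; execution continues normally.
    return result


print(reciprocal('4'))
print(reciprocal('0'))
print(reciprocal('four'))
```

Only the first call prints the "Computed without problems." line; the other two calls skip it and return None.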
Exceptions: continued
Look at the code in the editor. It will help you understand this mechanism.
Exceptions: continued
This approach has one important disadvantage - if there is a possibility that more than one exception
may skip into an except: branch, you may have trouble figuring out what actually happened.
Just like in our code in the editor. Run it and see what happens.
The message: Oh dear, something went wrong... appearing in the console says nothing about the
reason, while there are two possible causes of the exception:
build two consecutive try-except blocks, one for each possible exception reason (easy, but will
cause unfavorable code growth)
if the try branch raises the exc1 exception, it will be handled by the except exc1: block;
similarly, if the try branch raises the exc2 exception, it will be handled by the except exc2: block;
if the try branch raises any other exception, it will be handled by the unnamed except block.
Let's move on to the next part of the course and see it in action.
Exceptions: continued
Look at the code in the editor. Our solution is there.
The code, when run, produces one of the following four variants of output:
0.2
THE END.
(locally on your machine) if you press Ctrl-C while the program is waiting for the user's input
(which causes an exception named KeyboardInterrupt), the program says:
Exceptions: continued
Don't forget that:
the except branches are searched in the same order in which they appear in the code;
you must not use more than one except branch with a certain exception name;
the number of different except branches is arbitrary - the only condition is that if you use try,
you must put at least one except (named or not) after it;
if none of the specified except branches matches the raised exception, the exception remains
unhandled (we'll discuss it soon)
if an unnamed except branch exists (one without an exception name), it has to be specified as
the last.
try:
    :
except exc1:
    :
except exc2:
    :
except:
    :
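A concrete instance of this scheme might look as follows. It is only a sketch: the function name describe() and the first two messages are assumptions made for the example, and the code returns its messages instead of printing them directly.

```python
def describe(text):
    try:
        return str(1 / int(text))
    except ValueError:
        # int() could not convert the argument
        return 'You must enter an integer value.'
    except ZeroDivisionError:
        # the argument was converted, but it was zero
        return 'You must not enter zero.'
    except:
        # anything else; the unnamed branch has to be the last one
        return 'Oh dear, something went wrong...'


print(describe('5'))
print(describe('abc'))
print(describe('0'))
print(describe(None))
```

The last call passes None, which int() rejects with a TypeError - no named branch matches, so the unnamed one takes over.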
Look at the code in the editor. We've modified the previous program - we've removed
the ZeroDivisionError branch.
As there are no dedicated branches for division by zero, the raised exception falls into the general
(unnamed) branch; this means that in this case, the program will say:
Oh dear, something went wrong...
THE END.
Exceptions: continued
Let's spoil the code once again.
Look at the program in the editor. This time, we've removed the unnamed branch.
the exception raised won't be handled by ValueError - it has nothing to do with it;
Traceback (most recent call last):
  File "exc.py", line 3, in <module>
    y = 1 / x
ZeroDivisionError: division by zero
You've learned a lot about exception handling in Python. In the next section, we will focus on Python
built-in exceptions and their hierarchies.
The anatomy of exceptions
Exceptions
Python 3 defines 63 built-in exceptions, and all of them form a tree-shaped hierarchy, although the tree
is a bit weird as its root is located on top.
Some of the built-in exceptions are more general (they include other exceptions) while others are
completely concrete (they represent themselves only). We can say that the closer to the root an
exception is located, the more general (abstract) it is. In turn, the exceptions located at the branches'
ends (we can call them leaves) are concrete.
It shows a small section of the complete exception tree. Let's begin examining the tree from
the ZeroDivisionError leaf.
Note:
ArithmeticError is a special case of a more general exception class named just Exception;
We can describe it in the following way (note the direction of the arrows - they always point to the more
general entity):
We're going to show you how this generalization works. Let's start with some really simple code.
Exceptions: continued
Look at the code in the editor. It is a simple example to start with. Run it.
You already know that ArithmeticError is a general class including (among others)
the ZeroDivisionError exception.
This also means that replacing the exception's name with either Exception or BaseException won't
change the program's behavior.
Let's summarize:
the matching branch doesn't have to specify the same exception exactly - it's enough that the
exception is more general (more abstract) than the raised one.
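The relationships can be checked directly with the built-in issubclass() function, as in this small sketch (the divide() function is made up for the example):

```python
# Each exception is a special case of the one to its right.
print(issubclass(ZeroDivisionError, ArithmeticError))
print(issubclass(ArithmeticError, Exception))
print(issubclass(Exception, BaseException))


def divide(a, b):
    try:
        return a / b
    except ArithmeticError:
        # This branch is more general than the raised ZeroDivisionError,
        # and that is enough for it to match.
        return 'Arithmetic problem!'


print(divide(1, 0))
```

All three checks print True, and the last line prints the message from the ArithmeticError branch.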
Exceptions: continued
Look at the code in the editor. What will happen here?
The first matching branch is the one containing ZeroDivisionError. It means that the console will show:
Zero division!
THE END.
Will it change anything if we swap the two except branches around? Just like here below:
try:
    y = 1 / 0
except ArithmeticError:
    print("Arithmetic problem!")
except ZeroDivisionError:
    print("Zero Division!")

print("THE END.")
The exception is the same, but the more general exception is now listed first - it will catch all zero
divisions too. It also means that there's no chance that any exception hits the ZeroDivisionError branch.
This branch is now completely unreachable.
Remember:
Exceptions: continued
If you want to handle two or more exceptions in the same way, you can use the following syntax:
try:
    :
except (exc1, exc2):
    :
You simply have to put all the relevant exception names into a comma-separated list and not forget
the parentheses.
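For instance, a hypothetical helper that reacts identically to a bad conversion and to a division by zero could be sketched like this:

```python
def to_reciprocal(text):
    try:
        return 1 / float(text)
    except (ValueError, ZeroDivisionError):
        # both failures get the same reaction
        return None


print(to_reciprocal('4'))
print(to_reciprocal('0'))
print(to_reciprocal('x'))
```

Leaving out the parentheses is a syntax error in Python 3, so they are not optional.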
Let's start with the first variant - look at the code in the editor.
The ZeroDivisionError exception (being a concrete case of the ArithmeticError exception class) is raised
inside the badfun() function, and it doesn't leave the function - the function itself takes care of it.
It's also possible to let the exception propagate outside the function. Let's test it now.
The problem has to be solved by the invoker (or by the invoker's invoker, and so on).
Note: the exception raised can cross function and module boundaries, and travel through the
invocation chain looking for a matching except clause able to handle it.
If there is no such clause, the exception remains unhandled, and Python solves the problem in its
standard way - by terminating your code and emitting a diagnostic message.
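The journey of an exception through the invocation chain can be sketched with three small functions. Their names (bad_fun, middle, invoker) and the message are assumptions made for this example.

```python
def bad_fun(n):
    return 1 / n             # raises ZeroDivisionError when n == 0


def middle(n):
    return bad_fun(n) + 1    # no try-except here: the exception passes through


def invoker(n):
    try:
        return middle(n)
    except ArithmeticError:
        # the matching clause is found two levels above the failure
        return 'Handled by the invoker.'


print(invoker(1))
print(invoker(0))
```

If invoker() had no try-except either, the exception would travel further up and, finding no handler, terminate the program.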
Now we're going to suspend this discussion, as we want to introduce you to a brand new Python
instruction.
Exceptions: continued
The raise instruction raises the specified exception named exc as if it was raised in a normal (natural)
way:
raise exc
partially handle an exception and make another part of the code responsible for completing the
handling (separation of concerns).
Look at the code in the editor. This is how you can use it in practice.
In this way, you can test your exception handling routine without forcing the code to do stupid things.
Exceptions: continued
The raise instruction may also be utilized in the following way (note the absence of the exception's
name):
raise
There is one serious restriction: this kind of raise instruction may be used inside the except branch only;
using it in any other context causes an error.
The instruction will immediately re-raise the same exception as currently handled.
Thanks to this, you can distribute the exception handling among different parts of the code.
first, inside the try part of the code (this is caused by actual zero division)
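A minimal sketch of such distributed handling is shown below; the function name and the messages are made up for the example.

```python
def bad_fun(n):
    try:
        return n / 0
    except ZeroDivisionError:
        print('I did it again!')   # partial handling: report the problem...
        raise                      # ...then re-raise the same exception


try:
    bad_fun(0)
    outcome = 'no exception'
except ArithmeticError:
    # the re-raised exception arrives here
    outcome = 'I see!'

print(outcome)
```

The function reports the problem, and the invoker decides what to do about it.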
Exceptions: continued
Now is a good moment to show you another Python instruction, named assert. This is a keyword.
assert expression
if the expression evaluates to True, or a non-zero numerical value, or a non-empty string, or any
other value that Python treats as true, it won't do anything else;
otherwise, it automatically and immediately raises an exception named AssertionError (in this
case, we say that the assertion has failed)
you may want to put it into your code where you want to be absolutely safe from evidently
wrong data, and where you aren't absolutely sure that the data has been carefully examined
before (e.g., inside a function used by someone else)
raising an AssertionError exception secures your code from producing invalid results, and clearly
shows the nature of the failure;
assertions don't supersede exceptions or validate the data - they are their supplements.
If exceptions and data validation are like careful driving, assertion can play the role of an airbag.
Let's see the assert instruction in action. Look at the code in the editor. Run it.
The program runs flawlessly if you enter a valid numerical value greater than or equal to zero;
otherwise, it stops and emits the following message:
Traceback (most recent call last):
  File ".main.py", line 4, in <module>
    assert x >= 0.0
AssertionError
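The same guard can be placed inside a function used by someone else. The sketch below uses a hypothetical safe_sqrt() function and assumes that Python runs without the -O option, which disables assertions.

```python
import math


def safe_sqrt(x):
    # the assertion guards against evidently wrong data
    assert x >= 0.0
    return math.sqrt(x)


print(safe_sqrt(16.0))

try:
    safe_sqrt(-1.0)
    failed = False
except AssertionError:
    failed = True

print(failed)
```

Catching AssertionError is done here only to show that the assertion fired; normally a failed assertion is allowed to stop the program.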
Useful exceptions
Built-in exceptions
We're going to show you a short list of the most useful exceptions. While it may sound strange to call
"useful" a thing or a phenomenon which is a visible sign of failure or setback, as you know, to err is
human and if anything can go wrong, it will go wrong.
Exceptions are as routine and normal as any other aspect of a programmer's life.
its name;
a short description;
a concise snippet of code showing the circumstances in which the exception may be raised.
There are lots of other exceptions to explore - we simply don't have the space to go through them all
here.
ArithmeticError
Location:
BaseException ← Exception ← ArithmeticError
Description:
an abstract exception including all exceptions caused by arithmetic operations like zero division or an
argument's invalid domain
AssertionError
Location:
BaseException ← Exception ← AssertionError
Description:
a concrete exception raised by the assert instruction when its argument evaluates to False, None, 0, or
an empty string
Code:
from math import tan, radians
angle = int(input('Enter integral angle in degrees: '))
# we must be sure that angle != 90 + k * 180
assert angle % 180 != 90
print(tan(radians(angle)))
BaseException
Location:
BaseException
Description:
the most general (abstract) of all Python exceptions - all other exceptions are included in this one; it can
be said that the following two except branches are equivalent: except: and except BaseException:.
IndexError
Location:
BaseException ← Exception ← LookupError ← IndexError
Description:
a concrete exception raised when you try to access a non-existent sequence's element (e.g., a list's
element)
Code:
# the code shows an extravagant way
# of leaving the loop
the_list = [1, 2, 3, 4, 5]
ix = 0
doit = True

while doit:
    try:
        print(the_list[ix])
        ix += 1
    except IndexError:
        doit = False

print('Done')
KeyboardInterrupt
Location:
BaseException ← KeyboardInterrupt
Description:
a concrete exception raised when the user uses a keyboard shortcut designed to terminate a program's
execution (Ctrl-C in most OSs); if handling this exception doesn't lead to program termination, the
program continues its execution. Note: this exception is not derived from the Exception class. Run the
program in IDLE.
Code:
# this code cannot be terminated
# by pressing Ctrl-C
from time import sleep

seconds = 0

while True:
    try:
        print(seconds)
        seconds += 1
        sleep(1)
    except KeyboardInterrupt:
        print("Don't do that!")
LookupError
Location:
BaseException ← Exception ← LookupError
Description:
an abstract exception including all exceptions caused by errors resulting from invalid references to
different collections (lists, dictionaries, tuples, etc.)
MemoryError
Location:
BaseException ← Exception ← MemoryError
Description:
a concrete exception raised when an operation cannot be completed due to a lack of free memory
Code:
# this code causes the MemoryError exception
# warning: executing this code may exhaust the memory
# and destabilize your OS
# don't run it in production environments!
string = 'x'

try:
    while True:
        string = string + string
        print(len(string))
except MemoryError:
    print('This is not funny!')
OverflowError
Location:
BaseException ← Exception ← ArithmeticError ← OverflowError
Description:
a concrete exception raised when an operation produces a number too big to be successfully stored
Code:
# the code prints subsequent
# values of exp(k), k = 1, 2, 4, 8, 16, ...
from math import exp

ex = 1

try:
    while True:
        print(exp(ex))
        ex *= 2
except OverflowError:
    print('The number is too big.')
ImportError
Location:
BaseException ← Exception ← ImportError
Description:
a concrete exception raised when an import operation fails
Code:
# one of these imports will fail - which one?
try:
    import math
    import time
    import abracadabra
except ImportError:
    print('One of your imports has failed.')
KeyError
Location:
BaseException ← Exception ← LookupError ← KeyError
Description:
a concrete exception raised when you try to access a collection's non-existent element (e.g., a
dictionary's element)
Code:
# how to abuse the dictionary
# and how to deal with it
dictionary = {'a': 'b', 'b': 'c', 'c': 'd'}
ch = 'a'

try:
    while True:
        ch = dictionary[ch]
        print(ch)
except KeyError:
    print('No such key:', ch)
We are done with exceptions for now, but they'll return when we discuss object-oriented programming
in Python. You can use them to protect your code from bad accidents, but you also have to learn how to
dive into them, exploring the information they carry.
Exceptions are in fact objects - however, we can tell you nothing about this aspect until we present you
with classes, objects, and the like.
For the time being, if you'd like to learn more about exceptions on your own, you can look into the
Python Standard Library documentation at https://docs.python.org/3.6/library/exceptions.html.
Reading ints safely
Estimated time
15-25 minutes
Level of difficulty
Medium
Objectives
Scenario
Your task is to write a function able to input integer values and to check if they are within a specified
range.
accept three arguments: a prompt, a low acceptable limit, and a high acceptable limit;
if the user enters a string that is not an integer value, the function should emit the
message Error: wrong input, and ask the user to input the value again;
if the user enters a number which falls outside the specified range, the function should emit the
message Error: the value is not within permitted range (min..max) and ask the user to input the
value again;
Test data
Test your code carefully.
This is how the function should react to the user's input:
Enter a number from -10 to 10: 100
Error: the value is not within permitted range (-10..10)
Enter a number from -10 to 10: asd
Error: wrong input
Enter a number from -10 to 10: 1
The number is: 1
Characters and Strings vs. Computers
All these data must be stored, input, output, searched, and transformed by contemporary computers
just like any other data, no matter if they are single characters or multi-volume encyclopedias.
How is it possible?
How can you do it in Python? This is what we'll discuss now. Let's start with how computers understand
single characters.
Computers store characters as numbers. Every character used by a computer corresponds to a unique
number, and vice versa. This assignment must include more characters than you might expect. Many of
them are invisible to humans, but essential to computers.
Some of these characters are called whitespaces, while others are named control characters, because
their purpose is to control input/output devices.
An example of a whitespace that is completely invisible to the naked eye is a special code, or a pair of
codes (different operating systems may treat this issue differently), which are used to mark the ends of
the lines inside text files.
People do not see this sign (or these signs), but are able to observe the effect of their application where
the lines are broken.
We can create virtually any number of character-number assignments, but life in a world in which every
type of computer uses a different character encoding would not be very convenient. This situation has led
to the need to introduce a universal and widely accepted standard implemented by (almost) all computers
and operating systems all over the world.
The one named ASCII (short for American Standard Code for Information Interchange) is the most
widely used, and you can assume that nearly all modern devices (like computers, printers, mobile
phones, tablets, etc.) use that code.
The code provides space for 256 different characters, but we are interested only in the first 128. If you
want to see how the code is constructed, look at the table below. Look at it
carefully - there are some interesting facts. Look at the code of the most common character - the space.
This is 32.
Now check the code of the lower-case letter a. This is 97. And now find the upper-case A. Its code is 65.
Now work out the difference between the code of a and A. It is equal to 32. That's the code of a space.
Interesting, isn't it?
Also note that the letters are arranged in the same order as in the Latin alphabet.
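These observations can be verified with the built-in ord() and chr() functions, which are discussed later in this module. The snippet is only a quick check, not a part of the original material.

```python
print(ord(' '))               # code of the space
print(ord('a'), ord('A'))     # codes of the lower- and upper-case letter
print(ord('a') - ord('A'))    # the difference between them

# Letters follow the alphabet's order, so their codes are consecutive.
first_five = [chr(code) for code in range(ord('a'), ord('a') + 5)]
print(first_five)
```

The first three lines print 32, then 97 and 65, then 32; the last one prints the letters from a to e.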
I18N
Of course, the Latin alphabet is not sufficient for the whole of mankind. Users of that alphabet are in the
minority. It was necessary to come up with something more flexible and capacious than ASCII,
something able to make all the software in the world amenable to internationalization, because
different languages use completely different alphabets, and sometimes these alphabets are not as
simple as the Latin one.
Why? Look carefully - there is an I at the front of the word, next there are 18 different letters, and
an N at the end.
Despite the slightly humorous origin, the term is officially used in many documents and standards.
Software I18N is standard nowadays. Each program has to be written in a way that enables
it to be used all around the world, among different cultures, languages and alphabets.
A classic form of ASCII code uses eight bits for each sign. Eight bits mean 256 different characters. The
first 128 are used for the standard Latin alphabet (both upper-case and lower-case characters). Is it
possible to push all the other national characters used around the world into the remaining 128
locations?
No. It isn't.
A code point is a number which makes a character. For example, 32 is a code point which makes
a space in ASCII encoding. We can say that standard ASCII code consists of 128 code points.
As standard ASCII occupies 128 out of 256 possible code points, you can only make use of the remaining
128.
It's not enough for all possible languages, but it may be sufficient for one language, or for a small group
of similar languages.
Can you set the higher half of the code points differently for different languages? Yes, you can. Such a
concept is called a code page.
A code page is a standard for using the upper 128 code points to store specific national characters. For
example, there are different code pages for Western Europe and Eastern Europe, Cyrillic and Greek
alphabets, Arabic and Hebrew languages, and so on.
This means that the one and same code point can make different characters when used in different code
pages.
For example, the code point 200 makes Č (a letter used by some Slavic languages) when utilized by the
ISO/IEC 8859-2 code page, and makes Ш (a Cyrillic letter) when used by the ISO/IEC 8859-5 code page.
In consequence, to determine the meaning of a specific code point, you have to know the target code
page.
In other words, the code points derived from the code page concept are ambiguous.
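The ambiguity can be observed in Python itself. The sketch below relies on the bytes type and its decode() method, which are not covered in this section; 'iso8859_2' and 'iso8859_5' are Python's names for the two code pages mentioned above.

```python
raw = bytes([200])    # a single byte holding the code point 200

latin2 = raw.decode('iso8859_2')      # interpreted by ISO/IEC 8859-2
cyrillic = raw.decode('iso8859_5')    # interpreted by ISO/IEC 8859-5

# The same number, two different characters.
print(latin2 == 'Č', cyrillic == 'Ш')
print(latin2 == cyrillic)
```

The first line prints True True, the second one prints False.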
Unicode
Code pages helped the computer industry to solve I18N issues for some time, but it soon turned out that
they would not be a permanent solution.
The concept that solved the problem in the long term was Unicode.
Unicode assigns unique (unambiguous) characters (letters, hyphens, ideograms, etc.) to more than a
million code points. The first 128 Unicode code points are identical to ASCII, and the first 256 Unicode
code points are identical to the ISO/IEC 8859-1 code page (a code page designed for western European
languages).
UCS-4
The Unicode standard says nothing about how to code and store the characters in the memory and files.
It only names all available characters and assigns them to planes (a group of characters of similar origin,
application, or nature).
There is more than one standard describing the techniques used to implement Unicode in actual
computers and computer storage systems. The most general of them is UCS-4.
UCS-4 uses 32 bits (four bytes) to store each character, and the code is just the Unicode code points'
unique number. A file containing UCS-4 encoded text may start with a BOM (byte order mark), an
unprintable combination of bits announcing the nature of the file's contents. Some utilities may require
it.
As you can see, UCS-4 is a rather wasteful standard - it increases a text's size by four times compared to
standard ASCII. Fortunately, there are smarter forms of encoding Unicode texts.
UTF-8
One of the most commonly used is UTF-8.
The concept is very smart. UTF-8 uses as many bits for each of the code points as it really needs to
represent them.
For example:
all Latin characters (and all standard ASCII characters) occupy eight bits;
Due to features of the method used by UTF-8 to store the code points, there is no need to use the BOM,
but some of the tools look for it when reading the file, and many editors set it up during the save.
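The difference in storage needs can be measured with the encode() method. This is a rough sketch: it assumes that Python's 'utf-32-le' codec (which adds no BOM) is an acceptable stand-in for UCS-4, and the sample characters are chosen arbitrarily.

```python
# a, ę, € and an emoji, written as escapes to keep the source pure ASCII
samples = ['a', '\u0119', '\u20ac', '\U0001F600']

utf8_sizes = []
ucs4_sizes = []

for ch in samples:
    utf8_sizes.append(len(ch.encode('utf-8')))        # as many bytes as needed
    ucs4_sizes.append(len(ch.encode('utf-32-le')))    # always four bytes

print(utf8_sizes)
print(ucs4_sizes)
```

UTF-8 needs from one to four bytes per character here, while the fixed-width encoding always uses four.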
you can use Unicode/UTF-8 encoded characters to name variables and other entities;
The nature of strings in Python
First of all, Python's strings (or simply strings, as we're not going to discuss any other language's strings)
are immutable sequences.
It's very important to note this, because it means that you should expect some familiar behavior from
them.
For example, the len() function used for strings returns the number of characters contained in its
argument.
Any string can be empty. Its length is 0 then - just like in Example 2.
Don't forget that a backslash (\) used as an escape character is not included in the string's total length.
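A few quick checks (made up for this illustration) show how len() counts characters:

```python
print(len('abc'))      # three ordinary characters
print(len(''))         # an empty string
print(len('I\'m'))     # the backslash only escapes the apostrophe
print(len('\\'))       # two backslashes in the source, one in the string
```

The snippet prints 3, 0, 3 and 1.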
Multiline strings
Now is a very good moment to show you another way of specifying strings inside the Python source
code. Note that the syntax you already know won't let you use a string occupying more than one line of
text.
Fortunately, for these kinds of strings, Python offers separate, convenient, and simple syntax.
As you can see, the string starts with three apostrophes, not one. The same tripled apostrophe is used
to terminate it.
Count the characters carefully. Is this result correct or not? It looks okay at first glance, but when you
count the characters, it doesn't.
Line #1 contains seven characters. Two such lines comprise 14 characters. Did we lose a character?
Where? How?
No, we didn't.
The missing character is simply invisible - it's a whitespace. It's located between the two text lines.
Do you remember? It's a special (control) character used to force a line feed (hence its name: LF). You
can't see it, but it counts.
The multiline strings can be delimited by triple quotes, too, just like here:
multiLine = """Line #1
Line #2"""
print(len(multiLine))
Choose the method that is more comfortable for you. Both work the same.
Operations on strings
Like other kinds of data, strings have their own set of permissible operations, although they're rather
limited compared to numbers.
concatenated (joined)
replicated.
The first operation is performed by the + operator (note: it's not an addition) while the second by
the * operator (note again: it's not a multiplication).
The ability to use the same operator against completely different kinds of data (like numbers vs. strings)
is called overloading (as such an operator is overloaded with different duties).
The + operator used against two or more strings produces a new string containing all the
characters from its arguments (note: the order matters - this overloaded +, in contrast to its
numerical version, is not commutative)
the * operator needs a string and a number as arguments; in this case, the order doesn't matter
- you can put the number before the string, or vice versa, the result will be the same - a new
string created by the nth replication of the argument's string.
aaaaa
bbbb
Note: shortcut variants of the above operators are also applicable for strings (+= and *=).
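Both operators and their shortcut variants can be tried out with a few lines like these (the variable names are arbitrary):

```python
str1 = 'a'
str2 = 'b'

print(str1 + str2)    # ab
print(str2 + str1)    # ba - the order matters
print(5 * 'a')        # aaaaa
print('b' * 4)        # bbbb

# The shortcut variants create a new string and rebind the name to it.
text = 'ab'
text += 'c'
text *= 2
print(text)           # abcabc
```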
The ord() function needs a one-character string as its argument (breaching this requirement causes
a TypeError exception) and returns a number representing the argument's code point.
Look at the code in the editor, and run it. The snippet outputs:
97
32
Now assign different values to ch1 and ch2, e.g., α (Greek alpha), and ę (a letter in the Polish alphabet);
then run the code and see what result it outputs. Carry out your own experiments.
Operations on strings: chr()
If you know the code point (number) and want to get the corresponding character, you can use a
function named chr().
Note:
chr(ord(x)) == x
ord(chr(x)) == x
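Both identities can be checked for a handful of arbitrarily chosen values:

```python
print(chr(97))               # a
print(chr(945) == 'α')       # True - 945 is the code point of Greek alpha

for ch in 'aZ 9':
    # character -> code point -> the same character
    assert chr(ord(ch)) == ch

for code in (32, 65, 97, 945):
    # code point -> character -> the same code point
    assert ord(chr(code)) == code

print('Both identities hold.')
```

Note that in the first identity x is a one-character string, while in the second it is an integer.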
Strings aren't lists, but you can treat them like lists in many particular cases.
For example, if you want to access any of a string's characters, you can do it using indexing, just like in
the example in the editor. Run the program.
By the way, negative indices behave as expected, too. Check this yourself.
print(ch, end=' ')
print()
Slices
Moreover, everything you know about slices is still usable.
We've gathered some examples showing how slices work in the string world. Look at the code in the
editor, analyze it, and run it.
You won't see anything new in the example, but we want you to be sure that you can explain all the
lines of the code.
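If the editor is not at hand, the following sketch can be used for practice. The sample string is arbitrary; the comments show the results.

```python
alpha = 'abdefg'

print(alpha[0], alpha[-1])   # a g
print(alpha[1:3])            # bd
print(alpha[3:])             # efg
print(alpha[:3])             # abd
print(alpha[3:-2])           # e
print(alpha[::2])            # adf
print(alpha[1::2])           # beg
```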
The result of the check is simply True or False.
Look at the example program in the editor. This is how the in operator works.
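A comparable sketch, with values chosen only for illustration:

```python
alphabet = 'abcdefghijklmnopqrstuvwxyz'

print('f' in alphabet)      # True
print('F' in alphabet)      # False - the check is case-sensitive
print('ghi' in alphabet)    # True - longer substrings can be checked, too
print('Xyz' in alphabet)    # False
```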
Python strings are immutable
We've also told you that Python's strings are immutable. This is a very important feature. What does it
mean?
This primarily means that the similarity of strings and lists is limited. Not everything you can do with a
list may be done with a string.
The first important difference doesn't allow you to use the del instruction to remove anything from a
string.
The only thing you can do with del and a string is to remove the string as a whole. Try to do it.
Python strings don't have the append() method - you cannot expand them in any way.
With the absence of the append() method, the insert() method is illegal, too:
alphabet = "abcdefghijklmnopqrstuvwxyz"
alphabet.insert(0, "A")
The only consequence is that you have to remember about it, and implement your code in a slightly
different way - look at the example code in the editor.
This form of code is fully acceptable, will work without bending Python's rules, and will bring the full
Latin alphabet to your screen:
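The editor code is not included in this text. One possible sketch of the idea, assuming the alphabet is completed by concatenation (each step creates a new string instead of modifying the old one), is:

```python
# Strings cannot be modified in place, but a new string can be
# built and assigned to the same name.
alphabet = 'bcdefghijklmnopqrstuvwxy'

alphabet = 'a' + alphabet  # "prepend" by creating a new string
alphabet = alphabet + 'z'  # "append" by creating a new string
print(alphabet)

# In-place operations are rejected.
try:
    alphabet[0] = 'A'
except TypeError as error:
    print('TypeError:', error)

try:
    alphabet.insert(0, 'A')
except AttributeError as error:
    print('AttributeError:', error)
```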
You may want to ask if creating a new copy of a string each time you modify its contents worsens the
effectiveness of the code.
The function finds the minimum element of the sequence passed as an argument. There is one
condition - the sequence (string, list, it doesn't matter) cannot be empty, or else you'll get
a ValueError exception.
Note: It's an upper-case A. Why? Recall the ASCII table - which letters occupy first locations - upper or
lower?
As you can see, they present more than just strings. The expected output looks as follows:
[ ]
0
Note: we've used the square brackets to prevent the space from being overlooked on your screen.
Now let's see the max() function applied to the same data as previously. Look at Examples 2 & 3 in the
editor.
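The editor examples are not reproduced here. A minimal sketch of min() and max() on sample data of our own (text and numbers are illustrative names) could be:

```python
text = 'aAbByYzZ'

smallest = min(text)  # 'A': upper-case letters have lower code points
largest = max(text)   # 'z': lower-case letters have higher code points
print(smallest, largest)

# min() and max() also work on lists.
numbers = [0, 1, 2]
print(min(numbers), max(numbers))

# An empty sequence raises ValueError.
try:
    min('')
except ValueError as error:
    print('ValueError:', error)
```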
Operations on strings: the index() method
The index() method (it's a method, not a function) searches the sequence from the beginning, in order
to find the first element of the value specified in its argument.
Note: the element searched for must occur in the sequence - its absence will cause
a ValueError exception.
The method returns the index of the first occurrence of the argument (which means that the lowest
possible result is 0, while the highest is the length of the sequence decremented by 1).
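A short sketch of index() on a sample string of our own choosing, including the ValueError raised for a missing element:

```python
text = 'aAbByYzZaA'

print(text.index('b'))  # 2
print(text.index('Z'))  # 7
print(text.index('A'))  # 1: the first of the two occurrences

# A missing element raises ValueError instead of returning a marker value.
try:
    text.index('q')
except ValueError as error:
    print('ValueError:', error)
```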
Operations on strings: the list() function
The list() function takes its argument (a string) and creates a new list containing all the string's
characters, one per list element.
Note: it's not strictly a string function - list() is able to create a new list from many other entities (e.g.,
from tuples and dictionaries).
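A minimal sketch of list() applied to a string, a tuple, and a dictionary; all sample values are ours.

```python
letters = list('abcabc')
print(letters)

# list() accepts other iterables as well.
from_tuple = list((1, 2, 3))
from_dict = list({'x': 10, 'y': 20})  # a dictionary yields its keys
print(from_tuple, from_dict)
```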
Look at the second example in the editor. Can you guess its output?
It is:
2
0
Moreover, Python strings have a significant number of methods intended exclusively for processing
characters. Don't expect them to work with any other collections. The complete list is presented
here: https://docs.python.org/3.4/library/stdtypes.html#string-methods.
We're going to show you the ones we consider the most useful.
String methods
The capitalize() method does exactly what it says - it creates a new string filled with characters taken
from the source string, but it tries to modify them in the following way:
if the first character inside the string is a letter (note: the first character is an element with an
index equal to 0, not just the first visible character), it will be converted to upper-case;
the original string (from which the method is invoked) is not changed in any way (a string's
immutability must be obeyed without reservation)
the modified (capitalized in this case) string is returned as a result - if you don't use it in any way
(assign it to a variable, or pass it to a function/method) it will disappear without a trace.
Note: methods don't have to be invoked from within variables only. They can be invoked directly from
within string literals. We're going to use that convention regularly - it will simplify the examples, as the
most important aspects will not disappear among unnecessary assignments.
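A short sketch of capitalize(), invoked both from a variable and directly from string literals. The sample strings are ours. Note that, besides the rules listed above, capitalize() also converts the remaining letters to lower-case.

```python
original = 'alpha'
capitalized = original.capitalize()

print(capitalized)  # Alpha
print(original)     # alpha: the source string is unchanged

# Invoked directly from string literals:
print('ALPHA'.capitalize())   # Alpha: the remaining letters become lower-case
print(' alpha'.capitalize())  # ' alpha': index 0 holds a space, not a letter
print('123'.capitalize())     # 123: nothing to capitalize
```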
The center() method
The one-parameter variant of the center() method makes a copy of the original string, trying to center it
inside a field of a specified width.
The centering is actually done by adding some spaces before and after the string.
Don't expect this method to demonstrate any sophisticated skills. It's rather simple.
The example in the editor uses brackets to clearly show you where the centered string actually begins
and terminates.
If the target field's length is too small to fit the string, the original string is returned.
Run the snippets above and check what output they produce.
The two-parameter variant of center() makes use of the character from the second argument, instead
of a space. Analyze the example below:
print('[' + 'gamma'.center(20, '*') + ']')
The endswith() method
The endswith() method checks if the given string ends with the specified argument and
returns True or False, depending on the check result.
Note: the substring must adhere to the string's last character - it cannot just be located somewhere near
the end of the string.
Look at our example in the editor, analyze it, and run it. It outputs:
yes
You should now be able to predict the output of the snippet below:
t = "zeta"
print(t.endswith("a"))
print(t.endswith("A"))
print(t.endswith("et"))
print(t.endswith("eta"))
it's safer - it doesn't generate an error for an argument containing a non-existent substring (it
returns -1 then)
it works with strings only - don't try to apply it to any other sequence.
Look at the code in the editor. This is how you can use it.
The example prints:
1
-1
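The editor code is not shown in this text. A sketch that is consistent with the output above, using sample strings we assume for illustration, and that contrasts find() with index():

```python
print('Eta'.find('ta'))   # 1: index of the first match
print('Eta'.find('mma'))  # -1: no match, and no exception

# index() raises ValueError in the same situation.
try:
    'Eta'.index('mma')
except ValueError:
    print('index() raised ValueError')

# find() also locates substrings longer than one character.
position = 'theta'.find('eta')
print(position)  # 2
```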
Note: don't use find() if you only want to check if a single character occurs within a string - the in
operator is better suited to that.
Can you predict the output? Run it and check your predictions.
If you want to perform the find, not from the string's beginning, but from any position, you can use
a two-parameter variant of the find() method. Look at the example:
print('kappa'.find('a', 2))
The second argument specifies the index at which the search will be started (it doesn't have to fit
inside the string).
Among the two a letters, only the second will be found. Run the snippet and check.
You can use the find() method to search for all the substring's occurrences, like here:
txt = """A variation of the ordinary lorem ipsum text has been used in typesetting since the 1960s or
earlier, when it was popularized by advertisements for Letraset transfer sheets. It was introduced to the
Information Age in the mid-1980s by the Aldus Corporation, which employed it in graphics and word-
processing templates for its desktop publishing program PageMaker (from Wikipedia)"""
fnd = txt.find('the')
while fnd != -1:
    print(fnd)
    fnd = txt.find('the', fnd + 1)
The code prints the indices of all occurrences of the article the, and its output looks like this:
15
80
198
221
238
There is also a three-parameter variant of the find() method - the third argument points to the first
index which won't be taken into consideration during the search (it's actually the upper limit of the
search).
(a cannot be found within the given search boundaries in the second print().)
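A minimal sketch of the three-parameter find(), using the word kappa as sample data (an assumption on our side, since the original snippet is not shown):

```python
# find(sub, start, end) searches only within the range [start, end),
# but returns an index relative to the whole string.
print('kappa'.find('a', 1, 4))  # 1
print('kappa'.find('a', 2, 4))  # -1: the 'a' at index 4 is excluded
print('kappa'.find('a', 2, 5))  # 4
```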
Note: any string element that is not a digit or a letter causes the method to return False. An empty string
does, too.
The example output is:
True
True
True
False
False
False
Hint: the cause of the first result is a space - it's neither a digit nor a letter.
Look at Example 1 - its output is:
True
False
Look at Example 1 in the editor - it outputs:
False
True
Again, look at the code in the editor - Example 3 produces the following output:
False
False
True
as its name suggests, the method performs a join - it expects one argument as a list; it must be
assured that all the list's elements are strings - the method will raise a TypeError exception
otherwise;
all the list's elements will be joined into one string but...
...the string from which the method has been invoked is used as a separator, put among the
strings;
the join() method is invoked from within a string containing a comma (the string can be
arbitrarily long, or it can be empty)
Code:
# Demonstrating the join() method
print(",".join(["omicron", "pi", "rho"]))
If the string doesn't contain any upper-case characters, the method returns the original string.
Code:
# Demonstrating the lower() method
print("SiGmA=60".lower())
The parameterless lstrip() method returns a newly created string formed from the original one by
removing all leading whitespaces.
The brackets are not a part of the result - they only show the result's boundaries.
Code:
# Demonstrating the lstrip() method
print("[" + " tau ".lstrip() + "]")
The example outputs:
[tau ]
The one-parameter lstrip() method does the same as its parameterless version, but removes all
characters listed in its argument (a string), not just whitespaces:
print("www.cisco.com".lstrip("w."))
Can you guess the output of the snippet below? Think carefully. Run the code and check your
predictions.
print("pythoninstitute.org".lstrip(".org"))
Surprised? Leading characters, leading whitespaces. Again, experiment with your own examples.
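A short sketch of why the argument is treated as a set of characters rather than as a prefix. The sample string orange.org is made up for illustration.

```python
# The argument lists individual characters to strip; it is not a prefix.
print('www.cisco.com'.lstrip('w.'))  # cisco.com

# Stripping stops at the first character that is not in the argument.
stripped = 'orange.org'.lstrip('.org')
print(stripped)  # ange.org
```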
The second argument can be an empty string (replacing is actually removing, then), but the first cannot
be.
Code:
# Demonstrating the replace() method
print("www.netacad.com".replace("netacad.com", "pythoninstitute.org"))
print("This is it!".replace("is", "are"))
print("Apple juice".replace("juice", ""))
www.pythoninstitute.org
Thare are it!
Apple
The three-parameter replace() variant uses the third argument (a number) to limit the number of
replacements.
Can you guess its output? Run the code and check your guesses.
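A minimal sketch of how the third argument limits the number of replacements; the sample text is ours and is not necessarily the one used in the editor.

```python
text = 'This is it!'

print(text.replace('is', 'are'))     # Thare are it!
print(text.replace('is', 'are', 1))  # Thare is it!
print(text.replace('is', 'are', 2))  # Thare are it!
```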
Take a look at the example code in the editor and try to predict its output. Run the code to check if you
were right.
Look at the code example in the editor. Can you guess its output? Run the code to check your guesses.
The method assumes that the substrings are delimited by whitespaces - the spaces don't take part in
the operation, and aren't copied into the resulting list.
Code:
# Demonstrating the split() method
print("phi chi\npsi".split())
Look at the code in the editor. The example produces the following output:
['phi', 'chi', 'psi']
Note: the reverse operation can be performed by the join() method.
Code:
# Demonstrating the startswith() method
print("omega".startswith("meg"))
print("omega".startswith("om"))
print()
Look at the example in the editor. This is the result from it:
False
True
Code:
# Demonstrating the strip() method
print("[" + " aleph ".strip() + "]")
Look at the second example in the editor. This is the result it returns:
[aleph]
Now carry out your own experiments with the two methods.
Look at the first example in the editor. Can you guess the output? It won't look good, but you must see
it:
Code:
# Demonstrating the swapcase() method
print("I know that I know nothing.".swapcase())
print()
Output:
i KNOW THAT i KNOW NOTHING.
The title() method
The title() method performs a somewhat similar function - it changes every word's first letter to upper-
case, turning all other ones to lower-case.
Code:
# Demonstrating the title() method
print("I know that I know nothing. Part 1.".title())
print()
Look at the second example in the editor. Can you guess its output? This is the result:
I Know That I Know Nothing. Part 1.
Code:
# Demonstrating the upper() method
print("I know that I know nothing. Part 2.".upper())
Hooray! We've made it to the end of this section. Are you surprised by any of the string methods
we've discussed so far? Take a couple of minutes to review them, and let's move on to the next part of
the course where we'll show you what great things we can do with strings.
Your own split
Estimated time
20-25 minutes
Level of difficulty
Medium
Objectives
Scenario
You already know how split() works. Now we want you to prove it.
Your task is to write your own function, which behaves almost exactly like the original split() method,
i.e.:
it should return a list of words created from the string, divided in the places where the string
contains whitespaces;
Expected output
['To', 'be', 'or', 'not', 'to', 'be,', 'that', 'is', 'the',
'question']
['To', 'be', 'or', 'not', 'to', 'be,that', 'is', 'the', 'question']
[]
['abc']
[]
String in action
Comparing strings
Python's strings can be compared using the same set of operators which are in use in relation to
numbers.
Take a look at these operators - they can all compare strings, too:
==
!=
>
>=
<
<=
There is one "but" - the results of such comparisons may sometimes be a bit surprising. Don't forget that
Python is not aware (it cannot be in any way) of subtle linguistic issues - it just compares code point
values, character by character.
The results you get from such an operation are sometimes astonishing. Let's start with the simplest
cases.
Two strings are equal when they consist of the same characters in the same order. In the same fashion,
two strings are not equal when they don't consist of the same characters in the same order.
The final relation between strings is determined by comparing the first different character in both
strings (keep ASCII/UNICODE code points in mind at all times.)
When you compare two strings of different lengths and the shorter one is identical to the longer one's
beginning, the longer string is considered greater.
String comparison is always case-sensitive (upper-case letters are taken as lesser than lower-case).
The expression is True:
'beta' > 'Beta'
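The rules above can be checked with a few comparisons. This is a minimal sketch with sample strings of our own choosing:

```python
print('alpha' == 'alpha')    # True: same characters, same order
print('alpha' != 'Alpha')    # True: comparison is case-sensitive
print('alpha' < 'alphabet')  # True: the longer string is greater
print('beta' > 'Beta')       # True: ord('b') is 98, ord('B') is 66
print('Zeta' < 'alpha')      # True: upper-case Latin letters come first
```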
Using any of the remaining comparison operators will raise a TypeError exception.
The results in this case are:
False
True
False
True
TypeError exception
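The snippet that produced these results is not included in this text. Assuming the passage is about comparing a string with a number (our reading of the surrounding sentences), a sketch that yields the same sequence of results is:

```python
# == and != are allowed between a string and a number;
# a string is simply never equal to a number.
print('10' == 10)  # False
print('10' != 10)  # True
print('10' == 1)   # False
print('10' != 1)   # True

# Ordering comparisons between a string and a number are not defined.
try:
    print('10' > 10)
except TypeError as error:
    print('TypeError:', error)
```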
Sorting
Comparing is closely related to sorting (or rather, sorting is in fact a very sophisticated case of
comparing).
This is a good opportunity to show you two possible ways to sort lists containing strings. Such an
operation is very common in the real world - any time you see a list of names, goods, titles, or cities, you
expect them to be sorted.
The function takes one argument (a list) and returns a new list, filled with the sorted argument's
elements. (Note: this description is a bit simplified compared to the actual implementation - we'll
discuss it later.)
Look at the code in the editor, and run it. The snippet produces the following output:
['omega', 'alpha', 'pi', 'gamma']
['alpha', 'gamma', 'omega', 'pi']
The second method affects the list itself - no new list is created. Ordering is performed in situ by the
method named sort().
['alpha', 'gamma', 'omega', 'pi']
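A minimal sketch of both approaches, consistent with the outputs shown above. The variable names are our assumption, as the editor code is not included.

```python
# First way: sorted() returns a new list and leaves the original alone.
first_greek = ['omega', 'alpha', 'pi', 'gamma']
first_greek_sorted = sorted(first_greek)
print(first_greek)
print(first_greek_sorted)

# Second way: sort() reorders the list in place and returns None.
second_greek = ['omega', 'alpha', 'pi', 'gamma']
second_greek.sort()
print(second_greek)
```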
If you need an order other than non-descending, you have to convince the function/method to change
its default behaviors. We'll discuss it soon.
The number-string conversion is simple, as it is always possible. It's done by a function named str().
The reverse transformation (string-number) is possible when and only when the string represents a valid
number. If the condition is not met, expect a ValueError exception.
Use the int() function if you want to get an integer, and float() if you need a floating-point value.
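A short sketch of the conversions in both directions. The sample values and variable names are assumptions made for illustration.

```python
# String -> number: works only for strings that represent valid numbers.
si = '13'
sf = '1.3'
itg = int(si)
flt = float(sf)
total = itg + flt
print(total)

# Number -> string: always possible.
print(str(itg) + ' ' + str(flt))

# An invalid string raises ValueError.
try:
    int('thirteen')
except ValueError as error:
    print('ValueError:', error)
```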
14.3
In the next section, we're going to show you some simple programs that process strings.
LAB: A LED Display
Estimated time
30 minutes
Level of difficulty
Medium
Objectives
Scenario
It's a device (sometimes electronic, sometimes mechanical) designed to present one decimal digit using
a subset of seven segments. If you still don't know what it is, refer to the following Wikipedia article.
Your task is to write a program which is able to simulate the work of a seven-segment display device,
although you're going to use single LEDs instead of segments.
Each digit is constructed from 13 LEDs (some lit, some dark, of course) - that's how we imagine it:
Your code has to display any non-negative integer number entered by the user.
Tip: using a list containing patterns of all ten digits may be very helpful.
Test data
Sample input:
123
Sample output:
Sample input:
9081726354
Sample output:
Four simple programs
The first problem we want to show you is called the Caesar cipher - more details
here: https://en.wikipedia.org/wiki/Caesar_cipher.
This cipher was (probably) invented and used by Gaius Julius Caesar and his troops during the Gallic
Wars. The idea is rather simple - every letter of the message is replaced by the letter that directly
follows it in the alphabet (A becomes B, B becomes C, and so on). The only exception is Z, which becomes A.
The program in the editor is a very simple (but working) implementation of the algorithm.
it accepts Latin letters only (note: the Romans used neither whitespaces nor digits)
all letters of the message are in upper case (note: the Romans knew only capitals)
line 02: ask the user to enter the open (unencrypted), one-line message;
line 03: prepare a string for an encrypted message (empty for now)
line 07: convert the letter to upper-case (it's preferable to do it blindly, rather than check
whether it's needed or not)
line 08: get the code of the letter and increment it by one;
line 09: if the resulting code has "left" the Latin alphabet (if it's greater than the Z code)...
line 11: append the received character to the end of the encrypted message;
outputs:
BWFDBFTBS
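The editor program is not included in this text. The sketch below follows the line-by-line description given above, but wraps the logic in a function (the name caesar_encrypt is ours) instead of reading from input(). It assumes the message contains only Latin letters plus separators such as spaces.

```python
def caesar_encrypt(text):
    """Shift every Latin letter by one position; Z wraps around to A."""
    cipher = ''
    for char in text:
        if not char.isalpha():
            continue              # skip spaces, digits, and punctuation
        char = char.upper()       # convert blindly, as described above
        code = ord(char) + 1      # move to the next letter
        if code > ord('Z'):
            code = ord('A')       # the code has "left" the alphabet
        cipher += chr(code)
    return cipher


print(caesar_encrypt('ave caesar'))  # BWFDBFTBS
```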
Look at the code in the editor. Check carefully if it works. Use the cryptogram from the previous
program.
Using list comprehension may make the code slimmer. You can do that if you want.
line 03: ask the user to enter a line filled with any number of numbers (the numbers can be
floats)
line 06: as the string-float conversion may raise an exception, it's best to continue with the
protection of the try-except block;
line 08: ...and try to convert all its elements into float numbers; if it works, increase the sum;
line 11: print a diagnostic message showing the user the reason for the failure.
The code has one important weakness - it displays a bogus result when the user enters an empty line.
Can you fix it?
The IBAN Validator
The fourth program implements (in a slightly simplified form) an algorithm used by European banks to
specify account numbers. The standard named IBAN (International Bank Account Number) provides a
simple and fairly reliable method of validating the account numbers against simple typos that can occur
during rewriting of the number e.g., from paper documents, like invoices or bills, into computers.
a two-letter country code taken from the ISO 3166-1 standard (e.g., FR for France, GB for Great
Britain, DE for Germany, and so on)
two check digits used to perform the validity checks - fast and simple, but not fully reliable,
tests, showing whether a number is invalid (distorted by a typo) or seems to be good;
the actual account number (up to 30 alphanumeric characters - the length of that part depends
on the country)
The standard says that validation requires the following steps (according to Wikipedia):
(step 1) Check that the total IBAN length is correct as per the country (this program won't do
that, but you can modify the code to meet this requirement if you wish; note: you have to teach
the code all the lengths used in Europe)
(step 2) Move the four initial characters to the end of the string (i.e., the country code and the
check digits)
(step 3) Replace each letter in the string with two digits, thereby expanding the string, where A =
10, B = 11 ... Z = 35;
(step 4) Interpret the string as a decimal integer and compute the remainder of that number on
division by 97. If the remainder is 1, the check digit test is passed and the IBAN might be valid.
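Steps 2 to 4 can be sketched as a small function. This is not the program from the editor, only one possible illustration: the name iban_checksum_ok is hypothetical, the per-country length check from step 1 is skipped, and the input is assumed to contain only ASCII letters, digits, and spaces.

```python
def iban_checksum_ok(iban):
    """Apply steps 2-4 to an IBAN; the per-country length check is skipped."""
    iban = iban.replace(' ', '').upper()
    if not iban.isalnum():
        return False
    # Step 2: move the four initial characters to the end.
    rearranged = iban[4:] + iban[:4]
    # Step 3: replace each letter with two digits (A = 10 ... Z = 35).
    digits = ''
    for char in rearranged:
        if char.isdigit():
            digits += char
        else:
            digits += str(10 + ord(char) - ord('A'))
    # Step 4: the remainder of division by 97 must be 1.
    return int(digits) % 97 == 1


print(iban_checksum_ok('DE02 1001 0010 0152 5171 08'))  # True
print(iban_checksum_ok('DE02 1001 0010 0152 5171 09'))  # False
```

Python integers have arbitrary precision, so the long decimal number built in step 3 can be converted and divided directly.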
line 03: ask the user to enter the IBAN (the number can contain spaces, as they significantly
improve number readability)...
line 05: the entered IBAN must consist of digits and letters only - if it doesn't...
line 07: the IBAN mustn't be shorter than 15 characters (this is the shortest variant, used in
Norway)
line 09: moreover, the IBAN cannot be longer than 31 characters (this is the longest variant,
used in Malta)
line 10: if it is longer, make an announcement;
line 12: move the four initial characters to the number's end, and convert all letters to upper
case (step 02 of the algorithm)
line 13: this is the variable used to complete the number, created by replacing the letters with
digits (according to the algorithm's step 03)
line 18: ...convert it into two digits (note the way it's done here)
line 19: the converted form of the IBAN is ready - make an integer out of it;
Let's add some test data (all these numbers are valid - you can invalidate them by changing any
character).
German: DE02100100100152517108
If you are an EU resident, you can use your own account number for tests.
LAB: Improving the Caesar cipher
Estimated time
30-45 minutes
Level of difficulty
Hard
Pre-requisites
Objectives
Scenario
You are already familiar with the Caesar cipher, and this is why we want you to improve the code we
showed you recently.
The original Caesar cipher shifts each character by one: a becomes b, z becomes a, and so on. Let's make
it a bit harder, and allow the shifted value to come from the range 1..25 inclusive.
Moreover, let the code preserve the letters' case (lower-case letters will remain lower-case) and all non-
alphabetical characters should remain untouched.
asks the user for a shift value (an integer number from the range 1..25); note: you should force
the user to enter a valid shift value (don't give up and don't let bad data fool you!)
Test data
Sample input:
abcxyzABCxyz 123
2
Sample output:
cdezabCDEzab 123
Sample input:
The die is cast
25
Sample output:
Sgd chd hr bzrs
LAB: Palindromes
Estimated time
10-15 minutes
Level of difficulty
Easy
Objectives
Scenario
It's a word which looks the same when read forward and backward. For example, "kayak" is a
palindrome, while "loyal" is not.
Note:
spaces are not taken into account during the check - treat them as non-existent;
there are more than a few correct solutions - try to find more than one.
Test data
Sample input:
Ten animals I slam in a net
It's a palindrome
Sample input:
Eleven animals I slam in a net
It's not a palindrome
LAB: Anagrams
Estimated time
10-15 minutes
Level of difficulty
Easy
Objectives
Scenario
An anagram is a new word formed by rearranging the letters of a word, using all the original letters
exactly once. For example, the phrases "rail safety" and "fairy tales" are anagrams, while "I am" and
"You are" are not.
checks whether the entered texts are anagrams and prints the result.
Note:
spaces are not taken into account during the check - treat them as non-existent
Test data
Sample input:
Listen
Silent
Anagrams
Sample input:
modern
norman
Not anagrams
LAB: The Digit of Life
Estimated time
10-15 minutes
Level of difficulty
Easy
Objectives
Scenario
Some say that the Digit of Life is a digit evaluated using somebody's birthday. It's simple - you just need
to sum all the digits of the date. If the result contains more than one digit, you have to repeat the
addition until you get exactly one digit. For example:
2 + 0 + 1 + 7 + 0 + 1 + 0 + 1 = 12
1+2=3
asks the user for her/his birthday (in the format YYYYMMDD, or YYYYDDMM, or MMDDYYYY -
actually, the order of the digits doesn't matter)
Test data
Sample input:
19991229
Sample output:
6
Sample input:
20000101
Sample output:
4
LAB: Find a word!
Estimated time
15-20 minutes
Level of difficulty
Medium
Objectives
Scenario
Let's play a game. We will give you two strings: one being a word (e.g., "dog") and the second being a
combination of any characters.
Your task is to write a program which answers the following question: are the characters comprising the
first string hidden inside the second string?
For example:
if the second string is "vcxzxdcybfdstbywuefsas", the answer is no (as the letters "d", "o", and
"g" do not all appear in it in this order)
Hints:
you should use the two-argument variant of the find() method inside your code;
Test data
Sample input:
donor
Nabucodonosor
Sample output:
Yes
Sample input:
donut
Nabucodonosor
Sample output:
No
LAB: Sudoku
Estimated time
60 minutes
Level of difficulty
Hard
Objectives
Scenario
As you probably know, Sudoku is a number-placing puzzle played on a 9x9 board. The player has to fill
the board in a very specific way:
each row of the board must contain all digits from 1 to 9 (the order doesn't matter)
each column of the board must contain all digits from 1 to 9 (again, the order doesn't matter)
each of the nine 3x3 "tiles" (we will name them "sub-squares") of the table must contain all
digits from 1 to 9.
reads 9 rows of the Sudoku, each containing 9 digits (check carefully if the data entered are
valid)
Test data
Sample input:
295743861
431865927
876192543
387459216
612387495
549216738
763524189
928671354
154938672
Sample output:
Yes
Sample input:
195743862
431865927
876192543
387459216
612387495
549216738
763524189
928671354
254938671
Sample output:
No