IBM PY0101EN - Python Basics For Data Science

PYTHON BASICS
FOR DATA SCIENCE
Author
JOSEPH SANTARCANGELO
IBM Senior Data Scientist
Compiled By
OSVALDO ALENCAR
Brasilware Engineering
brasilware@gmail.com
IBM PY0101EN
This course is available at edx.org

SUMMARY
SUMMARY ................................................................................................................................................... 3
ABOUT THE AUTHOR ................................................................................................................................. 8
WELCOME TO PYTHON BASICS FOR DATA SCIENCE! ............................................................................... 9
ABOUT THIS COURSE ................................................................................................................................. 9
General Information .................................................................................................................................. 9
Note for Learners Auditing the Course ............................................................................................ 9
Learning Objectives ................................................................................................................................ 10
Syllabus .................................................................................................................................................. 11
Module 1: Python Basics ............................................................................................................. 11
Module 2: Python Data Structures ................................................................................................ 11
Module 3: Python Programming Fundamentals ........................................................................... 11
Module 4: Working with Data in Python .................................................................................... 12
Module 5: APIs and Data Collection............................................................................................. 12
Final Exam ................................................................................................................................... 12
Python Code Quick Reference Guide ............................................................................................ 12
Table of Videos ............................................................................................................................ 13
Hands-On Lab ............................................................................................................................. 14
COURSE INTRODUCTION .......................................................................................................................... 15
Video 001: Course Introduction (1:45) .................................................................................................... 15
Grading Scheme ..................................................................................................................................... 19
Change Log ............................................................................................................................................. 19
MODULE 1: PYTHON BASICS..................................................................................................................... 20
Introduction to Python ............................................................................................................................. 20
Learning Objectives ................................................................................................................................. 20
Video 002: Introduction to Python (4:14) ................................................................................................. 21
What you will learn..................................................................................................................... 21
Who is Python For ........................................................................................................................ 22
What makes Python great .......................................................................................................... 23
Diversity and inclusion efforts ...................................................................................................... 23
Summary ..................................................................................................................................... 24
Video 003: Getting Started With Jupyter (4:01) ...................................................................................... 25
What you will learn ................................................................................................................... 25
Launching the notebook ............................................................................................................... 26
Hands-on Lab: Write Your First Program ....................................................................................... 27
Recap .......................................................................................................................................... 32
Video 004: Types (3:02) .......................................................................................................................... 37
Data types ................................................................................................................................... 37
Hands-on Lab: Types ................................................................................................................... 41
Video 005: Expressions and Variables (3:55) .......................................................................................... 47
Expressions ................................................................................................................................. 47
Variables...................................................................................................................................... 49
Hands-on Lab: Expression and Variables ...................................................................................... 52
Practice Quiz: Expressions and Variables ....................................................................................... 56
Video 006: String Operations (3:58) ....................................................................................................... 57

String Methods ............................................................................................................................ 62
Hands-On Lab: String Operations................................................................................................. 65
Practice Quiz: String Operations ................................................................................................... 77
Graded Quiz: String Operations ..................................................................................................... 77
Module 1 Summary: Python Basics ......................................................................................................... 78
Module 1 Cheat Sheet: Python Basics ...................................................................................................... 79
Graded Quiz: Python Basics .................................................................................................................... 80
Glossary – Python Basics ........................................................................................................................ 81
MODULE 2: PYTHON DATA STRUCTURES ................................................................................................ 84
Introduction............................................................................................................................................. 84
Learning Objectives ................................................................................................................................ 84
Video 007: Lists and Tuples (8:51) .......................................................................................................... 85
Tuples .......................................................................................................................................... 85
Lists ............................................................................................................................................. 91
The help command ...................................................................................................................... 97
Hands-On Lab: Lists ..................................................................................................................... 98
Hands-On Lab: Tuples................................................................................................................ 105
Cheat Sheet: Lists and Tuples ..................................................................................................... 113
Practice Quiz: Lists and Tuples .................................................................................................... 119
Video 008: Dictionaries (2:25) ............................................................................................................... 120
Hands-On Lab: Dictionaries ........................................................................................................ 126
Practice Quiz: Dictionaries .......................................................................................................... 132
Video 009: Sets (5:17) .......................................................................................................................... 133
Sets ........................................................................................................................................... 133
Hands-On Lab: Sets ................................................................................................................... 141
Practice Quiz: Sets ..................................................................................................................... 148
Discussion Prompt: Python Data Structures ............................................................................... 148
Module 2 Summary: Python Data Structures ......................................................................................... 149
Cheat Sheet: Dictionaries and Sets ........................................................................................................ 151
Cheat Sheet: Python Data structure Part-2 ................................................................................ 151
Graded Quiz: Python Data Structures .................................................................................................... 153
Reading: Glossary: Python Data Structures .......................................................................................... 154
MODULE 3: PYTHON PROGRAMMING FUNDAMENTALS ........................................................................ 156
Module Introduction and Learning Objectives ....................................................................................... 156
Video 010: Conditions and Branching (10:17) ....................................................................................... 157
Comparison Operators ............................................................................................................... 157
Branching .................................................................................................................................. 162
Logical Operators ....................................................................................................................... 167
Hands-on Lab: Conditions and Branching .................................................................................. 172
Reading: Conditions and Branching ............................................................................................ 183
Practice Quiz: Conditions and Branching ..................................................................................... 186
Video 011: Loops (6:45) ........................................................................................................................ 187
For loops.................................................................................................................................... 188
While Loops ............................................................................................................................. 195

Reading: Loops .......................................................................................................................... 200
Hands-on Lab: Loops ................................................................................................................. 205
Practice Quiz: Loops ................................................................................................................... 213
Video 012: Functions (13:31) ................................................................................................................ 215
Python Built-in Functions ........................................................................................................... 221
Making Functions ...................................................................................................................... 224
Reading: Exploring Python Functions ......................................................................................... 234
Hands-on Lab: Functions ........................................................................................................... 242
Practice Quiz: Functions ............................................................................................................. 255
Video 013: Exception Handling (3:49) ................................................................................................... 257
What you will learn .................................................................................................................. 257
Introduction ............................................................................................................................... 257
Recap ........................................................................................................................................ 262
Reading: Exception Handling ..................................................................................................... 263
Hands-On Lab: Exception Handling ............................................................................................ 267
Practice Quiz: Exception Handling .............................................................................................. 275
Video 014: Objects and Classes (10:52)................................................................................................ 276
Built-in Types in Python ............................................................................................................. 276
Objects: Types ........................................................................................................................... 277
Methods .................................................................................................................................... 278
Classes ...................................................................................................................................... 279
Methods .................................................................................................................................... 288
Reading: Objects and Classes..................................................................................................... 294
Hands-on Lab: Objects and Classes ........................................................................................... 300
Practice Quiz: Objects and Classes ............................................................................................. 309
Practice Lab: Text Analysis ......................................................................................................... 312
Module 3 Summary: Python Programming Fundamentals ..................................................................... 317
Reading: Cheat Sheet - Python Programming Fundamentals ................................................................ 319
Reading: Glossary: Python Programming Fundamentals.......................................................................... 327
MODULE 4: WORKING WITH DATA IN PYTHON .................................................................................... 330
Learning Objectives .............................................................................................................................. 330
Video 015: Reading Files with Open (3:43) .......................................................................................... 331
Hands-On Lab: Reading Files with Open ................................................................................... 340
Video 016: Writing Files with Open (2:54) ........................................................................................... 346
Hands-On Lab: Writing Files with Open .................................................................................... 357
Practice Quiz: Reading and Writing Files with Open ................................................................... 365
Video 017: Pandas: Loading Data (4:51) .............................................................................................. 366
Importing Pandas ...................................................................................................................... 366
Dataframes ................................................................................................................................ 367
Using Dataframes ...................................................................................................................... 372
Working With Dataframes ......................................................................................................... 377
Video 018: Pandas: Working with and Saving Data (2:06) ................................................................... 381
List Unique Values ..................................................................................................................... 381
Extracting .................................................................................................................................. 382

Creating a Database .................................................................................................................. 383
Save as CSV .............................................................................................................................. 386
Practice Lab: Selecting Data in a DataFrame .............................................................................. 387
Hands on Lab: Pandas ............................................................................................................... 393
Practice Quiz: Pandas ................................................................................................................. 401
Video 019: One Dimensional NumPy (11:23) ....................................................................................... 403
Objectives .................................................................................................................................. 403
The basics and Array Creation ................................................................................................... 404
Indexing and Slicing ................................................................................................................... 408
Basic Operations ........................................................................................................................ 413
Hands-On Lab: One Dimensional Numpy .................................................................................. 424
Video 020: 2-Dimensional NumPy Arrays (7:13).................................................................................. 441
Table of Contents....................................................................................................................... 441
Hands-On Lab: Two Dimensional Numpy .................................................................................. 460
Reading: Some Context on APIs................................................................................................. 466
Practice Quiz: Numpy in Python ................................................................................................. 468
Module 4 Summary: Working with Data in Python..................................................................... 469
Working with Data in Python Cheat Sheet ................................................................................. 471
Reading: Glossary: Working with Data in Python ....................................................................... 475
MODULE 5: APIS AND DATA COLLECTION ........................................................................................... 477
Module Introduction and Learning Objectives ...................................................................................... 477
Learning Objectives .............................................................................................................................. 477
Video 021: Application Program Interface (5:12).................................................................................. 478
Hands-On Lab: Introduction to API ............................................................................................ 487
Practice Quiz: Simple APIs ......................................................................................................... 491
Video 022: REST APIs & HTTP Requests - Part 1 (4:11) ...................................................................... 492
Outline ...................................................................................................................................... 492
Uniform Resource Locator: URL ................................................................................................. 494
Video 023: REST APIs & HTTP Requests - Part 2 (4:56) ...................................................................... 499
Outline ...................................................................................................................................... 499
Requests Module in Python ....................................................................................................... 499
Get Request With URL Parameters ............................................................................................ 501
Post Requests ............................................................................................................................ 504
Compare POST and GET ............................................................................................................ 505
Hands-on Lab: Access REST APIs & Request HTTP ................................................................... 506
Video 024 (Optional): HTML for Web Scraping (5:00) ......................................................................... 513
Video 025: Web Scraping (4:59) .......................................................................................................... 527
What You Will Learn ................................................................................................................. 527
Introduction ............................................................................................................................... 528
What is Web Scraping? ............................................................................................................. 529
Hands-on Lab: Web Scraping .................................................................................................... 543
Video 026: Working with Different File Formats (4:11) ........................................................................ 553
What you will learn ................................................................................................................... 553
Introduction ............................................................................................................................... 553

Understanding File Formats ....................................................................................................... 554
Python Pandas Library ............................................................................................................... 554
Hands-on Lab: Working with Different File Formats .................................................................. 558
Module 5 Summary: APIs and Data Collection ..................................................................................... 578
Cheat Sheet: APIs and Data Collection ................................................................................................. 580
Reading: Glossary: APIs and Data Collection ....................................................................................... 583
Final Exam ................................................................................................................................................ 587
Instructions ........................................................................................................................................... 587
Final Exam ............................................................................................................................................ 587
Course Wrap Up....................................................................................................................................... 588
Congratulations and Next Steps ........................................................................................................... 588
Congratulations! ................................................................................................................................... 589
Credits and Acknowledgments ................................................................................................................. 590

ABOUT THE AUTHOR
Joseph Santarcangelo has a PhD in Electrical Engineering. His research focused on using machine
learning, signal processing, and computer vision to determine how videos impact human
cognition. Joseph has been working for IBM since he completed his PhD.

WELCOME TO PYTHON BASICS FOR DATA SCIENCE!
We are glad you are taking this course! The team spent a lot of time developing this Python Basics
for Data Science course so everyone can learn to program with this incredibly powerful language.
As you may know, Python has become one of the top languages used not only by developers, but
also by data scientists, data engineers, and researchers alike. It lends itself well to creating
applications and analyzing big data.
So, we hope you enjoy this beginner's course in Python for Data Science.
ABOUT THIS COURSE
GENERAL INFORMATION
Kickstart your learning of Python for data science, as well as programming in general, with this
beginner-friendly introduction to Python.
Python is one of the world’s most popular programming languages, and there has never been
greater demand for professionals with the ability to apply Python fundamentals to drive business
solutions across industries.
This course will take you from zero to programming in Python in a matter of hours, with no prior
programming experience necessary! You will learn Python fundamentals, including data
structures and data analysis, complete hands-on exercises throughout the course modules, and
create a final project to demonstrate your new skills.
By the end of this course, you’ll feel comfortable creating basic programs, working with data, and
solving real-world problems in Python. You’ll gain a strong foundation for more advanced
learning in the field and develop skills to help advance your career.
NOTE FOR LEARNERS AUDITING THE COURSE

Please note that the cloud-based environment that you used in the previous lab is available to paid
learners only. However, you can download the lab notebooks and run them in your own JupyterLab
environment. If you want to use the cloud-based lab environment, you can upgrade to a paid course
subscription. This will also allow you to obtain a course certificate!
This course can be applied to multiple Specialization or Professional Certificate programs.
Completing this course will count towards your learning in any of the following programs:
• Python Data Science Professional Certificate
• IBM Data Science Professional Certificate
• IBM Data Analyst Professional Certificate
• Full Stack Developer Professional Certificate
• Data Engineering Fundamentals Professional Certificate
Upon completion of any of the above programs, in addition to earning a Specialization
completion certificate, you’ll also receive a digital badge from IBM recognizing your
expertise in the field.

LEARNING OBJECTIVES
The objective of this course is to get you started with Python as a programming language and
give you a taste of how to start working with data in Python.
In this course you will learn about:
• What Python is and why it is useful
• The applications of Python
• How to define variables
• Sets and conditional statements in Python
• The purpose of having functions in Python
• How to operate on files to read and write data in Python
• How to use pandas, a must-have package for anyone attempting data analysis in Python
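As a rough preview of several of these objectives, the sketch below touches variables, sets, a conditional statement, a function, and file reading and writing. It is only an illustration, not course material: the names course, topics, describe, and course_summary.txt are hypothetical and were chosen for this example.

```python
import os
import tempfile

# Variables: names bound to values of different types
course = "Python Basics for Data Science"
modules = 5

# Sets keep only unique items, so the repeated "sets" is stored once
topics = {"types", "strings", "lists", "sets", "sets"}

# Conditional statement: choose a message based on set membership
if "sets" in topics:
    message = "sets are covered"
else:
    message = "sets are not covered"


# Function: a reusable block of code (hypothetical helper)
def describe(name, count):
    """Return a one-line summary of a course."""
    return f"{name} has {count} modules"


summary = describe(course, modules)

# Files: write the summary to a temporary file, then read it back
path = os.path.join(tempfile.gettempdir(), "course_summary.txt")
with open(path, "w") as f:
    f.write(summary)
with open(path) as f:
    text = f.read()

print(message)
print(text)
```

Each of these ideas is introduced step by step in the modules that follow; pandas is left out here so that the sketch runs with the standard library alone.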

SYLLABUS
MODULE 1: PYTHON BASICS

• Types
o Practice Quiz
• Expressions & Variables
o Hands-On Lab: Your First Program, Types, Expressions, and Variables
o Practice Quiz
• String Operations
o Hands-On Lab: Strings
o Practice Quiz
o Graded Quiz
MODULE 2: PYTHON DATA STRUCTURES
• Lists and Tuples
o Hands-On Lab: Lists
o Hands-On Lab: Tuples
o Practice Quiz
• Dictionaries
o Hands-On Lab: Dictionaries
o Practice Quiz
• Sets
o Hands-On Lab: Sets
o Practice Quiz
• Graded Quiz
MODULE 3: PYTHON PROGRAMMING FUNDAMENTALS
• Conditions and Branching
o Hands-On Lab: Conditions and Branching
o Practice Quiz
• Loops
o Hands-On Lab: Loops
o Practice Quiz
• Functions
o Hands-On Lab: Functions
o Practice Quiz
• Exception Handling
o Hands-On Lab: Exception Handling
o Practice Quiz
• Objects and Classes

o Hands-On Lab: Objects and Classes
o Practice Quiz
• Graded Quiz
MODULE 4: WORKING WITH DATA IN PYTHON
• Reading Files with Open
o Hands-On Lab: Reading Files with Open
• Writing Files with Open
o Hands-On Lab: Writing Files with Open
• Practice Quiz
• Loading Data with Pandas
o Pandas: Working with and Saving Data
o Hands-On Lab: Pandas with IBM Watson Studio
o Practice Quiz
• One Dimensional Numpy
o Hands-On Lab: One Dimensional Numpy
• Two Dimensional Numpy
o Hands-On Lab: Two Dimensional Numpy
o Practice Quiz
• Graded Quiz
MODULE 5: APIS AND DATA COLLECTION

• Simple APIs - Part 1
o Simple APIs - Part 2
o Hands-On Lab: Introduction to API
o Practice Quiz
• REST APIs & HTTP Requests - Part 1
o REST APIs & HTTP Requests - Part 2
o Hands-on Lab: Access REST APIs & Request HTTP
o Hands-on Lab: API Examples
• Optional: HTML for Webscraping
o Webscraping
o Hands-on Lab: Webscraping
• Working with different file formats (csv, xml, json, xlsx)
o Hands-on Lab: Working with different file formats
o Practice Quiz
• Graded Quiz
FINAL EXAM
PYTHON CODE QUICK REFERENCE GUIDE

TABLE OF VIDEOS
Module  Topic                            Video  Subject                               Duration  Total
Intro   Course Introduction              01     Course Introduction                   01:45     0:01:45
1       Python Basics                    02     Introduction to Python                04:14     0:19:10
                                         03     Getting Started With Jupyter          04:01
                                         04     Data Types                            03:02
                                         05     Expressions and Variables             03:55
                                         06     String Operations                     03:58
2       Python Data Structures           07     Lists and Tuples                      08:51     0:16:33
                                         08     Dictionaries                          02:25
                                         09     Sets                                  05:17
3       Python Programming Fundamentals  10     Conditions and Branching              10:17     0:45:14
                                         11     Loops                                 06:45
                                         12     Functions                             13:31
                                         13     Exception Handling                    03:49
                                         14     Objects and Classes                   10:52
4       Working With Data In Python      15     Reading Files With Open               03:43     0:32:10
                                         16     Writing Files With Open               02:54
                                         17     Pandas: Loading Data                  04:51
                                         18     Pandas: Working With and Saving Data  02:06
                                         19     One Dimensional Numpy                 11:23
                                         20     2-Dimensional Numpy Arrays            07:13
5       APIs and Data Collection         21     Application Program Interface         05:12     0:28:29
                                         22     REST APIs & HTTP Requests - Part 1    04:11
                                         23     REST APIs & HTTP Requests - Part 2    04:56
                                         24     HTML for Web Scraping                 05:00
                                         25     Web Scraping                          04:59
                                         26     Working With Different File Formats   04:11
TOTAL TIME                                                                            2:23:21   2:23:21
Note: Videos available at edx.org

HANDS-ON LAB
Module | Lab | Video | Subject | Time (min) | Total Time (h:m)
1 | ✓ | 3 | Writing Your First Python Code | 10 | 0:45
1 | ✓ | 4 | Working with Types in Python | 10 |
1 | ✓ | 5 | Working with Variables and Expressions in Python | 10 |
1 | ✓ | 6 | String Operations | 15 |
2 | ✓ | 7 | Lists in Python | 15 | 1:15
2 | ✓ | 7 | Tuples in Python | 15 |
2 | ✓ | 8 | Dictionaries in Python | 25 |
2 | ✓ | 9 | Sets in Python | 20 |
3 | ✓ | 10 | Conditions in Python | 20 | 3:55
3 |   | 10 | Reading: Conditions and Branching | 10 |
3 |   | 11 | Introduction to Loops in Python | 10 |
3 | ✓ | 11 | Loops in Python | 20 |
3 |   | 12 | Reading: Exploring Python Functions | 15 |
3 | ✓ | 12 | Functions in Python | 40 |
3 |   | 13 | Reading: Exception Handling | 10 |
3 | ✓ | 13 | Exception Handling | 15 |
3 |   | 14 | Reading: Objects and Classes | 10 |
3 | ✓ | 14 | Hands-on Lab: Objects and Classes | 40 |
3 | ✓ | 14 | Practice Lab: Text Analysis | 45 |
4 | ✓ | 15 | Hands-On Lab: Reading Files with Open | 40 | 2:45
4 | ✓ | 16 | Hands-On Lab: Writing Files with Open | 25 |
4 | ✓ | 18 | Practice Lab: Selecting Data in a DataFrame | 15 |
4 | ✓ | 18 | Hands on Lab: Pandas | 15 |
4 | ✓ | 19 | Hands-On Lab: One Dimensional Numpy | 40 |
4 | ✓ | 20 | Hands-On Lab: Two Dimensional Numpy | 30 |
5 | ✓ | 21 | Hands-On Lab: Introduction to API | 15 | 2:00
5 | ✓ | 23 | Hands-on Lab: Access REST APIs & Request HTTP | 15 |
5 | ✓ | 25 | Hands-on Lab: Web Scraping | 30 |
5 | ✓ | 26 | Hands-on Lab: Working with Different File Formats | 30 |
5 | ✓ | 26 | Practice Project: GDP Data Extraction and Processing | 30 |
Total Time (h:m) | | | | | 10:40


COURSE INTRODUCTION
VIDEO 001: COURSE INTRODUCTION (1:45)

Hello. I'm Joseph and I will be your instructor for this course. You made the right choice. If there’s
just one programming language I had to learn for data science and AI, it would unquestionably be
Python.
Figure 1
The best part is Python is super easy to learn and is often one of the first languages people turn to,
when trying to learn to code. Python is very powerful. It has a huge ecosystem of libraries that will
help you get the most complex things done with just a few lines of code.
Figure 2

Python is great for everything, from data analysis and web scraping to working with big data,
finance, computer vision, natural language processing, machine learning, deep learning and much
more. Python can do anything you can throw at it.
Figure 3
This course is designed for beginners, but if you know how to program, you can also take this
course and quickly learn Python.
Figure 4

In Module 1 you will learn Python basics, including types, expressions, variables and string
operations.
Figure 5
In module 2 you will cover Python data structures including lists, tuples, dictionaries, and sets.
Figure 6
In Module 3 I will teach you Python programming fundamentals such as conditions, branching,
loops, functions and objects and classes.
Figure 7

In Module 4 I will teach you how to work with data, including loading data with Python’s built-in
functions and using popular libraries such as NumPy and Pandas, followed by application
programming interfaces, or APIs for short.
Figure 8
You will apply what you learn by doing projects using real-world datasets.
Figure 9
If you have any questions or require clarification, feel free to post on the course discussion
forum.
Good luck and happy learning.

GRADING SCHEME
1. The minimum passing mark for the course is 70% with the following weights:
• 50% - Graded Quizzes
• 50% - Final Exam
2. Though the Graded Quizzes and the Final Exam have respective weightage, the only grade
that matters is the overall grade for the course.
3. Graded Quizzes have no time limit. You are encouraged to review the course material to find
the answers. Please remember that the Graded Quizzes are worth 50% of your final mark.
4. The final exam has a 1-hour time limit.
5. Attempts are per question in both the Review Questions and the Final Exam:
• One attempt – For True/False questions
• Two attempts – For any question other than True/False
• There are no penalties for incorrect attempts.
6. Check your grades in the course at any time by clicking on the “Progress” tab.
CHANGE LOG
2024-01-30
• Updated version of the course published on edX.org
• Improved the practice and graded quizzes to more thoroughly assess the knowledge and
skills gained in the course.
2020-09-01
• Updated version of the course published on edX.org
• Replaced links to labs with links from SN Asset Library
2019-03-25
• Assignment added
2017-12-01
• Notebooks slightly changed J.S
2017-12-01
• Notebook order changed J.S
2017-11-12
• Rephrased some sentences for clarity. Fixed some typos.
2017-09-15
• Newly renovated course released

MODULE 1: PYTHON BASICS
INTRODUCTION TO PYTHON
This module teaches the basics of Python and begins by exploring some of the different data
types such as integers, real numbers, and strings. Continue with the module and learn how to use
expressions in mathematical operations, store values in variables, and the many different ways
to manipulate strings.
LEARNING OBJECTIVES
In this module, you will:
• Demonstrate an understanding of types in Python by converting or casting data types
such as strings, floats, and integers.
• Interpret variables and solve expressions by applying mathematical operations.
• Describe how to manipulate strings by using a variety of methods and operations.
• Build a program in JupyterLab to demonstrate your knowledge of types, expressions,
and variables.
• Work with, manipulate, and perform operations on strings in Python.

VIDEO 002: INTRODUCTION TO PYTHON (4:14)
Figure 10
WHAT YOU WILL LEARN

Welcome to “Introduction to Python”. After watching the video, you will be able to
• Identify the users of Python.
• List the benefits of using Python.
• Describe the diversity and inclusion efforts of the Python community.
Figure 11
Python is a powerhouse of a language. It is the most widely used and most popular programming
language used in the data science industry.
According to the 2019 Kaggle Data Science and Machine Learning Survey, ¾ of the over 10,000
respondents worldwide reported that they use Python regularly. Glassdoor reported that in 2019
more than 75% of data science positions listed included Python in their job descriptions. When
asked which language an aspiring data scientist should learn first, most data scientists say
Python.

Figure 12
WHO IS PYTHON FOR

Let’s start with the people who use Python. If you already know how to program, then Python is
great for you because it uses clear and readable syntax. You can develop the same programs as in
other languages with less code using Python.
For beginners, Python is a good language to start with because of the huge global community and
wealth of documentation.
Several different surveys done in 2019 established that over 80% of data professionals use
Python worldwide.
Python is useful in many areas including data science, AI and machine learning, web
development, and Internet of Things (IoT) devices, like the Raspberry Pi.
Large organizations that heavily use Python include IBM, Wikipedia, Google, Yahoo!, CERN,
NASA, Facebook, Amazon, Instagram, Spotify, and Reddit.
Python is widely supported by a global community and shepherded by the Python Software
Foundation.
Figure 13

WHAT MAKES PYTHON GREAT
Python is a high-level, general-purpose programming language that can be applied to many
different classes of problems.
It has a large, standard library that provides tools suited to many different tasks including but not
limited to Databases, Automation, Web scraping, Text processing, Image processing, Machine
learning, and Data analytics.
For data science, you can use Python's scientific computing libraries like Pandas, NumPy, SciPy,
and Matplotlib.
For artificial intelligence, it has TensorFlow, PyTorch, Keras, and Scikit-learn.
Python can also be used for Natural Language Processing (NLP) using the Natural Language
Toolkit (NLTK).
Figure 14
DIVERSITY AND INCLUSION EFFORTS

Another great selling point for Python is that the Python community has a well-documented
history of paving the way for diversity and inclusion efforts in the tech industry as a whole.
The Python language has a code of conduct executed by the Python Software Foundation that
seeks to ensure safety and inclusion for all, in both online and in-person Python communities.
Figure 15

Communities like PyLadies seek to create spaces for people interested in learning Python in safe
and inclusive environments. PyLadies is an international mentorship group with a focus on
helping more women become active participants and leaders in the Python open-source
community.
Figure 16
SUMMARY
In the video, you learned that:
• Python uses clear and readable syntax.
• Python has a huge global community and a wealth of documentation.
• For data science, you can use Python's scientific computing libraries like Pandas,
NumPy, SciPy, and Matplotlib.
• Python can also be used for Natural Language Processing (NLP) using the Natural
Language Toolkit (NLTK).
• The Python community has a well-documented history of paving the way for diversity
and inclusion efforts in the tech industry as a whole.
Figure 17

VIDEO 003: GETTING STARTED WITH JUPYTER (4:01)
Figure 18
WHAT YOU WILL LEARN

Welcome to “Getting started with Jupyter.” After watching the video, you will be able to:
• Describe how to run, insert, and delete a cell in a notebook.
• Work with multiple notebooks.
• Present the notebook.
• Shut down the notebook session.
Figure 19
In the lab session of this module, you can launch a notebook using the Skills Network virtual
environment. After selecting the check box, click the Open tool tab, and the environment will open
the Jupyter Lab.

LAUNCHING THE NOTEBOOK
Here you see the open notebook.
Figure 20
On opening the notebook, you can change the name of the notebook.
Figure 21
Click File, then click Rename Notebook to give the required name. And you can now start working
on your new notebook.

HANDS-ON LAB: WRITE YOUR FIRST PROGRAM
Skills Network Labs (SN Labs) is a virtual lab environment used in this course. Upon clicking the
"Start Lab" button, you will be prompted to have your Username and Email passed to Skills
Network Labs in accordance with the IBM Skills Network privacy policy. These details will only be
used for communicating important information to enhance your learning experience.
SN Labs provides a Cloud-based virtual lab environment to help you complete hands-on labs for
this course without the need to download, install and configure software like Jupyter on your
computer. However, if you are having issues using SN Labs, or prefer to use JupyterLab on your
own computer you can download the notebook file for this lab or open the lab by clicking this
link.
In the new notebook, print “hello world”.
Figure 22
Figure 23

Figure 24
Then click the Run button to show that the environment is giving the correct output. On the main
menu bar at the top, click Run. In the drop-down menu, click Run Selected Cells to run the currently
highlighted cells. Alternatively, you can use the keyboard shortcut Shift + Enter.
Figure 25
In case you have multiple code cells, click Run All cells to run the code in all the cells.
Figure 26

You can add code by inserting a new cell. To add a new cell, click the plus symbol in the toolbar.
Figure 27
In addition, you can delete a cell. Highlight the cell and on the main menu bar, click Edit, and then
click Delete Cells. Alternatively, you can use a shortcut by pressing D twice on the highlighted
cell.
Figure 28

Also, you can move the cells up or down as required.
Figure 29
So, now you have learned to work with a single notebook. Next, let’s learn to work with multiple
notebooks.
Figure 30
Click the plus button on the toolbar and select the file you want to open. Another notebook will
open.
Figure 31

Alternatively, you can click File on the menu bar and click Open a new launcher or Open a new
notebook.
Figure 32
And when you open new files, you can move them around. For example, as shown, you can
place the notebooks side by side.
Figure 33
In one notebook, you can assign the number 1 to the variable one and the number 2 to the variable
two, and then you can print the result of adding one and two.
Figure 34
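Since the figure is not reproduced in this text, here is a minimal sketch of the cell just described. The variable names one, two, and total are placeholders chosen for illustration and may differ from those in the video.

```python
# Assign the number 1 to the variable one and the number 2 to the variable two
one = 1
two = 2

# Add the two variables and print the result
total = one + two
print(total)  # prints 3
```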

As a data scientist, it is important to communicate your results. Jupyter supports presenting
results directly from the notebooks. You can create a Markdown to add titles and text descriptions
to help with the flow of the presentation. To add markdown, click Code and select Markdown.
You can create line plots and convert each cell and output into a slide or sub-slide in the form of
a presentation. The slides functionality in Jupyter allows you to deliver code, visualization, text,
and outputs of the executed code as part of a project.
Figure 35
Now, when you have completed working with your notebook or notebooks, you can shut them
down. Shutting down notebooks releases their memory. Click the stop icon on the sidebar; it is the
second icon from the top. You can terminate all sessions at once or shut them down individually.
After you shut down the notebook session, you will see “no kernel” at the top right. This
confirms it is no longer active, and you can now close the tabs.
Figure 36
RECAP
In the video, you learned how to:
• Run, delete, and insert a code cell.
• Run multiple notebooks at the same time.
• Present a notebook using a combination of Markdown and code cells.
• Shut down your notebook sessions after you have completed your work.

WRITING YOUR FIRST PYTHON CODE
Estimated time needed: 10 minutes
Objectives
After completing this lab you will be able to:

• Write basic Python code
Table of Contents
• Say 'Hello' to the world in Python

• What version of Python are we using?
• Writing comments in Python
• Errors in Python
• Does Python know about your error before it runs your code?
• Exercise: Your First Program
Say 'Hello' to the world in Python
When learning a new programming language, it is customary to start with a "hello world"
example. As simple as it is, this one line of code will ensure that we know how to print a string
in output and how to execute code within cells in a notebook.
[Tip:] To execute the Python code in the code cell, click on the cell to select it and press Shift + Enter.
# Try your first Python output

print('Hello, Python!')
Hello, Python!
After executing the cell above, you should see that Python prints Hello, Python!.
Congratulations on running your first Python code!
[Tip:] print() is a function. You passed the string 'Hello, Python!' as an argument to instruct Python on
what to print.
What version of Python are we using?
There are two major versions of the Python programming language: Python 2 and Python 3. The
Python community has moved on from Python 2, which reached its end of life in January 2020, and
many popular libraries no longer support it.
Since Python 3 is the current version, in this course we will be using it exclusively. How do we
know that our notebook is executed by a Python 3 runtime? We can look in the top right-hand corner
of this notebook and see "Python 3".

We can also ask Python directly and obtain a detailed answer. Try executing the following
code:
# Check the Python Version

import sys
print(sys.version)
3.11.7 | packaged by Anaconda, Inc. | (main, Dec 15 2023, 18:05:47) [MSC v.1916 64 bit (AMD64)]
[Tip:] sys is a built-in module that contains many system-specific parameters and functions, including the
Python version in use. Before using it, we must explicitly import it.
Writing comments in Python
In addition to writing code, note that it's always a good idea to add comments to your code. It will
help others understand what you were trying to accomplish (the reason why you wrote a given
snippet of code). Not only does this help other people understand your code, it can also
serve as a reminder to you when you come back to it weeks or months later.
To write comments in Python, use the number symbol # before writing your comment. When you
run your code, Python will ignore everything past the # on a given line.
# Practice on writing comments

print('Hello, Python!') # This line prints a string
# print('Hi')
Hello, Python!
After executing the cell above, you should notice that This line prints a string did not
appear in the output, because it was a comment (and thus ignored by Python).
The second line was also not executed because print('Hi') was preceded by the number sign
(#) as well! Since this isn't an explanatory comment from the programmer, but an actual line
of code, we might say that the programmer commented out that second line of code.
Errors in Python
Everyone makes mistakes. For many types of mistakes, Python will tell you that you have made
a mistake by giving you an error message. It is important to read error messages carefully to really
understand where you made a mistake and how you may go about correcting it.
For example, if you spell print as frint, Python will display an error message. Give it a try:
# Print string as error message
frint("Hello, Python!")
NameError Traceback (most recent call last)
Cell In[22], line 3
1 # Print string as error message
----> 3 frint("Hello, Python!")
NameError: name 'frint' is not defined

The error message tells you:
1. where the error occurred (more useful in large notebook cells or scripts), and
2. what kind of error it was (NameError)
Here, Python attempted to run the function frint, but could not determine what frint is since
it's not a built-in function and it has not been previously defined by us either.
You'll notice that if we make a different type of mistake, by forgetting to close the string,
we'll obtain a different error (i.e., a SyntaxError). Try it below:
# Try to see built-in error message

print("Hello, Python!)
Cell In[25], line 2
print("Hello, Python!)
^
SyntaxError: unterminated string literal (detected at line 2)
Does Python know about your error before it runs your code?
Python is what is called an interpreted language. Compiled languages examine your entire
program at compile time and are able to warn you about a whole class of errors prior to execution.
In contrast, Python executes your script statement by statement. Apart from syntax errors, which
are detected before the code runs, mistakes such as a misspelled function name surface only when
the offending line is reached. Python will stop executing the entire program when it encounters an
error (unless the error is expected and handled by the programmer, a more advanced subject that
we'll cover later on in this course).
Try to run the code in the cell below and see what happens:
# Print string and error to see the running order

print("This will be printed")
frint("This will cause an error")
print("This will NOT be printed")
This will be printed
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[27], line 3
1 # Print string and error to see the running order
2 print("This will be printed")
----> 3 frint("This will cause an error")
4 print("This will NOT be printed")
NameError: name 'frint' is not defined
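The running order shown above can also be checked programmatically. The following is only an illustrative sketch, not part of the original lab: it stores the same three statements in a string, runs them with the built-in exec() function, and records what was printed before the error. It uses try/except, which is covered later in the course, and the names source, captured, printed_lines, and error_name are chosen here for illustration.

```python
import contextlib
import io

# The same three statements as in the cell above, stored as a string
source = (
    'print("This will be printed")\n'
    'frint("This will cause an error")\n'
    'print("This will NOT be printed")\n'
)

captured = io.StringIO()  # collects everything the statements print
error_name = None

try:
    with contextlib.redirect_stdout(captured):
        exec(source)  # runs the statements one after another
except NameError as err:
    error_name = type(err).__name__

printed_lines = captured.getvalue().splitlines()
print(printed_lines)  # ['This will be printed']
print(error_name)     # NameError
```

Only the first statement produced output: execution stopped at the misspelled frint call, so the third statement never ran.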

Exercise: Your First Program
Generations of programmers have started their coding careers by simply printing "Hello,
world!". You will be following in their footsteps.
In the code cell below, use the print() function to print out the phrase: Hello, world!
# Write your code below. Don't forget to press Shift+Enter to execute the cell
print("Hello, world!")
Hello, world!
Now, let's enhance your code with a comment. In the code cell below, print out the phrase: Hello,
world! and comment it with the phrase Print the traditional hello world all in one
line of code.
# Print the traditional hello world
print("Hello, world!")
What is the value of z where z = x + y?
x = 10
y = 7
z = x + y
z
17
Congratulations, you have completed your first lesson and hands-on lab in Python.

VIDEO 004: TYPES (3:02)
Figure 37
DATA TYPES
A type is how Python represents different types of data. In the video, we will discuss some widely
used types in Python. You can have different types in Python. They can be integers like 11, real
numbers like 21.213, they can even be words. Integers, real numbers, and words can be
expressed as different data types.
The following chart summarizes three data types for the last examples:
• The first column indicates the expression.
• The second column indicates the data type.
Figure 38
We can see the actual data type in Python by using the type command. We can have int, which
stands for integer, and float, which stands for floating-point number, essentially a real number.
The type str, short for string, is a sequence of characters.
Figure 39
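As the figure is not reproduced here, the following sketch shows what the type command reports for the three kinds of values mentioned above. The sample string is an assumption chosen for illustration. Note that print() displays the type as <class 'int'>, whereas a notebook cell that ends with type(11) displays simply int.

```python
# type() reports the data type of a value
int_type = type(11)
float_type = type(21.213)
str_type = type("Hello Python")  # sample string chosen for illustration

print(int_type)    # <class 'int'>
print(float_type)  # <class 'float'>
print(str_type)    # <class 'str'>
```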

INT
Here are some integers. Integers can be negative or positive. In Python 3, integers have no fixed
upper or lower bound; their size is limited only by the available memory.
Figure 40
FLOAT
Floats are real numbers. They include the integers but also numbers in between the integers.
Consider the numbers between 0 and 1. We can select numbers in between them. These numbers
are floats. Similarly, consider the numbers between 0.5 and 0.6. We can select numbers in
between them. These are floats as well. We can continue the process zooming in for different
numbers. Of course, there is a limit, but it is quite small.
Figure 41

CHANGING EXPRESSION TYPES
You can change the type of the expression in Python; this is called typecasting. You can convert
an int to a float. For example, you can convert or cast the integer 2 to a float 2.0. Nothing really
changes. If you cast a float to an integer, however, you must be careful. For example, if you cast
the float 1.1 to an integer, you get 1 and lose some information.
Figure 42
If a string contains an integer value, you can convert it to int. If we convert a string that contains
a non-integer value, we get an error.
Check out more examples in the lab.
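A minimal sketch of the casts described above follows. The string "1.5" is an assumed example of a non-integer value, and the try/except construct used to capture the error is covered later in the course.

```python
as_float = float(2)     # 2.0: the value is unchanged
as_int = int(1.1)       # 1: the fractional part 0.1 is lost
from_string = int("1")  # 1: the string contains an integer value

# A string holding a non-integer value cannot be converted with int()
try:
    int("1.5")
    conversion_failed = False
except ValueError:
    conversion_failed = True

print(as_float, as_int, from_string, conversion_failed)  # 2.0 1 1 True
```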
Converting int to string
You can convert an int to a string or a float to a string.
Figure 43

BOOLEAN
Boolean is another important type in Python. A Boolean can take on two values. The first value is
True, just remember we use an uppercase T. Boolean values can also be False with an uppercase
F. Using the type command on a Boolean value, we obtain the term bool. This is short for Boolean.
Figure 44
If we cast a Boolean True to an integer or float, we will get a 1. If we cast a Boolean False to an
integer or float, we get a 0.
Figure 45
If you cast a 1 to a Boolean, you get a True. Similarly, if you cast a 0 to a Boolean, you get a False.
Figure 46
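The Boolean casts described above can be sketched as follows; the variable names are chosen here for illustration.

```python
bool_type = type(True)        # bool
true_as_int = int(True)       # 1
true_as_float = float(True)   # 1.0
false_as_int = int(False)     # 0
one_as_bool = bool(1)         # True
zero_as_bool = bool(0)        # False

print(bool_type)  # <class 'bool'>
print(true_as_int, true_as_float, false_as_int)  # 1 1.0 0
print(one_as_bool, zero_as_bool)  # True False
```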
Check the labs for more examples or check Python.org for other kinds of types in Python.

HANDS-ON LAB: TYPES
WORKING WITH TYPES IN PYTHON
OBJECTIVES
• Work with various types of data in Python.
• Convert the data from one type to another.
TABLE OF CONTENTS
Types of objects in Python
• Integers
• Floats
• Converting from one object type to a different object type
• Boolean data type
• Exercise: Types
TYPES OF OBJECTS IN PYTHON

Python is an object-oriented language. There are many different types of objects in Python. Let's
start with the most common object types: strings, integers and floats.
Anytime you write words (text) in Python, you're using character strings (strings for short). The
most common numbers, on the other hand, are integers (e.g. -1, 0, 100) and floats, which
represent real numbers (e.g. 3.14, -42.0).
Figure 47
The following code cells contain some examples.
# Integer
11
11
# Float
2.14
2.14

# String
"Hello, Python 101!"
'Hello, Python 101!'
You can get Python to tell you the type of an expression by using the built-in type() function.
You'll notice that Python refers to integers as int, floats as float, and character strings as str.
# Type of 12
type(12)
int
# Type of 2.14
type(2.14)
float
# Type of "Hello, Python 101!"

type("Hello, Python 101!")
str
In the code cell below, use the type() function to check the object type of 12.0.
type(12.0)
float
Integers
Here are some examples of integers. Integers can be negative or positive numbers:
Figure 48
We can verify this is the case by using, you guessed it, the type() function:
# Print the type of -1

type(-1)
int
# Print the type of 4

type(4)
int
# Print the type of 0

type(0)
int

Floats
Floats represent real numbers; they are a superset of integer numbers but also include "numbers
with decimals". There are some limitations when it comes to machines representing real numbers,
but floating point numbers are a good representation in most cases. You can learn more about
the specifics of floats for your runtime environment, by checking the value of sys.float_info. This
will also tell you what's the largest and smallest number that can be represented with them.
Once again, we can test some examples with the type() function:
# Print the type of 1.0

type(1.0) # Notice that 1 is an int, and 1.0 is a float
float

type(0.5)
float

type(0.56)
float
# System settings about float type

import sys
sys.float_info
sys.float_info(max=1.7976931348623157e+308, max_exp=1024, max_10_exp=308,
min=2.2250738585072014e-308, min_exp=-1021, min_10_exp=-307, dig=15, mant_dig=53,
epsilon=2.220446049250313e-16, radix=2, rounds=1)
Converting from one object type to a different object type
You can change the type of the object in Python; this is called typecasting. For example, you
can convert an integer into a float (e.g. 2 to 2.0).
Let's try it:
# Verify that this is an integer

type(2)
int
Converting integers to floats

Let's cast integer 2 to float:
# Convert 2 to a float
float(2)
2.0
# Convert integer 2 to a float and check its type

type(float(2))
float

When we convert an integer into a float, we don't really change the value (i.e., the significand) of
the number. However, if we cast a float into an integer, we could potentially lose some
information. For example, if we cast the float 1.1 to integer we will get 1 and lose the decimal
information (i.e., 0.1):
# Casting 1.1 to integer will result in loss of information

int(1.1)
1
Converting from strings to integers or floats

Sometimes, we can have a string that contains a number within it. If this is the case, we can cast
that string that represents a number into an integer using int():
# Convert a string into an integer

int('1')
1
But if you try to do so with a string that is not a perfect match for a number, you'll get an error. Try
the following:
# Convert a string into an integer with error

int('1 or 2 people')
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[24], line 3
1 # Convert a string into an integer with error
----> 3 int('1 or 2 people')
ValueError: invalid literal for int() with base 10: '1 or 2 people'
You can also convert strings containing floating point numbers into float objects:
# Convert the string "1.2" into a float

float('1.2')
1.2
[Tip:] Note that strings can be represented with single quotes ('1.2') or double quotes ("1.2"), but you can't mix
both (e.g., "1.2').
Converting numbers to strings

If we can convert strings to numbers, it is only natural to assume that we can convert numbers to
strings, right?
# Convert an integer to a string

str(1)
'1'
And there is no reason why we shouldn't be able to make floats into strings as well:
# Convert a float to a string

str(1.2)
'1.2'

Boolean data type
Boolean is another important type in Python. An object of type Boolean can take on one of two
values: True or False:
# Value true
True
True
Notice that the value True has an uppercase "T". The same is true for False (i.e. you must use the
uppercase "F").
# Value false
False
False
When you ask Python to display the type of a boolean object it will show bool which stands for
boolean:
# Type of True
type(True)
bool
# Type of False
type(False)
bool
We can cast boolean objects to other data types. If we cast a boolean with a value of True to an
integer or float we will get a one. If we cast a boolean with a value of False to an integer or float
we will get a zero. Similarly, if we cast a 1 to a Boolean, we get True. And if we cast a 0 to a
Boolean, we get False. Let's give it a try:
# Convert True to int

int(True)
1
# Convert 1 to Boolean
bool(1)
True
# Convert 0 to Boolean
bool(0)
False
# Convert True to float

float(True)
1.0

EXERCISE: TYPES
What is the data type of the result of: 6/2?
type(6/2) #float
float
What is the type of the result of: 6//2? (Note the double slash //.)
type(6//2) #int, as the double slashes stand for integer division

int
What is the type of the result of: "Hello, World!"
type("Hello, World!")
str
What is the type of the result of: "hello" == "world"
type("hello" == "world") #bool

bool
Write the code to convert the following number representing employeeid "1001" to an integer
int("1001") #1001
1001
Write the code to convert this number representing financial value "1234.56" to a floating
point number
float("1234.56") #1234.56
1234.56
Write the code to convert this phone number 123-456-7890 to a string
str("123-456-7890") #'123-456-7890'
'123-456-7890'
Congratulations, you have completed your hands-on lab on Types in Python.

VIDEO 005: EXPRESSIONS AND VARIABLES (3:55)
In the video, we will cover expressions and variables.
Figure 49
EXPRESSIONS
Expressions describe a type of operation that computers perform; in other words, expressions are
operations that Python performs. An example is a basic arithmetic operation like adding multiple
numbers. The result in this case is 160. We call the numbers operands, and the math symbols, in
this case the addition sign, are called operators.
Figure 50
We can perform operations such as subtraction using the subtraction sign. In this case, the result
is a negative number. We can perform multiplication operations using the asterisk. The result is
25. In this case, the operators are the minus sign and the asterisk.
Figure 51
At :31 in the video the voice over and the transcript should be subtraction instead of traction.
At :43 in the video the voice over and transcript should be operator instead of operand.

We can also perform division with the forward slash (/): 25 / 5 is 5.0; 25 / 6 is approximately
4.167.
In Python 3, the version we will be using in this course, both will result in a float.
Figure 52
We can use the double slash for integer division, where the result is rounded down to the nearest
integer. Be aware, in some cases the results are not the same as regular division.
Figure 53
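The difference between the two division operators can be sketched as follows. The operands 25, 5, and 6 come from the text above; the negative example is an addition chosen here to show that the double slash rounds toward negative infinity rather than toward zero.

```python
regular_exact = 25 / 5     # 5.0 (always a float in Python 3)
regular_inexact = 25 / 6   # 4.166666666666667
integer_exact = 25 // 5    # 5
integer_inexact = 25 // 6  # 4 (rounded down)
integer_negative = -25 // 6  # -5 (rounded down, not toward zero)

print(regular_exact, regular_inexact)
print(integer_exact, integer_inexact, integer_negative)
```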
Python follows mathematical conventions when evaluating mathematical expressions. The following
two expressions list the operations in a different order. In both cases, Python performs
multiplication, then addition, to obtain the final result.
There are a lot more operations you can do with Python. Check the labs for more examples. We
will also be covering more complex operations throughout the course.
Figure 54

The expressions in the parentheses are performed first. We then multiply the result by 60. The
result is 1,920.
Figure 55
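As a sketch of the order of operations, the expressions below use the operands 30, 2, and 60. These values are an assumption, chosen so that the parenthesized expression gives the result of 1,920 stated above, since the figures are not reproduced in this text.

```python
# Multiplication is performed before addition, whatever the written order
first = 2 * 60 + 30   # 120 + 30 = 150
second = 30 + 2 * 60  # 30 + 120 = 150

# Parentheses are evaluated first: 32 * 60 = 1920
with_parentheses = (30 + 2) * 60

print(first, second, with_parentheses)  # 150 150 1920
```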
VARIABLES
Now, let's look at variables. We can use variables to store values. In this case, we assign a value
of 1 to the variable my_variable using the assignment operator, i.e., the equal sign. We can then
use the value somewhere else in the code by typing the exact name of the variable. We will use
a colon to denote the value of the variable.
Figure 56
We can assign a new value to my_variable using the assignment operator. We assign a value
of 10. The variable now has a value of 10. The old value of the variable is not important.
Figure 57
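The assignment and reassignment described above can be sketched as follows; first_value is a helper name added here only to keep the earlier value for comparison.

```python
# Assign the value 1 to my_variable with the assignment operator (=)
my_variable = 1
first_value = my_variable  # remember the value before it is replaced

# Assigning again replaces the old value
my_variable = 10

print(first_value)  # 1
print(my_variable)  # 10
```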

We can store the results of expressions. For example, we add several values and assign the result
to x. The variable x now stores the result. We can also perform operations on x and save the result
to a new variable, y. The variable y now has a value of 2.666.
Figure 58
We can also perform operations on x and assign the result back to x. The variable x now has a
value of 2.666.
Figure 59
As before, the old value of x is not important. We can use the type command on variables as well.
Figure 60
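The steps above can be sketched as follows. The values added to form x are borrowed from the hands-on lab later in this module, since the transcript does not list them; treat them as an assumption.

```python
# Assign a value, then overwrite it with a new one.
my_variable = 1
my_variable = 10              # the old value 1 is gone

# Store the result of an expression (values assumed from the lab).
x = 43 + 60 + 16 + 41         # 160

# Save the result of an operation on x in a new variable.
y = x / 60

# Or assign the result back to x itself, overwriting the old value.
x = x / 60

# type() works on variables exactly as it does on literal values.
print(type(my_variable))
print(type(x))
```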

It's good practice to use meaningful variable names, so you don't have to keep track of what the
variable is doing. Let's say we would like to convert the number of minutes in the highlighted
examples to the number of hours in the following music dataset.
Figure 61
We call the variable that contains the total number of minutes "total_min". It's common to use
the underscore to represent the start of a new word. You could also use a capital letter. We call
the variable that contains the total number of hours, total_hour. We can obtain the total
number of hours by dividing total_min by 60. The result is approximately 2.367 hours.
Figure 62
If we modify the value of the first variable and run the code again, the value of the second
variable changes accordingly, but we do not have to modify the rest of the code.
Figure 63
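A minimal sketch of this workflow is shown below. The album lengths come from the lab; the extra 38-minute album is a hypothetical value added only to show the effect of a change. Note that total_hour is recomputed only when the division line runs again.

```python
# Total length of three albums in minutes.
total_min = 43 + 42 + 57
total_hour = total_min / 60
first_result = total_hour             # approximately 2.367

# Modify only the first variable (38 is a hypothetical extra album).
total_min = 43 + 42 + 57 + 38

# Until the next line runs, total_hour still holds the old result.
stale_result = total_hour

# Re-running the unchanged line picks up the new value.
total_hour = total_min / 60

print(first_result, stale_result, total_hour)
```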

HANDS-ON LAB: EXPRESSION AND VARIABLES
WORKING WITH VARIABLES AND EXPRESSIONS IN PYTHON
OBJECTIVES

• Use expressions and variables to perform operations
TABLE OF CONTENTS
• Expressions and Variables
o Expressions
o Exercise: Expressions
o Variables
o Exercise: Expression and Variables in Python
EXPRESSIONS AND VARIABLES
Expressions
Expressions in Python can include operations among compatible types (e.g., integers and floats).
For example, basic arithmetic operations like adding multiple numbers:
# Addition operation expression

43 + 60 + 16 + 41
160
We can perform subtraction operations using the minus operator. In this case the result is a
negative number:
# Subtraction operation expression

50 - 60
-10
We can do multiplication using an asterisk:
# Multiplication operation expression

5 * 5
25
We can also perform division with the forward slash:
# Division operation expression

25 / 5
5.0
# Division operation expression

25 / 6
4.166666666666667

As seen in the quiz above, we can use the double slash for integer division, where the result is
rounded down to the nearest integer:
# Integer division operation expression

25 // 5
5
# Integer division operation expression

25 // 6
4
Let's write an expression that calculates how many hours there are in 160 minutes:
160//60
2
Python follows well-accepted mathematical conventions when evaluating mathematical
expressions. In the following example, Python adds 30 to the result of the multiplication (i.e.,
120).
# Mathematical expression
30 + 2 * 60
150
And just like mathematics, expressions enclosed in parentheses have priority. So, the following
multiplies 32 by 60.
# Mathematical expression

(30 + 2) * 60
1920
Variables
Just like with most programming languages, we can store values in variables, so we can use them
later on. For example:
# Store value into variable

x = 43 + 60 + 16 + 41
To see the value of x in a Notebook, we can simply place it on the last line of a cell:
# Print out the value in variable

x
160
We can also perform operations on x and save the result to a new variable:
# Use another variable to store the result of the operation between variable and value
y = x / 60
y
2.6666666666666665

If we save a value to an existing variable, the new value will overwrite the previous value:
# Overwrite variable with new value

x = x / 60
x
2.6666666666666665
It's a good practice to use meaningful variable names, so you and others can read the code and
understand it more easily:
# Name the variables meaningfully

total_min = 43 + 42 + 57 # Total length of albums in minutes
total_min
142
# Name the variables meaningfully

total_hours = total_min / 60 # Total length of albums in hours
total_hours
2.3666666666666667
In the cells above, we added the lengths of three albums in minutes and stored the sum in total_min.
We then divided it by 60 to calculate the total length in hours, total_hours. You can also do it all
at once in a single expression, as long as you use parentheses to add the album lengths before you
divide, as shown below.
# Complicated expression
total_hours = (43 + 42 + 57) / 60 # Total hours in a single expression
total_hours
2.3666666666666667
If you'd rather have total hours as an integer, you can of course replace the floating point division
with integer division (i.e., //).
Exercise: Expressions in Python
Write an expression to add 30 and 20 and subtract 40
x = 30+20-40
print("x =", x)
x = 10
Write an expression to subtract 5 from 55 and divide the result by 10
x =(55-5)/10
print("x =", x)
x = 5.0
Write an expression to multiply 6 with 10 and divide the result by 12
x = (6*10)/12
print("x =", x)
x = 5.0

Exercise: Variables in Python
What is the value of x where x = 3 + 2 * 2?
x = 3+2*2
x
7
What is the value of y where y = (3 + 2) * 2?
y = (3+2)*2
y
10
What is the value of z where z = x + y?
z = x + y # z = 7 + 10
z
17
Congratulations, you have completed your hands-on lab on Expressions and Variables in Python.

PRACTICE QUIZ: EXPRESSIONS AND VARIABLES
Question 1
What is the result of the following operation? 11//2
o 11/2
o 5
o 0.18
o 5.5
Question 2
What is the value of x after the following is run?
x=4
x=x/2
o 0.5
o 4.0
o 1.0
o 2.0
Question 3
Which line of code will perform the action as required for implementing the following equation?
y = 2x² - 3
o y = 2x3-3
o y = 2*x*2-3
o y = 2*x*x-3
o y = 2x*x-3

VIDEO 006: STRING OPERATIONS (3:58)
In Python, a string is a sequence of characters. A string is contained within two double quotes.
You could also use single quotes.
Figure 64
A string can contain spaces or digits. A string can also contain special characters.
Figure 65
We can bind or assign a string to a variable. It is helpful to think of a string as an ordered
sequence. Each element in the sequence can be accessed using an index represented by the array
of numbers. The first element is accessed with index 0, as the following figure shows.
We can also access index 6. Moreover, we can access index 13.
Figure 66

We can also use negative indexing with strings. The last element is given by the index negative
one. The first element can be obtained by index negative 15 and so on.
Figure 67
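As a minimal sketch of positive and negative indexing, assuming the string "Michael Jackson" used in the figures: a negative index i refers to the same element as the positive index len(name) + i.

```python
name = "Michael Jackson"

# Positive indexes count from the start, beginning at 0.
first = name[0]      # 'M'
seventh = name[6]    # 'l'

# Negative indexes count from the end, beginning at -1.
last = name[-1]               # 'n'
first_again = name[-15]       # 'M', because the string has 15 characters

# Both notations address the same elements.
same_element = name[-1] == name[len(name) - 1]

print(first, seventh, last, first_again, same_element)
```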
We can bind a string to another variable. It is helpful to think of a string as a list or tuple.
We can treat the string as a sequence and perform sequence operations.
Figure 68
We can also input a stride value as follows: the two indicates that we select every second
element. We can also incorporate slicing.
Figure 69

In this case, we return every second value up to index four.
Figure 70
We can use the len function to obtain the length of the string. As there are 15 elements,
the result is 15.
Figure 71
At 1:13 in the video it says "Tuples: Slicing" and it should say "Strings: Slicing."
We can concatenate or combine strings. We use the addition symbol (+). The result is a new string
that is a combination of both.
Figure 72

We can replicate values of a string. We simply multiply the string by the number of times we
would like to replicate it, in this case, three. The result is a new string. The new string consists of
three copies of the original string.
Figure 73
Strings are immutable. This means you cannot change the value of a string, but you can create a
new string. For example, you can create a new string by concatenating the original variable with a
new string and assigning the result back to the variable. The result is a new string that changes
from Michael Jackson to Michael Jackson is the best.
Figure 74
Backslashes represent the beginning of escape sequences. Escape sequences represent
characters that may be difficult to input. For example, backslash "n" (\n) represents a new
line. The output continues on a new line where the backslash "n" is encountered.
Figure 75

Similarly, backslash "t" represents a tab. The output is given by a tab where the backslash,
"t" is.
Figure 76
If you want to place a backslash in your string, use a double backslash. The result is a single
backslash in the output. We can also place an "r" in front of the string to create a raw string,
in which backslashes are treated as ordinary characters.
Figure 77
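The escape sequences above can be explored with a short sketch. It reuses the example sentence from this course; the length checks are added here to show that an escape sequence is stored as a single character.

```python
# "\n" and "\t" are each one character, although they are typed as two.
newline_length = len("\n")       # 1
tab_length = len("\t")           # 1

# A double backslash in a normal string produces one backslash character.
escaped = " Michael Jackson \\ is the best"

# The r prefix creates a raw string: backslashes are kept literally.
raw = r" Michael Jackson \ is the best"
raw_newline_length = len(r"\n")  # 2: a backslash followed by the letter n

print(escaped)
print(escaped == raw)
```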

STRING METHODS
Now, let's take a look at string methods.
Figure 78
Strings are sequences and, as such, support the operations that work on lists and tuples. Strings
also have a second set of methods that work only on strings.
Figure 79
When we apply a method to the string A, we get a new string B that is different from A.
Figure 80

UPPER
Let's do some examples. Let's try the method upper. This method converts lowercase
characters to uppercase characters. In this example, we set the variable A to the following value.
We apply the method upper and assign the result to B. The value of B is similar to A, but all the
characters are uppercase.
Figure 81
REPLACE
The method replace replaces a segment of the string, i.e., a substring, with a new string. We input
the part of the string we would like to change.
Figure 82
The second argument is what we would like to exchange the segment with. The result is a
new string with a segment changed.
Figure 83

FIND
The method find finds substrings. The argument is the substring you would like to find. The
output is the index at which the first occurrence of the substring starts. We can find the
substring Jack. If the substring is not in the string, the output is negative one.
Figure 84
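The three methods discussed above can be sketched together as follows. The strings are the ones used in this course, except "Prince", which is a hypothetical substring chosen because it does not occur in the name.

```python
a = "Thriller is the sixth studio album"
name = "Michael Jackson"

# upper() returns a new string; the original is left unchanged.
b = a.upper()

# replace() returns a new string with the substring exchanged.
replaced = name.replace("Michael", "Janet")

# find() returns the index where the first occurrence starts...
jack_index = name.find("Jack")

# ...or -1 when the substring does not occur ("Prince" is hypothetical).
missing_index = name.find("Prince")

print(b, replaced, jack_index, missing_index)
```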
Check the labs for more examples.

HANDS-ON LAB: STRING OPERATIONS
STRING OPERATIONS
OBJECTIVES
• Work with Strings
• Perform operations on String
• Manipulate Strings using indexing and escape sequences
TABLE OF CONTENTS
• What are Strings?
• Indexing
o Negative Indexing
o Slicing
o Stride
o Concatenate Strings
• Escape Sequences
• String Operations
• Quiz on Strings
WHAT ARE STRINGS?

The following example shows a string contained within 2 quotation marks:
# Use quotation marks for defining string

"Michael Jackson"
'Michael Jackson'
We can also use single quotation marks:
# Use single quotation marks for defining string

'Michael Jackson'
'Michael Jackson'
A string can be a combination of spaces and digits:
# Digits and spaces in string

'1 2 3 4 5 6 '
'1 2 3 4 5 6 '
A string can also be a combination of special characters:
# Special characters in string

'@#2_#]&*^%$'
'@#2_#]&*^%$'

We can print our string using the print function:
# Print the string

print("hello!")
hello!
We can bind or assign a string to another variable:
# Assign string to variable

name = "Michael Jackson"
name
'Michael Jackson'
Indexing
It is helpful to think of a string as an ordered sequence. Each element in the sequence can be
accessed using an index represented by the array of numbers:
Figure 85
The first index can be accessed as follows:
[Tip]: Because indexing starts at 0, the first element is at index 0.
# Print the first element in the string

print(name[0])
M
We can access index 6:
# Print the element on index 6 in the string

print(name[6])
l
Moreover, we can access index 13:
# Print the element on index 13 in the string

print(name[13])
o

Negative Indexing
We can also use negative indexing with strings:
Figure 86
Negative index can help us to count the element from the end of the string. The last element is
given by the index -1:
# Print the last element in the string

name ="Michael Jackson"
print(name[-1])
n
The first element can be obtained by index -15:
# Print the first element in the string

print(name[-15])
M
We can find the number of characters in a string by using len, short for length:
# Find the length of string

len("Michael Jackson")
15
Slicing
We can obtain multiple characters from a string using slicing. For example, we can obtain the
elements at indexes 0 to 3 and at indexes 8 to 11:
Figure 87
[Tip]: When taking a slice, the first number is the index of the first element to include (indexing starts
at 0), and the second number is the index at which the slice stops; the element at that index is not included.

# Take the slice on variable name with only index 0 to index 3
name[0:4]
'Mich'
# Take the slice on variable name with only index 8 to index 11

name[8:12]
'Jack'
Stride
We can also input a stride value as follows, with the '2' indicating that we are selecting
every second element:
Figure 88
# Get every second element. The elements on index 0, 2, 4 ...

name[::2]
'McalJcsn'
We can also incorporate slicing with the stride. In this case, we select the first five elements
and then use the stride:
# Get every second element in the range from index 0 to index 4

name[0:5:2]
'Mca'
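The slice and stride rules can be checked with a small sketch. The start and end variables are names introduced here for illustration; the point is that the end index is excluded, so a slice has end - start characters.

```python
name = "Michael Jackson"

# A slice runs from start up to, but not including, end.
start, end = 8, 12
piece = name[start:end]                 # indexes 8, 9, 10, 11
piece_length_matches = len(piece) == end - start

# A stride of 2 keeps indexes 0, 2, 4, ... of the whole string.
every_second = name[::2]

# Slice and stride combined: indexes 0, 2, 4 of the first five elements.
first_five_stride = name[0:5:2]

print(piece, every_second, first_five_stride)
```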
Concatenate Strings
We can concatenate or combine strings by using the addition symbol (+), and the result is a new
string that is a combination of both:
# Concatenate two strings

statement = name + "is the best"
statement
'Michael Jacksonis the best'
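The missing space in the output above comes from the string literal itself: concatenation joins the two strings exactly as written. The sketch below, an illustration rather than part of the original lab, adds a leading space and also shows that concatenation leaves the original string untouched, while assigning to a single index is not allowed.

```python
name = "Michael Jackson"

# A leading space in the second string keeps the words apart.
statement = name + " is the best"

# Concatenation built a new string; name itself is unchanged.
name_unchanged = name == "Michael Jackson"

# Strings cannot be modified in place: item assignment raises TypeError.
try:
    name[0] = "J"
    assignment_failed = False
except TypeError:
    assignment_failed = True

print(statement, name_unchanged, assignment_failed)
```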
To replicate values of a string we simply multiply the string by the number of times we would like
to replicate it. In this case, the number is three. The result is a new string, and this new string
consists of three copies of the original string:
# Print the string for 3 times

3 * "Michael Jackson"
'Michael JacksonMichael JacksonMichael Jackson'
You can create a new string by concatenating the original variable with a new string and assigning
the result back to the variable. The result is a new string that changes from "Michael Jackson" to
"Michael Jackson is the best".
# Concatenate strings
name = name + " is the best"
name
'Michael Jackson is the best'
ESCAPE SEQUENCES
Back slashes represent the beginning of escape sequences. Escape sequences represent strings
that may be difficult to input. For example, back slash "n" represents a new line. The output is
given by a new line after the back slash "n" is encountered:
# New line escape sequence

print(" Michael Jackson \n is the best" )
Michael Jackson
is the best
Similarly, back slash "t" represents a tab:
# Tab escape sequence

print(" Michael Jackson \t is the best" )
Michael Jackson is the best
If you want to place a back slash in your string, use a double back slash:
# Include back slash in string

print(" Michael Jackson \\ is the best" )
Michael Jackson \ is the best
STRING OPERATIONS
There are many string operation methods in Python that can be used to manipulate the data. We
are going to use some basic string operations on the data.
Upper operation
Let's try with the method upper; this method converts lower case characters to upper case
characters:
# Convert all the characters in string to upper case

a = "Thriller is the sixth studio album"
print("before upper:", a)
b = a.upper()
print("After upper:", b)
before upper: Thriller is the sixth studio album
After upper: THRILLER IS THE SIXTH STUDIO ALBUM

Lower operation
Let's try with the method lower; this method converts upper case characters to lower case
characters:
# Convert all the characters in string to lower case

a = "MICHAEL JACKSON IS THE BEST"
print("Before lower:", a)
b = a.lower()
print("After lower:", b)
Before lower: MICHAEL JACKSON IS THE BEST
After lower: michael jackson is the best
Replace operation
The method replace replaces a segment of the string, i.e. a substring with a new string. We input
the part of the string we would like to change. The second argument is what we would like to
exchange the segment with, and the result is a new string with the segment changed:
a = "Michael Jackson is the best"

b = a.replace('Michael', 'Janet')
b
'Janet Jackson is the best'
# Replace the old substring with the new target substring by removing some punctuation.
a = "Hello! Michael Jackson has: 12 characters."

print(a)
b = a.replace('!','').replace(':','').replace('.','')
print(b)
Hello! Michael Jackson has: 12 characters.
Hello Michael Jackson has 12 characters
Find operation
The method find finds a substring. The argument is the substring you would like to find, and the
output is the index at which its first occurrence starts. We can find the substring Jack or el.
Figure 89
# Find the substring in the string. Only the index of the first element of the substring is returned
name.find('el')
5

# Find the substring in the string.
name.find('Jack')
8
If the substring is not in the string, then the output is -1. For example, the string
'Jasdfasdasdf' is not a substring:
# If cannot find the substring in the string

name.find('Jasdfasdasdf')
-1
Split operation
The method split splits the string at the specified separator (whitespace by default) and returns a list:
# Split the string into a list

split_string = (name.split())
split_string
['Michael', 'Jackson']
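The call above uses the default separator. As a minimal sketch of passing an explicit separator, the example below splits a comma-separated line; the csv_line value is hypothetical and used only for illustration.

```python
name = "Michael Jackson"

# With no argument, split() separates on runs of whitespace.
default_split = name.split()

# An explicit separator is passed as the argument.
csv_line = "Thriller,1982,Pop"          # hypothetical sample data
fields = csv_line.split(",")

# The two forms differ when separators repeat.
collapsed = "a  b".split()              # runs of spaces count as one separator
not_collapsed = "a  b".split(" ")       # every single space is a separator

print(default_split, fields, collapsed, not_collapsed)
```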
RegEx
In Python, RegEx (short for Regular Expression) is a tool for matching and handling strings.
Python provides a built-in module called re, which allows you to work with regular
expressions. This module provides several functions for working with regular expressions,
including search, split, findall, and sub. First, import the re module.
import re
Search operation
The search() function searches for specified patterns within a string. Here is an example that
explains how to use the search() function to search for the word "Jackson" in the string "Michael
Jackson is the best".
s1 = "Michael Jackson is the best"

# Define the pattern to search for
pattern = r"Jackson"
# Use the search() function to search for the pattern in the string
result = re.search(pattern, s1)
# Check if a match was found
if result:
    print("Match found!")
else:
    print("Match not found.")
Match found!

Regular expressions (RegEx) are patterns used to match and manipulate strings of text. There are
several special sequences in RegEx that can be used to match specific characters or patterns.
Special Sequence   Meaning                                                        Example
\d                 Matches any digit character (0-9)                              "123" matches "\d\d\d"
\D                 Matches any non-digit character                                "hello" matches "\D\D\D\D\D"
\w                 Matches any word character (a-z, A-Z, 0-9, and _)              "hello_world" matches "\w\w\w\w\w\w\w\w\w\w\w"
\W                 Matches any non-word character                                 "@#$%" matches "\W\W\W\W"
\s                 Matches any whitespace character (space, tab, newline, etc.)   "hello world" matches "\w\w\w\w\w\s\w\w\w\w\w"
\S                 Matches any non-whitespace character                           "hello_world" matches "\S\S\S\S\S\S\S\S\S\S\S"
\b                 Matches the boundary between a word character and a            "cat" matches "\bcat\b" in "The cat sat on the mat"
                   non-word character
\B                 Matches any position that is not a word boundary               "cat" matches "\Bcat\B" in "concatenate" but not in
                                                                                  "The cat sat on the mat"
Special Sequence Examples:

A simple example of using the \d special sequence in a regular expression pattern with Python
code:
pattern = r"\d\d\d\d\d\d\d\d\d\d" # Matches any ten consecutive digits

text = "My Phone number is 1234567890"
match = re.search(pattern, text)
if match:
    print("Phone number found:", match.group())
else:
    print("No match")
Phone number found: 1234567890
The regular expression pattern is defined as r"\d\d\d\d\d\d\d\d\d\d", which uses the \d special
sequence to match any digit character (0-9), and the \d sequence is repeated ten times to
match ten consecutive digits.
A simple example of using the \W special sequence in a regular expression pattern with Python
code:
pattern = r"\W" # Matches any non-word character
text = "Hello, world!"

matches = re.findall(pattern, text)
print("Matches:", matches)
Matches: [',', ' ', '!']

The regular expression pattern is defined as r"\W", which uses the \W special sequence to match
any character that is not a word character (a-z, A-Z, 0-9, or _). The string we're searching for
matches in is "Hello, world!".
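The word-boundary sequences \b and \B can be sketched in the same way. The sentence is the one from the table; "concatenate" and "category" are test words chosen here for illustration. In "category", the letters "cat" start at the very beginning of the word, which is itself a word boundary.

```python
import re

sentence = "The cat sat on the mat"

# \bcat\b needs a word boundary on both sides of "cat".
whole_word = re.search(r"\bcat\b", sentence)          # matches the word "cat"
not_whole_word = re.search(r"\bcat\b", "category")    # None: "t" is followed by "e"

# \Bcat\B needs "cat" to be surrounded by word characters on both sides.
inner = re.search(r"\Bcat\B", "concatenate")          # matches inside the word
at_word_start = re.search(r"\Bcat\B", "category")     # None: the word starts with "cat"
standalone = re.search(r"\Bcat\B", sentence)          # None: "cat" is a whole word

print(whole_word.span(), inner.span())
```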
Findall operation
The findall() function finds all occurrences of a specified pattern within a string.
s2 = "Michael Jackson was a singer and known as the 'King of Pop'"
# Use the findall() function to find all occurrences of the "as" in the string
result = re.findall("as", s2)
# Print out the list of matched words
print(result)

['as', 'as']
Split Function
A regular expression's split() function splits a string into an array of substrings based on a
specified pattern.
# Use the split function to split the string by the "\s"
split_array = re.split(r"\s", s2)
# The split_array contains all the substrings, split by whitespace characters

print(split_array)
['Michael', 'Jackson', 'was', 'a', 'singer', 'and', 'known', 'as', 'the', "'King", 'of', "Pop'"]
Sub Function
The sub function of a regular expression in Python is used to replace all occurrences of a pattern
in a string with a specified replacement.
# Define the regular expression pattern to search for

pattern = r"King of Pop"
# Define the replacement string

replacement = "legend"
# Use the sub function to replace the pattern with the replacement string
new_string = re.sub(pattern, replacement, s2, flags=re.IGNORECASE)
# The new_string contains the original string with the pattern replaced by the replacement string
print(new_string)
Michael Jackson was a singer and known as the 'legend'
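The flags=re.IGNORECASE argument used above has no visible effect in that example, because the pattern already matches the capitalization in the string. A minimal sketch with a lowercase pattern (an illustrative variation, not part of the original lab) shows what the flag changes.

```python
import re

s2 = "Michael Jackson was a singer and known as the 'King of Pop'"

# A lowercase pattern does not match the capitalized text by default...
lowercase_pattern = r"king of pop"
case_sensitive = re.sub(lowercase_pattern, "legend", s2)

# ...but it does when matching ignores case.
case_insensitive = re.sub(lowercase_pattern, "legend", s2, flags=re.IGNORECASE)

print(case_sensitive)
print(case_insensitive)
```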

QUIZ ON STRINGS
What is the value of the variable a after the following code is executed?
# Write your code below and press Shift+Enter to execute

a = "1"
a
'1'
What is the value of the variable b after the following code is executed?

b = "2"
b
'2'
What is the value of the variable c after the following code is executed?

c = a + b
c
'12'
Consider the variable d, and use slicing to print out the first three elements:
d = "ABCDEFG"
print(d[:3])
ABC
Use a stride value of 2 to print out every second character of the string e:
e = 'clocrkr1e1c1t'
print(e[::2])
correct
Print out a backslash:
# Write your code below and press Shift+Enter to execute

print("\\\\\\\\")
#or
print(r"\ ")
\\\\
\
Convert the variable f to uppercase:

f = "You are wrong"
f.upper()
'YOU ARE WRONG'

Convert the variable f2 to lowercase:

f2="YOU ARE RIGHT"
f2
f2.lower()
'you are right'
Consider the variable g, and find the first index of the sub-string snow:
g = "Mary had a little lamb Little lamb, little lamb Mary had a little lamb \
Its fleece was white as snow And everywhere that Mary went Mary went, Mary went \
Everywhere that Mary went The lamb was sure to go"
g.find("snow")
95
In the variable g, replace the sub-string Mary with Bob:
# Write your code below and press Shift+Enter to execute
g.replace("Mary", "Bob")
'Bob had a little lamb Little lamb, little lamb Bob had a little lamb Its fleece was white as
snow And everywhere that Bob went Bob went, Bob went Everywhere that Bob went The lamb was sure to
go'
In the variable g, replace the sub-string , with .:
# Write your code below and press Shift+Enter to execute
g.replace(',','.')
'Mary had a little lamb Little lamb. little lamb Mary had a little lamb Its fleece was white as
snow And everywhere that Mary went Mary went. Mary went Everywhere that Mary went The lamb was sure
to go'
In the variable g, split the string into a list:

g.split()
['Mary',
'had',
'a',
'little',
'lamb',
'Little',
'lamb,',
'little',
'lamb',
'Mary',
'had',
'a',
'little',
'lamb',
'Its',
'fleece',
'was',
'white',
'as',
'snow',
'And', 'everywhere', 'that',

'Mary',
'went',
'Mary',
'went,',
'Mary',
'went', 'Everywhere', 'that',
'Mary',
'went',
'The',
'lamb',
'was',
'sure',
'to',
'go']
In the string s3, find the four consecutive digit characters using \d and the search()
function:
s3 = "House number- 1105"

# Search for four consecutive digits
result = re.search(r"\d\d\d\d", s3)
# Check if a match was found
if result:
    print("Digit found")
else:
    print("Digit not found.")
Digit found
In the string str1, replace the sub-string fox with bear using sub() function:
str1= "The quick brown fox jumps over the lazy dog."
# Use re.sub() to replace "fox" with "bear"
new_str1 = re.sub(r"fox", "bear", str1)
print(new_str1)
The quick brown bear jumps over the lazy dog.
In the string str2 find all the occurrences of woo using findall() function:
str2= "How much wood would a woodchuck chuck, if a woodchuck could chuck wood?"

# Use re.findall() to find all occurrences of "woo"
matches = re.findall(r"woo", str2)
print(matches)
['woo', 'woo', 'woo', 'woo']
The last exercise!

PRACTICE QUIZ: STRING OPERATIONS
Question 1
What is the result of the following? Name[-1]
Figure 90
o "i"
o "M"
o "o"
o "n"
Correct: An index of -1 refers to the last element of the sequence.
Question 2
What is the result of the following? print("AB\nC\nDE")
o AB\nC\nDE
o ABC DE
o AB CD E
o AB
  C
  DE
Correct: When the "print" function comes across the "\n" character, it displays a new line.
Question 3
What is the result of following?
"hello Mike".find("Mike")
If you are unsure, copy and paste the code into Jupyter Notebook and check.
o 2
o 6
o 5
o 6,7,8
Correct: The find method returns the index in the string at which the first occurrence of the
specified substring starts.
GRADED QUIZ: STRING OPERATIONS

MODULE 1 SUMMARY: PYTHON BASICS
Congratulations! You have completed this module. At this point, you know that:
• Python can distinguish among data types such as integers, floats, strings, and Booleans.
• Integers are whole numbers that can be positive or negative.
• Floats include integers as well as decimal numbers between the integers.
• You can convert integers to floats using typecasting. You can also convert a float to
an integer, but the fractional part is discarded.
• You can convert integers and floats to strings.
• You can convert an integer or float value to True (1) or False (0).
• Expressions in Python are a combination of values and operations used to produce a
single result.
• Expressions perform mathematical operations such as addition, subtraction,
multiplication, and so on.
• We use "//" for integer division, which rounds the result down to the nearest integer,
whereas regular division with "/" always results in a float value.
• Python follows the order of operations (BODMAS) to perform operations with
multiple expressions.
• Variables store and manipulate data, allowing you to access and modify values
throughout your code.
• The assignment operator "=" assigns a value to a variable.
• In this course, ":" is used to denote the value of a variable when describing the code.
• Assigning another value to the same variable overrides the previous value of that
variable.
• You can perform mathematical operations on variables using the same or different
variables.
• While performing operations with various variables, modifying the value of one variable
changes the results that depend on it once the code is run again.
• Python string operations involve manipulating text data using tasks such as indexing,
concatenation, slicing, and formatting.
• A string is usually written within double quotes or single quotes, including letters, white
space, digits, or special characters.
• A string can be assigned to a variable and is an ordered sequence of characters.
• Characters in a string are identified by their index numbers, which can be positive or negative.
• We use strings as a sequence to perform sequence operations.
• You can input a stride value to perform slicing while operating on a string.
• Operations such as concatenating and replicating strings result in a new string, while
finding the length of a string returns an integer.
• You cannot modify an existing string; they are immutable.
• You can use escape sequences, which begin with a backslash ("\"), to change the layout of the string.
• In Python, you perform tasks such as searching, modifying, and formatting text data with
its built-in string methods.
• Applying a method to a string does not change the original string; it results in another string.
• You can perform actions such as changing the case of characters in a string, replacing
items in a string, finding items in a string, and so on using built-in string methods.
MODULE 1 CHEAT SHEET: PYTHON BASICS
Package/Method, Description, and Code Example

Comments
  Description: Comments are lines of text that are ignored by the Python interpreter when
  executing the code.
  Example:
    # This is a comment

Concatenation
  Description: Combines (concatenates) strings.
  Syntax:
    concatenated_string = string1 + string2
  Example:
    result = "Hello" + " John"

Data Types
  Description: Integer, Float, Boolean, String
  Example:
    x=7 # Integer Value
    y=12.4 # Float Value
    is_valid = True # Boolean Value
    is_valid = False # Boolean Value
    F_Name = "John" # String Value

Indexing
  Description: Accesses character at a specific index.
  Example:
    my_string="Hello"
    char = my_string[0]

len()
  Description: Returns the length of a string.
  Syntax:
    len(string_name)
  Example:
    length = len(my_string)

lower()
  Description: Converts string to lowercase.
  Example:
    lowercase_text = my_string.lower()

print()
  Description: Prints the message or variable inside `()`.
  Example:
    print("Hello, world")
    print(a+b)

Python Operators
  Description: Addition (+): Adds two values together.
  Example:
    x = 9
    y = 4
    result_add= x + y # Addition
    result_sub= x - y # Subtraction
    result_mul= x * y # Multiplication
    result_div= x / y # Division
    result_fdiv= x // y # Floor Division
    result_mod= x % y # Modulo

replace()
  Description: Replaces substrings.
  Example:
    new_text = my_string.replace("Hello", "Hi")

Slicing
  Description: Extracts a portion of the string.
  Syntax:
    substring = string_name[start:end]
  Example:
    my_string="Hello"
    substring = my_string[0:5]

split()
  Description: Splits string into a list based on a delimiter.
  Example:
    split_text = my_string.split(",")

strip()
  Description: Removes leading/trailing whitespace.
  Example:
    trimmed = my_string.strip()

upper()
  Description: Converts string to uppercase.
  Example:
    uppercase_text = my_string.upper()

Variable Assignment
  Description: Assigns a value to a variable.
  Syntax:
    variable_name = value
  Example:
    name="John" # assigning John to variable name
    x = 5 # assigning 5 to variable x
GRADED QUIZ: PYTHON BASICS

GLOSSARY – PYTHON BASICS
Welcome! This alphabetized glossary contains many of the terms you'll find within this course.
This comprehensive glossary also includes additional industry-recognized terms not used in
course videos. These terms are important for you to recognize when working in the industry,
participating in user groups, and participating in other certificate programs.
Term Definition
AI AI (artificial intelligence) is the ability of a digital computer or
computer-controlled robot to perform tasks commonly associated with
intelligent beings.
Application development Application development, or app development, is the process of
planning, designing, creating, testing, and deploying a software
application to perform various business operations.
Arithmetic Operations Arithmetic operations are the basic calculations we make in everyday
life like addition, subtraction, multiplication and division. They are also called
algebraic operations or mathematical operations.
Array of numbers Set of numbers or objects that follow a pattern presented as an
arrangement of rows and columns to explain multiplication.
Assignment operator in Python Assignment operator is a type of Binary operator that helps in
modifying the variable to its left with the use of its value to the right.
The symbol used for assignment operator is "=".
Asterisk Symbol "* " used to perform various operations in Python.
Backslash A backslash is an escape character used in Python strings to indicate
that the character immediately following it should be treated in a
special way, such as being treated as escaped character or raw string.
Boolean Denoting a system of algebraic notation used to represent logical
propositions by means of the binary digits 0 (false) and 1 (true).
Colon A colon is used to represent an indented block. It is also used to fetch
data and index ranges or arrays.
Concatenate Link (things) together in a chain or series.
Data engineering Data engineers are responsible for turning raw data into information
that an organization can understand and use. Their work involves
blending, testing, and optimizing data from numerous sources.
Data science Data Science is an interdisciplinary field that focuses on extracting

knowledge from data sets which are typically huge in amount. The field
encompasses analysis, preparing data for analysis, and presenting
findings to inform high-level decisions in an organization.
Data type Data type refers to the type of value a variable has and what type of
mathematical, relational or logical operations can be applied without
causing an error.
Double quote Symbol “ “ used to represent strings in Python.

Escape sequence An escape sequence is two or more characters that often begin with an
escape character that tell the computer to perform a function or
command.
Expression An expression is a combination of operators and operands that is
interpreted to produce some other value.
Float The Python float() function returns a floating-point number from
a number or a string representation of a numeric value.
Forward slash Symbol "/" used in Python for division.
Foundational Denoting an underlying basis or principle; fundamental.
Immutable Immutable objects include built-in data types such as int, float, bool,
str, and tuple. In simple words, an immutable object can't be
changed after it is created.
Integer An integer is the number zero (0), a positive natural number (1, 2, 3,
and so on) or a negative integer with a minus sign (−1, −2, −3, and so
on.)
Manipulate The process of modifying a string or creating a new string by making
changes to existing strings.
Mathematical conventions A mathematical convention is a fact, name, notation, or usage which is
generally agreed upon by mathematicians.
Mathematical expressions Expressions in math are mathematical statements that have a minimum
of two terms containing numbers or variables, or both, connected by
an operator in between.
Mathematical operations The mathematical “operation” refers to calculating a value using
operands and a math operator.
Negative indexing Allows you to access elements of a sequence (such as a list, a string, or
a tuple) from the end, using negative numbers as indexes.
Operands The quantity on which an operation is to be done.

Operators in Python Operators are used to perform operations on variables and values.
Parentheses Parentheses are used to call a function or method and to enclose tuples.

Replicate To make an exact copy of.
Sequence A sequence is formally defined as a function whose domain is an
interval of integers.
Single quote Symbol ' ' used to represent strings in Python.
Slicing in Python Slicing returns a portion of a sequence such as a list, tuple, or string.
Special characters A special character is one that is not considered a number or letter.
Symbols, accent marks, and punctuation marks are considered special
characters.
Stride value In slicing, the stride (or step) is the number of positions to advance
between selected elements; a stride of 2 selects every second element.
Strings In Python, strings are immutable sequences of Unicode characters.

Substring A substring is a sequence of characters that are part of an original
string.
Type casting The process of converting one data type to another data type is called
Typecasting or Type Coercion or Type Conversion.
Types in Python Data types are the classification or categorization of data items. A
data type represents the kind of value and determines which operations
can be performed on that data.
Variables Variables are containers for storing data values.

MODULE 2: PYTHON DATA STRUCTURES
INTRODUCTION
This module begins a journey into Python data structures by explaining the use of lists and tuples
and how they can store data collections in a single variable.
Next, learn about dictionaries and how they function by storing data in pairs of keys and values,
and end with Python sets to learn how this type of collection can appear in any order and will
only contain unique elements.
LEARNING OBJECTIVES
In this module, you will:
• Describe and manipulate tuple combinations and list data structures.
• Execute basic tuple operations in Python.
• Perform list operations in Python.
• Write structures with correct keys and values to demonstrate understanding of
dictionaries.
• Work with and perform operations on dictionaries in Python.
• Create sets to demonstrate understanding of the differences between sets, tuples, and
lists.
• Work with sets in Python, including operations and logic operations.

VIDEO 007: LISTS AND TUPLES (8:51)
In the video we will cover lists and tuples. These are called compound data types and are one of
the key types of data structures in Python.
Figure 91
TUPLES
Figure 92
Tuples are an ordered sequence. Here is a tuple ratings. Tuples are expressed as comma
separated elements within parentheses. These are values inside the parentheses.
Figure 93
In Python, there are different types:

• strings,
• integers,
• floats.

They can all be contained in a tuple, but the type of the variable is tuple. Each element of a tuple
can be accessed via an index.
Figure 94
The following table represents the relationship between the index and the elements in the tuple.
The first element can be accessed by the name of the tuple followed by a square bracket with the
index number, in this case zero. We can access the second element as follows. We can also access
the last element.
Figure 95
In Python, we can also use a negative index, which counts back from the end of the tuple. The
relationship is as follows. The corresponding values are shown here.
Figure 96
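The indexing described above can be sketched as follows. The values in ratings are illustrative assumptions, since the tuple shown in the video figures is not reproduced in this text.

```python
# Hypothetical five-element tuple; the values in the video figures may differ
ratings = (10, 9, 6, 5, 8)

first = ratings[0]         # index 0 is the first element
second = ratings[1]        # index 1 is the second element
last = ratings[4]          # index 4 is the last of five elements

# Negative indexes count back from the end of the tuple
last_again = ratings[-1]   # same element as ratings[4]
first_again = ratings[-5]  # same element as ratings[0]

print(first, second, last)
print(last_again, first_again)
print(type(ratings))       # the variable's type is tuple, whatever it contains
```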

CONCATENATING
We can concatenate or combine tuples by adding them. The result is the following with the
following index.
Figure 97
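As a minimal sketch of concatenation, the following uses the same values as the tuples lab later in this module. The + operator builds a new tuple and leaves both operands unchanged.

```python
tuple1 = ("disco", 10, 1.2)

# Adding two tuples produces a new, longer tuple
tuple2 = tuple1 + ("hard rock", 10)

print(tuple2)      # ('disco', 10, 1.2, 'hard rock', 10)
print(tuple2[3])   # the first element of the second tuple now sits at index 3
print(tuple1)      # the original tuple is unchanged
```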
SLICING
If we would like multiple elements from a tuple, we could also slice tuples. For example, if we
want the first three elements, we use the following command. The last index is one larger than
the index you want.
Figure 98
Similarly, if we want the last two elements, we use the following command. Notice how the last
index is one larger than the last index of the tuple.
Figure 99
In the video at 1:32 the line says: "Notice how the last index is one larger than the length of the tuple." The line should be read
as "Notice how the last index is one larger than the last index of the tuple."
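The slicing rule can be checked with a short sketch. The tuple values are taken from the tuples lab; the point is that the stop index of a slice is excluded.

```python
tuple2 = ("disco", 10, 1.2, "hard rock", 10)

# First three elements: indexes 0, 1 and 2 (the stop index 3 is excluded)
first_three = tuple2[0:3]

# Last two elements: indexes 3 and 4 (the stop index 5 is one past the last index)
last_two = tuple2[3:5]

print(first_three)   # ('disco', 10, 1.2)
print(last_two)      # ('hard rock', 10)
```

Slicing a tuple returns a new tuple; the original is left as it was.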

LEN
We can use the len function to obtain the length of a tuple. As there are five elements,
the result is 5.
Figure 100
Tuples are immutable, which means we can't change them. To see why this is important, let's see
what happens when we set the variable ratings1 to ratings. Let's use the image to provide a
simplified explanation of what's going on.
Neither variable contains the tuple itself; both reference the same immutable tuple object.
Figure 101
See the objects and classes module for more about objects.

Let's say we want to change the element at index 2. Because tuples are immutable, we can't.
Therefore, ratings1 cannot be affected by a change made through ratings.
Figure 102
We can assign a different tuple to the ratings variable.
Figure 103
The variable ratings now references another tuple.
Figure 104
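The behaviour described above can be sketched as follows. The tuple values are illustrative assumptions; the variable names ratings and ratings1 follow the text.

```python
ratings = (10, 9, 6, 5, 8)   # hypothetical values
ratings1 = ratings           # both names now reference the same tuple object
print(ratings1 is ratings)   # True

try:
    ratings[2] = 4           # item assignment is not allowed on a tuple
except TypeError as error:
    print("TypeError:", error)

# Assigning a different tuple rebinds the name; it does not modify the old tuple
ratings = (2, 10, 1)
print(ratings)               # (2, 10, 1)
print(ratings1)              # (10, 9, 6, 5, 8)
```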

SORTED
As a consequence of immutability, if we would like to manipulate a tuple, we must create a new
tuple instead. For example, if we would like to sort a tuple we use the function sorted. The input is
the original tuple, the output is a new sorted list. For more on functions, see our video on functions.
Figure 105
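A minimal sketch of sorting, using the Ratings values from the tuples lab. Note that sorted returns a list; converting back with tuple() is an optional extra step added here for illustration.

```python
ratings = (0, 9, 6, 5, 10, 8, 9, 6, 2)

# sorted() leaves the tuple untouched and returns a new list
ratings_sorted = sorted(ratings)
print(ratings_sorted)          # [0, 2, 5, 6, 6, 8, 9, 9, 10]
print(type(ratings_sorted))    # <class 'list'>

# If a tuple is needed, convert the sorted list back
ratings_sorted_tuple = tuple(ratings_sorted)
print(ratings_sorted_tuple)
```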
NESTING
A tuple can contain other tuples as well as other complex data types. This is called nesting. We
can access these elements using the standard indexing methods. If the element at an index is
itself a tuple, the same index convention applies within it. As such, we can then access values in
that inner tuple. For example, we could access its second element. We can apply this indexing
directly to the tuple variable NT. It is helpful to visualize this as a tree.
Figure 106
We can visualize this nesting as a tree. The tuple has the following indexes. If we consider indexes
with other tuples, we see the tuple at index 2 contains a tuple with two elements. We can access
those two indexes. The same convention applies to index 3. We can access the elements in those
tuples as well. We can continue the process.
Figure 107

We can even access deeper levels of the tree by adding another square bracket. We can access
different characters in the string or various elements in the second tuple contained in the first.
Figure 108
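The nested indexing can be sketched with the tuple used in the tuples lab (named NestedT there). Each extra pair of square brackets moves one level deeper into the tree.

```python
NT = (1, 2, ("pop", "rock"), (3, 4), ("disco", (1, 2)))

print(NT[2])         # ('pop', 'rock')   element at index 2 is a tuple
print(NT[2][1])      # rock              second element of that tuple
print(NT[2][1][0])   # r                 first character of the string
print(NT[4][1])      # (1, 2)            a tuple nested two levels deep
print(NT[4][1][0])   # 1                 first element of that inner tuple
```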
LISTS
Figure 109
Lists are also a popular data structure in Python. Lists are also an ordered sequence.
Here is a list, "L." A list is represented with square brackets. In many aspects, lists are like tuples.
One key difference is they are mutable.
Figure 110

Lists can contain strings, floats, and integers. We can nest other lists. We can also nest tuples
and other data structures. The same indexing conventions apply for nesting.
Figure 111
Like tuples, each element of a list can be accessed via an index. The following table represents
the relationship between the index and the elements in the list. The first element can be accessed
by the name of the list followed by a square bracket with the index number, in this case zero. We
can access the second element as follows. We can also access the last element. In Python, we
can use a negative index. The relationship is as follows. The corresponding indexes are as follows.
Figure 112
We can also perform slicing in lists. For example, if we want the last two elements in this list
we use the following command. Notice how the last index is one larger than the last index of the
list. The index conventions for lists and tuples are identical. Check the labs for more examples.
Figure 113
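List indexing, nesting and slicing can be sketched as follows, using the sample lists from the lists lab.

```python
L = ["Michael Jackson", 10.1, 1982, "MJ", 1]

print(L[0])      # Michael Jackson   (first element)
print(L[-5])     # Michael Jackson   (same element, negative index)
print(L[-1])     # 1                 (last element)

# Last two elements: the stop index 5 is one past the last index, 4
print(L[3:5])    # ['MJ', 1]

# Nested structures are indexed one level at a time
nested = ["Michael Jackson", 10.1, 1982, [1, 2], ("A", 1)]
print(nested[3][0])   # 1
print(nested[4][0])   # A
```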

We can concatenate or combine lists by adding them. The result is the following. The new list
has the following indices.
Figure 114
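A short sketch of list concatenation. The list contents and the name L1 are assumptions chosen for illustration; the + operator creates a new list and leaves the original unchanged.

```python
L = ["Michael Jackson", 10.1, 1982]   # hypothetical contents

# Concatenation builds a new list; L itself is not modified
L1 = L + ["pop", 10]

print(L1)   # ['Michael Jackson', 10.1, 1982, 'pop', 10]
print(L)    # ['Michael Jackson', 10.1, 1982]
```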
EXTEND
Lists are mutable, therefore we can change them. For example, we apply the method extend by
adding a dot followed by the name of the method then parentheses. The argument inside the
parentheses is a new list that we are going to concatenate to the original list. In this case, instead
of creating a new list, "L1," the original list, "L," is modified by adding two new elements.
To learn more about methods check out our video on objects and classes.
Figure 115
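A minimal sketch of extend. The three-element starting list is an assumption made for illustration, since the list in the video figure is not reproduced here.

```python
L = ["Michael Jackson", 10.1, 1982]   # hypothetical starting list

# extend modifies L in place, adding each element of the argument
result = L.extend(["pop", 10])

print(L)        # ['Michael Jackson', 10.1, 1982, 'pop', 10]
print(len(L))   # 5: two new elements were added
print(result)   # None: extend changes the list and returns nothing
```

Because extend returns None, writing L = L.extend(...) would lose the list; the method is called for its effect on L.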
APPEND
Another similar method is append. If we apply append instead of extend, we add one element to
the list. If we look at the index there is only one more element. Index 3 contains the list we
appended.
Figure 116
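For comparison, here is the same hypothetical starting list with append instead of extend. The list contents are an assumption, chosen so that the appended list lands at index 3 as the text describes.

```python
L = ["Michael Jackson", 10.1, 1982]   # hypothetical starting list

# append adds its argument as a single element, even when it is a list
L.append(["pop", 10])

print(L)        # ['Michael Jackson', 10.1, 1982, ['pop', 10]]
print(len(L))   # 4: only one new element
print(L[3])     # ['pop', 10]: the appended list is nested at index 3
```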

Every time we apply a method, the list changes. If we apply extend, we add two new elements
to the list. The list L is modified by adding two new elements. If we append the string A, we
further change the list, adding the string A.
Figure 117
As lists are mutable, we can change them. For example, we can change the first element as
follows. The list now becomes ["hard rock", 10, 1.2].
Figure 118
DEL
We can delete an element of a list using the del statement. We simply indicate the list item we
would like to remove. For example, if we remove the first element, the result becomes
[10, 1.2]. We can then delete the second element. This operation removes the second
element from the list.
Figure 119
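Changing and deleting elements can be sketched as follows, starting from the list used in the lists lab.

```python
A = ["disco", 10, 1.2]

# Lists are mutable, so an element can be replaced by index
A[0] = "hard rock"
print(A)    # ['hard rock', 10, 1.2]

# del removes the element at the given index; later elements shift left
del A[0]
print(A)    # [10, 1.2]

del A[1]    # remove what is now the second element
print(A)    # [10]
```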

SPLIT
We can convert a string to a list using split. For example, the method split converts every group
of characters separated by a space into an element of a list.
Figure 120
We can also use the split method to separate a string on a specific character, known as a
delimiter. We simply pass the delimiter we would like to split on as an argument, in this case a
comma. The result is a list. Each element corresponds to a set of characters that have been
separated by a comma.
Figure 121
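Both uses of split can be sketched with the strings from the lists lab.

```python
# With no argument, split separates on whitespace
words = "hard rock".split()
print(words)      # ['hard', 'rock']

# With a delimiter argument, split separates on that character
letters = "A,B,C,D".split(",")
print(letters)    # ['A', 'B', 'C', 'D']
```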
ALIASING
When we set one variable B equal to A, both A and B are referencing the same list. Multiple
names referring to the same object is known as aliasing.
Figure 122

We know from the list slide that the first element in B is set as hard rock. If we change the first
element in A to banana, we get a side effect, the value of B will change as a consequence.
Figure 123
A and B are referencing the same list, therefore if we change A, list B also changes. If we check
the first element of B after changing list A, we get banana instead of hard rock.
Figure 124
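The aliasing side effect can be reproduced with a short sketch. The is operator is used here to confirm that both names refer to one list object.

```python
A = ["hard rock", 10, 1.2]
B = A               # B is another name for the same list, not a copy

print(B is A)       # True
print(B[0])         # hard rock

A[0] = "banana"     # change the list through A
print(B[0])         # banana: the change is visible through B as well
```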
CLONE
You can clone list A by using the following syntax. Variable A references one list. Variable B
references a new copy, or clone, of the original list.
Figure 125

Now if you change A, B will not change:
Figure 126
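A minimal sketch of cloning with a full slice, A[:]. The copy method shown alongside is an equivalent alternative added for illustration. Both make a shallow copy: the new list is a separate object, but any nested lists inside it would still be shared.

```python
A = ["hard rock", 10, 1.2]

B = A[:]        # a full slice creates a new list with the same elements
C = A.copy()    # equivalent alternative

print(B == A)   # True: same contents
print(B is A)   # False: different list objects

A[0] = "banana" # change the original
print(A[0])     # banana
print(B[0])     # hard rock: the clone is unaffected
print(C[0])     # hard rock
```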
THE HELP COMMAND

We can get more information on lists, tuples, and many other objects in Python using the help
command. Simply pass in the list, tuple, or any other Python object. See the labs for more things
you can do with lists.
Figure 127

HANDS-ON LAB: LISTS
LISTS IN PYTHON
OBJECTIVES
• Perform list operations in Python, including indexing, list manipulation, and copy/clone
list.
TABLE OF CONTENTS
• About the Dataset
• Lists
o Indexing
o List Content
o List Operations
o Copy and Clone List
• Quiz on Lists
About the Dataset
Imagine you received album recommendations from your friends and compiled all of the
recommendations into a table, with specific information about each album.
The table has one row for each album and several columns:
• artist - Name of the artist
• album - Name of the album
• released_year - Year the album was released
• length_min_sec - Length of the album (hours,minutes,seconds)
• genre - Genre of the album
• music_recording_sales_millions - Music recording sales (millions in USD) on
SONG://DATABASE
• claimed_sales_millions - Album's claimed sales (millions in USD) on
SONG://DATABASE
• date_released - Date on which the album was released
• soundtrack - Indicates if the album is the movie soundtrack (Y) or (N)
• rating_of_friends - Indicates the rating from your friends from 1 to 10

The dataset can be seen below:
Artist | Album | Released | Length | Genre | Music Recording Sales (Millions) | Claimed Sales (Millions) | Released | Soundtrack | Rating (Friends)
Michael Jackson | Thriller | 1982 | 00:42:19 | Pop, rock, R&B | 46 | 65 | 30/nov/82 | | 10.0
AC/DC | Back in Black | 1980 | 00:42:11 | Hard rock | 26.1 | 50 | 25/jul/80 | | 8.5
Pink Floyd | The Dark Side of the Moon | 1973 | 00:42:49 | Progressive rock | 24.2 | 45 | 01/mar/73 | | 9.5
Whitney Houston | The Bodyguard | 1992 | 00:57:44 | Soundtrack/R&B, soul, pop | 26.1 | 50 | 25/jul/80 | Y | 7.0
Meat Loaf | Bat Out of Hell | 1977 | 00:46:33 | Hard rock, progressive rock | 20.6 | 43 | 21-Oct-77 | | 7.0
Eagles | Their Greatest Hits (1971-1975) | 1976 | 00:43:08 | Rock, soft rock, folk rock | 32.2 | 42 | 17-Feb-76 | | 9.5
Bee Gees | Saturday Night Fever | 1977 | 01:15:54 | Disco | 20.6 | 40 | 15/nov/77 | Y | 9.0
Fleetwood Mac | Rumours | 1977 | 00:40:01 | Soft rock | 27.9 | 40 | 04-Feb-77 | | 9.5
Lists
Indexing
We are going to take a look at lists in Python. A list is a sequenced collection of different objects
such as integers, strings, and even other lists as well. The address of each element within a list
is called an index. An index is used to access and refer to items within a list.
Figure 128

To create a list, type the elements within square brackets [ ], separated by commas. Let’s try it!
# Create a list
L = ["Michael Jackson", 10.1, 1982]
L
['Michael Jackson', 10.1, 1982]
We can use negative and regular indexing with a list:
Figure 129
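# Print the elements on each index
print('the same element using negative and positive indexing:\n Positive:', L[0],
'\n Negative:', L[-3])
print('the same element using negative and positive indexing:\n Positive:', L[1],
'\n Negative:', L[-2])
print('the same element using negative and positive indexing:\n Positive:', L[2],
'\n Negative:', L[-1])
the same element using negative and positive indexing:
 Positive: Michael Jackson
 Negative: Michael Jackson
the same element using negative and positive indexing:
 Positive: 10.1
 Negative: 10.1
the same element using negative and positive indexing:
 Positive: 1982
 Negative: 1982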
List Content
Lists can contain strings, floats, and integers. We can nest other lists, and we can also nest
tuples and other data structures. The same indexing conventions apply for nesting:
# Sample List
["Michael Jackson", 10.1, 1982, [1, 2], ("A", 1)]
['Michael Jackson', 10.1, 1982, [1, 2], ('A', 1)]

List Operations
slicing
We can also perform slicing in lists. For example, if we want the last two elements, we use the
following command:
# Sample List
L = ["Michael Jackson", 10.1, 1982, "MJ", 1]
L
['Michael Jackson', 10.1, 1982, 'MJ', 1]
Figure 130
# List slicing
L[3:5]
['MJ', 1]
extend
We can use the method extend to add new elements to the list:
# Use extend to add elements to list
L = ["Michael Jackson", 10.2]
L.extend(['pop', 10])
L
['Michael Jackson', 10.2, 'pop', 10]
append
Another similar method is append. If we apply append instead of extend, we add one element
to the list:
# Use append to add elements to list
L = ["Michael Jackson", 10.2]
L.append(['pop', 10])
L
['Michael Jackson', 10.2, ['pop', 10]]
Each time we apply a method, the list changes. If we apply extend we add two new elements to
the list. The list L is then modified by adding two new elements:
# Use extend to add elements to list
L = ["Michael Jackson", 10.2]
L.extend(['pop', 10])
L
['Michael Jackson', 10.2, 'pop', 10]
If we append the list ['a','b'] we have one new element consisting of a nested list:
# Use append to add elements to list
L.append(['a','b'])
L
['Michael Jackson', 10.2, 'pop', 10, ['a', 'b']]

As lists are mutable, we can change them. For example, we can change the first element
as follows:
# Change the element based on the index
A = ["disco", 10, 1.2]
print('Before change:', A)
A[0] = 'hard rock'
print('After change:', A)
Before change: ['disco', 10, 1.2]
After change: ['hard rock', 10, 1.2]
We can also delete an element of a list using the del command:
# Delete the element based on the index
print('Before change:', A)
del(A[0])
print('After change:', A)
Before change: ['hard rock', 10, 1.2]
After change: [10, 1.2]
split
We can convert a string to a list using split. For example, the method split translates every group
of characters separated by a space into an element in a list:
# Split the string, default is by space

'hard rock'.split()
['hard', 'rock']
We can use the split method to separate a string on a specific character, which we call a
delimiter. We pass the character we would like to split on as the argument, which in this case is
a comma. The result is a list, and each element corresponds to a set of characters that have been
separated by a comma:
# Split the string by comma

'A,B,C,D'.split(',')
['A', 'B', 'C', 'D']
Copy and Clone List

When we set one variable B equal to A, both A and B are referencing the same list in memory:
# Copy (copy by reference) the list A

A = ["hard rock", 10, 1.2]
B = A
print('A:', A)
print('B:', B)
A: ['hard rock', 10, 1.2]
B: ['hard rock', 10, 1.2]
Figure 131

Initially, the value of the first element in B is set as "hard rock". If we change the first element in
A to "banana", we get an unexpected side effect. As A and B are referencing the same list, if
we change list A, then list B also changes. If we check the first element of B we get "banana"
instead of "hard rock":
# Examine the copy by reference
print('B[0]:', B[0])
A[0] = "banana"
print('B[0]:', B[0])
B[0]: hard rock
B[0]: banana
This is demonstrated in the following figure:
Figure 132
You can clone list A by using the following syntax:
# Clone (clone by value) the list A
B = A[:]
B
['banana', 10, 1.2]
Variable B references a new copy or clone of the original list. This is demonstrated in the
following figure:
Figure 133
Now if you change A, B will not change:
print('B[0]:', B[0])
A[0] = "hard rock"
print('B[0]:', B[0])
B[0]: banana
B[0]: banana

QUIZ ON LISTS
Create a list a_list, with the following elements 1, hello, [1,2,3] and True.

a_list = [1, 'hello', [1, 2, 3] , True]
a_list
[1, 'hello', [1, 2, 3], True]
Find the value stored at index 1 of a_list.

a_list[1]
'hello'
Retrieve the elements stored at index 1, 2 and 3 of a_list.

a_list[1:4]
['hello', [1, 2, 3], True]
Concatenate the following lists A = [1, 'a'] and B = [2, 1, 'd']:

A = [1, 'a']
B = [2, 1, 'd']
A + B
[1, 'a', 2, 1, 'd']
The last exercise!


HANDS-ON LAB: TUPLES
TUPLES IN PYTHON

OBJECTIVES
• Perform basic tuple operations in Python, including indexing, slicing and sorting
TABLE OF CONTENTS
• Tuples
o Indexing
o Slicing
o Sorting
• Quiz on Tuples
TUPLES
In Python, there are different data types: string, integer, and float. These data types can all be
contained in a tuple as follows:
Figure 134
Now, let us create your first tuple with string, integer and float.
# Create your first tuple

tuple1 = ("disco",10,1.2 )
tuple1
('disco', 10, 1.2)

The type of the variable is tuple.
# Print the type of the tuple you created
type(tuple1)
tuple
Indexing
Each element of a tuple can be accessed via an index. The following table represents the
relationship between the index and the items in the tuple. Each element can be obtained by the
name of the tuple followed by a square bracket with the index number:
Figure 135
We can print out each value in the tuple:
# Print the variable on each index

print(tuple1[0])
print(tuple1[1])
print(tuple1[2])
disco
10
1.2
We can print out the type of each value in the tuple:
# Print the type of value on each index
print(type(tuple1[0]))
print(type(tuple1[1]))
print(type(tuple1[2]))
<class 'str'>
<class 'int'>
<class 'float'>
We can also use negative indexing. We use the same table above with corresponding negative
values:
Figure 136

We can obtain the last element as follows (this time we will not use the print statement to
display the values):
# Use negative index to get the value of the last element

tuple1[-1]
1.2
We can display the next two elements as follows:
# Use negative index to get the value of the second last element
tuple1[-2]
10
# Use negative index to get the value of the third last element
tuple1[-3]
'disco'
Concatenate Tuples
We can concatenate or combine tuples by using the + sign:
# Concatenate two tuples
tuple2 = tuple1 + ("hard rock", 10)

tuple2
('disco', 10, 1.2, 'hard rock', 10)
We can slice tuples obtaining multiple values as demonstrated by the figure below:
Figure 137
Slicing
We can slice tuples, obtaining new tuples with the corresponding elements:
# Slice from index 0 to index 2

tuple2[0:3]
('disco', 10, 1.2)
We can obtain the last two elements of the tuple:
# Slice from index 3 to index 4

tuple2[3:5]
('hard rock', 10)

We can obtain the length of a tuple using the len function:
# Get the length of tuple

len(tuple2)
5
This figure shows the number of elements:
Figure 138
Sorting
Consider the following tuple:
# A sample tuple
Ratings = (0, 9, 6, 5, 10, 8, 9, 6, 2)
We can sort the values in a tuple with the sorted function, which returns the result as a new list:
# Sort the tuple

RatingsSorted = sorted(Ratings)
RatingsSorted
[0, 2, 5, 6, 6, 8, 9, 9, 10]
Nested Tuple
A tuple can contain another tuple as well as other more complex data types. This process is
called 'nesting'. Consider the following tuple with several elements:
# Create a nested tuple

NestedT =(1, 2, ("pop", "rock") ,(3,4),("disco",(1,2)))
Each element in the tuple, including other tuples, can be obtained via an index as shown in the
figure:
Figure 139
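# Print element on each index
print("Element 0 of Tuple: ", NestedT[0])
print("Element 1 of Tuple: ", NestedT[1])
print("Element 2 of Tuple: ", NestedT[2])
print("Element 3 of Tuple: ", NestedT[3])
print("Element 4 of Tuple: ", NestedT[4])
Element 0 of Tuple: 1
Element 1 of Tuple: 2
Element 2 of Tuple: ('pop', 'rock')
Element 3 of Tuple: (3, 4)
Element 4 of Tuple: ('disco', (1, 2))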

We can use the second index to access other tuples as demonstrated in the figure:
Figure 140
We can access the nested tuples:
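# Print element on each index, including nested indexes
print("Element 2, 0 of Tuple: ", NestedT[2][0])
print("Element 2, 1 of Tuple: ", NestedT[2][1])
print("Element 3, 0 of Tuple: ", NestedT[3][0])
print("Element 3, 1 of Tuple: ", NestedT[3][1])
print("Element 4, 0 of Tuple: ", NestedT[4][0])
print("Element 4, 1 of Tuple: ", NestedT[4][1])
Element 2, 0 of Tuple: pop
Element 2, 1 of Tuple: rock
Element 3, 0 of Tuple: 3
Element 3, 1 of Tuple: 4
Element 4, 0 of Tuple: disco
Element 4, 1 of Tuple: (1, 2)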
We can access individual characters of a string inside a nested tuple using a third index:
# Print the first character of the string "rock" in the nested tuple at index 2
NestedT[2][1][0]
'r'
# Print the second character of the string "rock" in the nested tuple at index 2
NestedT[2][1][1]
'o'
Figure 141
Similarly, we can access elements nested deeper in the tree with a third index:
# Print the first element of the tuple (1, 2) nested inside the tuple at index 4
NestedT[4][1][0]
1
# Print the second element of the tuple (1, 2) nested inside the tuple at index 4
NestedT[4][1][1]
2

The following figure shows the relationship of the tree and the element NestedT[4][1][1]:
Figure 142
QUIZ ON TUPLES
Consider the following tuple:
# sample tuple
genres_tuple = ("pop", "rock", "soul", "hard rock", "soft rock", \
"R&B", "progressive rock", "disco")
genres_tuple
('pop',
'rock',
'soul',
'hard rock',
'soft rock',
'R&B',
'progressive rock',
'disco')
Find the length of the tuple, genres_tuple:

len(genres_tuple)
8
Figure 143
Access the element, with respect to index 3:

genres_tuple[3]
'hard rock'
Use slicing to obtain indexes 3, 4 and 5:

genres_tuple[3:6]
('hard rock', 'soft rock', 'R&B')

Find the first two elements of the tuple genres_tuple:

genres_tuple[0:2]
('pop', 'rock')
Find the first index of "disco":

genres_tuple.index("disco")
7
Generate a sorted List from the Tuple C_tuple=(-5, 1, -3):
C_tuple = (-5, 1, -3)

C_list = sorted(C_tuple)
C_list
[-5, -3, 1]

CHEAT SHEET: LISTS AND TUPLES
PYTHON DATA STRUCTURES CHEAT SHEET
List

append()
The `append()` method is used to add an element to the end of a list.
Syntax:
list_name.append(element)
Example:
fruits = ["apple", "banana", "orange"]
fruits.append("mango")
print(fruits)

copy()
The `copy()` method is used to create a shallow copy of a list.
Example 1:
my_list = [1, 2, 3, 4, 5]
new_list = my_list.copy()
print(new_list)
# Output: [1, 2, 3, 4, 5]

count()
The `count()` method is used to count the number of occurrences of a specific element in a list in Python.
Example:
my_list = [1, 2, 2, 3, 4, 2, 5, 2]
count = my_list.count(2)
print(count)
# Output: 4

Creating a list
A list is a built-in data type that represents an ordered and mutable collection of elements. Lists are enclosed in square brackets [] and elements are separated by commas.
Example:
fruits = ["apple", "banana", "orange", "mango"]

del
The `del` statement is used to remove an element from a list. It removes the element at the specified index.
Example:
my_list = [10, 20, 30, 40, 50]
del my_list[2] # Removes the element at index 2
print(my_list)
# Output: [10, 20, 40, 50]

extend()
The `extend()` method is used to add multiple elements to a list. It takes an iterable (such as another list, tuple, or string) and appends each element of the iterable to the original list.
Syntax:
list_name.extend(iterable)
Example:
fruits = ["apple", "banana", "orange"]
more_fruits = ["mango", "grape"]
fruits.extend(more_fruits)
print(fruits)

Indexing
Indexing in a list allows you to access individual elements by their position. In Python, indexing starts from 0 for the first element and goes up to `length_of_list - 1`.
Example:
my_list = [10, 20, 30, 40, 50]
print(my_list[0])
# Output: 10 (accessing the first element)
print(my_list[-1])
# Output: 50 (accessing the last element using negative indexing)

insert()
The `insert()` method is used to insert an element at a given index.
Syntax:
list_name.insert(index, element)
Example:
my_list = [1, 2, 3, 4, 5]
my_list.insert(2, 6)
print(my_list)

Modifying a list
You can use indexing to modify or assign new values to specific elements in the list.
Example:
my_list = [10, 20, 30, 40, 50]
my_list[1] = 25 # Modifying the second element
print(my_list)
# Output: [10, 25, 30, 40, 50]

pop()
The `pop()` method is another way to remove an element from a list in Python. It removes and returns the element at the specified index. If you don't provide an index to the `pop()` method, it will remove and return the last element of the list by default.
Example 1:
my_list = [10, 20, 30, 40, 50]
removed_element = my_list.pop(2) # Removes and returns the element at index 2
print(removed_element)
# Output: 30
print(my_list)
# Output: [10, 20, 40, 50]
Example 2:
my_list = [10, 20, 30, 40, 50]
removed_element = my_list.pop() # Removes and returns the last element
print(removed_element)
# Output: 50
print(my_list)
# Output: [10, 20, 30, 40]

remove()
The `remove()` method removes an element from a list. It removes the first occurrence of the specified value.
Example:
my_list = [10, 20, 30, 40, 50]
my_list.remove(30) # Removes the element 30
print(my_list)
# Output: [10, 20, 40, 50]

reverse()
The `reverse()` method is used to reverse the order of elements in a list.
Example 1:
my_list = [1, 2, 3, 4, 5]
my_list.reverse()
print(my_list)
# Output: [5, 4, 3, 2, 1]

Slicing
You can use slicing to access a range of elements from a list.
Syntax:
list_name[start:end:step]
Example:
my_list = [1, 2, 3, 4, 5]
print(my_list[1:4])
# Output: [2, 3, 4] (elements from index 1 to 3)
print(my_list[:3])
# Output: [1, 2, 3] (elements from the beginning up to index 2)
print(my_list[2:])
# Output: [3, 4, 5] (elements from index 2 to the end)
print(my_list[::2])
# Output: [1, 3, 5] (every second element)

sort()
The `sort()` method is used to sort the elements of a list in ascending order. If you want to sort the list in descending order, you can pass the `reverse=True` argument to the `sort()` method.
Example 1:
my_list = [5, 2, 8, 1, 9]
my_list.sort()
print(my_list)
# Output: [1, 2, 5, 8, 9]
Example 2:
my_list = [5, 2, 8, 1, 9]
my_list.sort(reverse=True)
print(my_list)
# Output: [9, 8, 5, 2, 1]

Dictionary

Accessing Values
You can access the values in a dictionary using their corresponding `keys`.
Syntax:
Value = dict_name["key_name"]
Example:
name = person["name"]
age = person["age"]

Add or modify
Inserts a new key-value pair into the dictionary. If the key already exists, the value will be updated; otherwise, a new entry is created.
Syntax:
dict_name[key] = value
Example:
person["Country"] = "USA" # A new entry will be created.
person["city"] = "Chicago" # Update the existing value for the same key

clear()
The `clear()` method empties the dictionary, removing all key-value pairs within it. After this operation, the dictionary is still accessible and can be used further.
Syntax:
dict_name.clear()
Example:
grades.clear()

copy()
Creates a shallow copy of the dictionary. The new dictionary contains the same key-value pairs as the original, but they remain distinct objects in memory.
Syntax:
new_dict = dict_name.copy()
Example:
new_person = person.copy()
new_person = dict(person) # another way to create a copy of dictionary

Creating a Dictionary
A dictionary is a built-in data type that represents a collection of key-value pairs. Dictionaries are enclosed in curly braces `{}`.
Example:
dict_name = {} # Creates an empty dictionary
person = {"name": "John", "age": 30, "city": "New York"}

del
Removes the specified key-value pair from the dictionary. Raises a `KeyError` if the key does not exist.
Syntax:
del dict_name[key]
Example:
del person["Country"]

items()
Retrieves all key-value pairs as tuples and converts them into a list of tuples. Each tuple consists of a key and its corresponding value.
Syntax:
items_list = list(dict_name.items())
Example:
info = list(person.items())

key existence
You can check for the existence of a key in a dictionary using the `in` keyword.
Example:
if "name" in person:
    print("Name exists in the dictionary.")

keys()
Retrieves all keys from the dictionary and converts them into a list. Useful for iterating or processing keys using list methods.
Syntax:
keys_list = list(dict_name.keys())
Example:
person_keys = list(person.keys())

update()
The `update()` method merges the provided dictionary into the existing dictionary, adding or updating key-value pairs.
Syntax:
dict_name.update({key: value})
Example:
person.update({"Profession": "Doctor"})

values()
Extracts all values from the dictionary and converts them into a list. This list can be used for further processing or analysis.
Syntax:
values_list = list(dict_name.values())
Example:
person_values = list(person.values())

Sets

add()
Elements can be added to a set using the `add()` method. Duplicates are automatically removed,
as sets only store unique values.
Syntax:
set_name.add(element)
Example:
fruits.add("mango")

clear()
The `clear()` method removes all elements from the set, resulting in an empty set. It updates
the set in-place.
Syntax:
set_name.clear()
Example:
fruits.clear()

copy()
The `copy()` method creates a shallow copy of the set. Any modifications to the copy won't
affect the original set.
Syntax:
new_set = set_name.copy()
Example:
new_fruits = fruits.copy()

Defining Sets
A set is an unordered collection of unique elements. Sets are enclosed in curly braces `{}`.
They are useful for storing distinct values and performing set operations.
Example:
empty_set = set() # Creating an Empty Set
fruits = {"apple", "banana", "orange"}

discard()
Use the `discard()` method to remove a specific element from the set. Ignores if the element is
not found.
Syntax:
set_name.discard(element)
Example:
fruits.discard("apple")

issubset()
The `issubset()` method checks if the current set is a subset of another set. It returns True if
all elements of the current set are present in the other set, otherwise False.
Syntax:
is_subset = set1.issubset(set2)
Example:
is_subset = fruits.issubset(colors)

issuperset()
The `issuperset()` method checks if the current set is a superset of another set. It returns
True if all elements of the other set are present in the current set, otherwise False.
Syntax:
is_superset = set1.issuperset(set2)
Example:
is_superset = colors.issuperset(fruits)

pop()
The `pop()` method removes and returns an arbitrary element from the set. It raises a `KeyError`
if the set is empty. Use this method to remove elements when the order doesn't matter.
Syntax:
removed_element = set_name.pop()
Example:
removed_fruit = fruits.pop()

remove()
Use the `remove()` method to remove a specific element from the set. Raises a `KeyError` if the
element is not found.
Syntax:
set_name.remove(element)
Example:
fruits.remove("banana")

Set Operations
Perform various operations on sets: `union`, `intersection`, `difference`, `symmetric difference`.
Syntax:
union_set = set1.union(set2)
intersection_set = set1.intersection(set2)
difference_set = set1.difference(set2)
sym_diff_set = set1.symmetric_difference(set2)
Example:
combined = fruits.union(colors)
common = fruits.intersection(colors)
unique_to_fruits = fruits.difference(colors)
sym_diff = fruits.symmetric_difference(colors)

update()
The `update()` method adds elements from another iterable into the set. It maintains the
uniqueness of elements.
Syntax:
set_name.update(iterable)
Example:
fruits.update(["kiwi", "grape"])

PRACTICE QUIZ: LISTS AND TUPLES
Question 1
Consider the following tuple: say_what=('say',' what', 'you', 'will')
What is the result of the following? say_what[-1]
o 'will'
o say_what'
o 'you!'
o 'what!'
Correct: An index of −1 corresponds to the last item of a tuple, such as the string 'will'.
Question 2
Consider the following tuple A=(1,2,3,4,5). What is the outcome of
the following? A[1:4].
o (1, 2, 3, 4)
o (3, 4, 5)
o (2, 3, 4, 5)
o (2, 3, 4)
Correct: The indexes 1, 2, and 3 of the tuple correspond to these elements.
Question 3
Consider the following list B=[1,2,[3,'a'],[4,'b']]. What is the result of B[3][1]?
o 'a'
o 'b'
o 2
o [4,'b']
Correct: B[3] is the nested list [4,'b'], and index 1 of that list is 'b'.
Question 4
What is the outcome of the following operation? [1,2,3] + [1,1,1]
o TypeError
o [1, 2, 3, 1, 1, 1]
o [1, 2, 3; 1, 1, 1]
o [2,3,4]
Correct: The addition operator combines lists through concatenation.
Question 5
Given the list A = [1], what will be the length of A after calling A.append([2,3,4,5])?
o 10
o 6
o 2
o 5
Correct: Append only adds a single element to the list at a time.
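The difference between adding one element and adding several can be checked directly. The sketch below contrasts append() with extend(); the second list B is an illustrative name that does not appear in the quiz.

```python
# append() adds its argument as one element, even if that argument is a list.
A = [1]
A.append([2, 3, 4, 5])
append_length = len(A)      # A is now [1, [2, 3, 4, 5]]

# extend() adds each item of the iterable separately.
B = [1]
B.extend([2, 3, 4, 5])
extend_length = len(B)      # B is now [1, 2, 3, 4, 5]

print(append_length, extend_length)  # 2 5
```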

VIDEO 008: DICTIONARIES (2:25)
Let's cover Dictionaries in Python. Dictionaries are a type of collection in Python.
Figure 144
If you recall, a list has integer indexes. These are like addresses. A list also has elements. A
dictionary has keys and values. The key is analogous to the index. Keys are like addresses, but
they don't have to be integers. They are usually strings. The values are similar to the elements
in a list and contain information.
Figure 145
To create a dictionary, we use curly brackets. The keys are the first elements. They must be
immutable and unique. Each key is followed by a value separated by a colon. The values can be
immutable, mutable, and duplicates. Each key and value pair is separated by a comma.
Figure 146

Consider the following example of a dictionary. The album title is the key, and the value is the
release date.
Figure 147
We can use yellow to highlight the keys and leave the values in white.
Figure 148
It is helpful to use the table to visualize a dictionary where the first column represents the keys,
and the second column represents the values. We can add a few more examples to the
dictionary.
Figure 149
We can also assign the dictionary to a variable.
Figure 150

The key is used to look up the value. We use square brackets. The argument is the key. This
outputs the value.
Figure 151
Using the key of "Back in Black," this returns the value of 1980.
Figure 152
The key, "The Dark Side Of The Moon," gives us the value of 1973.
Figure 153

Using the key "The Bodyguard" gives us the value 1992, and so on.
Figure 154
We can add a new entry to the dictionary as follows.
Figure 155
This will add the value 2007 with a new key called "Graduation."
Figure 156
We can delete an entry as follows. This gets rid of the key "Thriller" and its value.
Figure 157

We can verify if an element is in the dictionary using the "in" command as follows:
Figure 158
The command checks the keys. If the key is in the dictionary, it returns True.
Figure 159
If we try the same command with a key that is not in the dictionary, we get False.
Figure 160
In order to see all the keys in the dictionary, we can use the method keys to get the keys. The
output is a list-like object with all the keys.
Figure 161

In the same way, we can obtain the values using the method values.
Figure 162
Check out the labs for more examples and info. on dictionaries.
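The operations walked through in this video can be sketched in a few lines. The figures are not reproduced here, so the dictionary contents below are an assumption pieced together from the narration (the year for "Thriller" is taken from the lab that follows), and the variable names are illustrative.

```python
# Album titles are the keys; release years are the values.
release_years = {
    "Thriller": 1982,
    "Back in Black": 1980,
    "The Dark Side Of The Moon": 1973,
    "The Bodyguard": 1992,
}

back_in_black_year = release_years["Back in Black"]  # look up a value by key

release_years["Graduation"] = 2007   # add a new entry
del release_years["Thriller"]        # delete a key and its value

has_bodyguard = "The Bodyguard" in release_years   # membership tests check keys
has_thriller = "Thriller" in release_years

all_keys = list(release_years.keys())
all_values = list(release_years.values())
print(back_in_black_year, has_bodyguard, has_thriller)
print(all_keys)
print(all_values)
```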

HANDS-ON LAB: DICTIONARIES
DICTIONARIES IN PYTHON

OBJECTIVES
After completing this lab, you will be able to:
• Work with and perform operations on dictionaries in Python.
TABLE OF CONTENTS
• Dictionaries
o What are Dictionaries?
o Keys
• Quiz on Dictionaries
Dictionaries
What are Dictionaries?

A dictionary consists of keys and values. It is helpful to compare a dictionary to a list. Instead of
being indexed numerically like a list, a dictionary is indexed by keys, which are used to access
the values within it.
Figure 163

An example of a Dictionary Dict:
# Create the dictionary
Dict = {"key1": 1, "key2": "2", "key3": [3, 3, 3], "key4": (4, 4, 4),
('key5'): 5, (0, 1): 6}
Dict
{'key1': 1,
'key2': '2',
'key3': [3, 3, 3],
'key4': (4, 4, 4),
'key5': 5,
(0, 1): 6}
• "key1": 1- The key is a string "key1", and its corresponding value is the integer 1.
• "key2": "2" - The key is a string "key2", and its corresponding value is the string "2".
• "key3": [3, 3, 3] - The key is a string "key3", and its corresponding value is a list [3, 3,
3]. The list contains three elements, all of which are integer 3.
• "key4": (4, 4, 4) - The key is a string "key4", and its corresponding value is a tuple (4, 4,
4). The tuple contains three elements, all of which are the integer 4.
• ('key5'): 5 - The key is the string "key5", and its corresponding value is the integer 5. Note
that parentheses alone do not create a tuple: ('key5') is the same as "key5", which is why the
output above shows the key as 'key5'. A one-element tuple needs a trailing comma: ('key5',).
• (0, 1): 6 - The key is a tuple (0, 1), and its corresponding value is the integer 6.
This tuple contains two elements: 0 and 1.
The keys can be strings:
# Access to the value by the key

Dict["key1"]
1
Keys can also be any immutable object such as a tuple:
# Access to the value by the key

Dict[(0, 1)]
6
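A short check makes the rule about keys concrete. This is a minimal sketch with illustrative variable names: it shows that parentheses without a comma leave a string unchanged, that tuples work as keys, and that a mutable list is rejected.

```python
# Parentheses alone do not make a tuple; the trailing comma does.
string_key = ('key5')
tuple_key = ('key5',)
print(type(string_key).__name__, type(tuple_key).__name__)  # str tuple

example = {string_key: 5, tuple_key: 50, (0, 1): 6}

# A list is mutable, so Python rejects it as a key.
try:
    example[[0, 1]] = 7
    list_key_error = None
except TypeError as err:
    list_key_error = type(err).__name__
print(list_key_error)  # TypeError
```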
Each key is separated from its value by a colon ":". Commas separate the items, and the whole
dictionary is enclosed in curly braces. An empty dictionary without any items is written with just
two curly braces, like this "{}".
# Create a sample dictionary
release_year_dict = {"Thriller": "1982", "Back in Black": "1980",
                     "The Dark Side of the Moon": "1973", "The Bodyguard": "1992",
                     "Bat Out of Hell": "1977", "Their Greatest Hits (1971-1975)": "1976",
                     "Saturday Night Fever": "1977", "Rumours": "1977"}
release_year_dict
{'Thriller': '1982', 'Back in Black': '1980',
'The Dark Side of the Moon': '1973', 'The Bodyguard': '1992',
'Bat Out of Hell': '1977',
'Their Greatest Hits (1971-1975)': '1976', 'Saturday Night Fever': '1977',
'Rumours': '1977'}

In summary, like a list, a dictionary holds a collection of elements. Each element is represented
by a key and its corresponding value. Dictionaries are created with a pair of curly braces
containing keys and values, with each key separated from its value by a colon. Each key maps to
exactly one value, but multiple keys can hold the same value. Keys must be immutable objects such
as strings, numbers, or tuples, but values can be any data type.
It is helpful to visualize the dictionary as a table, as in the following image. The first column
represents the keys, the second column represents the values.
Figure 164
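The one-value-per-key rule and the possibility of shared values can be seen in a small sketch. The dictionary below is a shortened, illustrative version of the sample dictionary, and the variable names are assumptions.

```python
years = {"Bat Out of Hell": "1977", "Rumours": "1977", "Thriller": "1982"}

# Assigning to an existing key replaces its value; it does not add a second entry.
years["Thriller"] = "1983"
years["Thriller"] = "1982"
entry_count = len(years)

# Several keys may share one value, so a reverse lookup can return many keys.
albums_from_1977 = [title for title, year in years.items() if year == "1977"]
print(entry_count, albums_from_1977)
```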
Keys
You can retrieve the values based on the names:
# Get value by keys

release_year_dict['Thriller']
'1982'
This corresponds to:
Figure 165

Similarly for The Bodyguard
# Get value by key

release_year_dict['The Bodyguard']
'1992'
Figure 166
Now let us retrieve the keys of the dictionary using the method keys():
# Get all the keys in dictionary

release_year_dict.keys()
dict_keys(['Thriller', 'Back in Black', 'The Dark Side of the Moon', 'The Bodyguard', 'Bat Out
of Hell', 'Their Greatest Hits (1971-1975)', 'Saturday Night Fever', 'Rumours'])
values
You can retrieve the values using the method values():
# Get all the values in dictionary

release_year_dict.values()
dict_values(['1982', '1980', '1973', '1992', '1977', '1976', '1977',
'1977'])
add
We can add an entry:
# Append value with key into dictionary

release_year_dict['Graduation'] = '2007'
release_year_dict
{'Thriller': '1982', 'Back in Black': '1980',
'Rumours': '1977',
'Graduation': '2007'}
del
We can delete an entry:
# Delete entries by key

del(release_year_dict['Thriller'])
del(release_year_dict['Graduation'])
release_year_dict
{'Back in Black': '1980',

'Rumours': '1977'}
Verify (in)
We can verify if an element is in the dictionary:
# Verify the key is in the dictionary

'The Bodyguard' in release_year_dict
True
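Checking for a key matters because indexing with a missing key raises a KeyError. The sketch below shows two common ways to handle that, assuming a shortened copy of the sample dictionary; the built-in get() method is not covered in this lab and is shown here only as an alternative.

```python
release_year_dict = {"Back in Black": "1980", "The Bodyguard": "1992"}

title = "Graduation"
# Guard the lookup with `in` ...
if title in release_year_dict:
    year = release_year_dict[title]
else:
    year = "unknown"

# ... or let get() supply the fallback in one step.
year_via_get = release_year_dict.get(title, "unknown")
print(year, year_via_get)
```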

Quiz on Dictionaries
You will need this dictionary for the next two questions:
# Question sample dictionary

soundtrack_dic = {"The Bodyguard":"1992", "Saturday Night Fever":"1977"}
soundtrack_dic
{'The Bodyguard': '1992', 'Saturday Night Fever': '1977'}
a) In the dictionary soundtrack_dic what are the keys ?

soundtrack_dic.keys() # The Keys "The Bodyguard" and "Saturday Night Fever"
dict_keys(['The Bodyguard', 'Saturday Night Fever'])
b) In the dictionary soundtrack_dic what are the values?

soundtrack_dic.values() # The values are "1992" and "1977"
dict_values(['1992', '1977'])
You will need this dictionary for the following questions:
The Albums Back in Black, The Bodyguard and Thriller have the following music recording
sales in millions 50, 50 and 65 respectively:
a) Create a dictionary album_sales_dict where the keys are the album name and the
sales in millions are the values.

album_sales_dict = {"The Bodyguard":50, "Back in Black":50, "Thriller":65}
b) Use the dictionary to find the total sales of Thriller:

album_sales_dict["Thriller"]
65
c) Find the names of the albums from the dictionary using the method keys():

album_sales_dict.keys()
dict_keys(['The Bodyguard', 'Back in Black', 'Thriller'])
d) Find the values of the recording sales from the dictionary using the method
values:
# Write your code below and press Shift+Enter to execute
album_sales_dict.values()

dict_values([50, 50, 65])
The last exercise!


PRACTICE QUIZ: DICTIONARIES
Question 1
What are the keys of the following dictionary? {"a":1,"b":2}
o a,b
o 1,2
o {"a","b"}
o "a","b"
Correct: The key is the first element separated from its value by a colon.
Question 2
Consider the following Python Dictionary:
Dict={"A":1,"B":"2","C":[3,3,3],"D":(4,4,4),'E':5,'F':6}
What is the result of the following operation? Dict["D"]
o '4,4,4'
o (4, 4, 4)
o 1
o [3,3,3]
Correct: This corresponds to the key 'D' or Dict['D'].
Question 3
Which of the following is the correct syntax to extract the keys of a dictionary as a list?
o list(keys(dict))
o list(dict.keys())
o dict.keys().list()
o keys(dict.list())

VIDEO 009: SETS (5:17)
SETS
Let's cover sets. They are also a type of collection.
• Sets are a type of collection: This means that like lists and tuples, you can input different
Python types.
• Unlike lists and tuples, they are unordered: This means sets do not record element
position.
• Sets only have unique elements: This means there is only one of a particular element in
a set.
CREATING A SET
To define a set, you use curly brackets. You place the elements of a set within the curly brackets.
You notice there are duplicate items. When the actual set is created, duplicate items will not be
present.
Figure 167
You can convert a list to a set by using the function set. This is called type casting.
Figure 168

You simply use the list as the input to the function set. The result will be a list converted to
a set.
Figure 169
Let's go over an example. We start off with a list. We input the list to the function set.
Figure 170
The function set returns a set. Notice how there are no duplicate elements.
Figure 171
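Both ways of creating a set can be sketched as follows. The figures are not reproduced here, so the element values and variable names are illustrative assumptions.

```python
# Duplicates in the literal are dropped when the set is built.
genres = {"pop", "rock", "soul", "rock", "pop"}

# Type casting: pass a list to set() to get its unique elements.
genre_list = ["disco", "rock", "disco", "R&B"]
genre_set = set(genre_list)

print(len(genres), len(genre_list), len(genre_set))  # 3 4 3
print(sorted(genre_set))  # sets are unordered, so sort before displaying
```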

SET OPERATIONS
Let's go over set operations. These could be used to change the set.
Consider the set A. Let's represent this set with a circle. If you are familiar with sets, this could
be part of a Venn diagram. A Venn diagram is a tool that typically uses shapes to represent sets.
add
We can add an item to a set using the add-method. We just put the set name followed by a dot,
then the add-method. The argument is the new element of the set we would like to add, in this
case, NSYNC.
Figure 172
The set A now has NSYNC as an item.
Figure 173
If we add the same item twice, nothing will happen as there can be no duplicates in a set.
Figure 174

remove
Let's say we would like to remove NSYNC from set A. We can also remove an item from a set
using the remove-method. We just put the set name followed by a dot, then the remove-
method.
Figure 175
The argument is the element of the set we would like to remove, in this case, NSYNC.
After the remove-method is applied to the set, set A does not contain the item NSYNC. You
can use this method for any item in the set.
Figure 176
Verify (in)
We can verify if an element is in the set using the in command as follows.

The command checks that the item, in this case AC/DC, is in the set. If the item is in the set, it
returns true.
Figure 177

If we look for an item that is not in the set, in this case for the item Who, as the item is not in the
set, we will get a false.
These are types of mathematical set operations. There are other operations we can do.
Figure 178
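The add, remove, and in steps described above can be sketched together. The starting contents of A are taken from the lab that follows; the comparison with discard() goes slightly beyond the video and is included because remove() fails on a missing element.

```python
A = {"Thriller", "Back in Black", "AC/DC"}

A.add("NSYNC")
A.add("NSYNC")            # adding an existing element changes nothing
size_after_add = len(A)

A.remove("NSYNC")
has_acdc = "AC/DC" in A
has_who = "Who" in A

# remove() raises KeyError for a missing element; discard() does not.
try:
    A.remove("Who")
    remove_failed = False
except KeyError:
    remove_failed = True
A.discard("Who")
print(size_after_add, has_acdc, has_who, remove_failed)  # 4 True False True
```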
Mathematical Set Operations
and
There are lots of useful mathematical operations we can do between sets.
Let's define the set album_set_1. We can represent it using a red circle or venn diagram.
Figure 179
Similarly, we can define the set album_set_2. We can also represent it using a blue circle or
venn diagram.
Figure 180

The intersection of two sets is a new set containing elements which are in both of those sets.
It's helpful to use venn diagrams.
Figure 181
The two circles that represent the sets overlap, and the overlap represents the new set. As the
overlap lies within both the red circle and the blue circle, we define the intersection in terms of
and. In Python, we use an ampersand (&) to find the intersection of the two sets.
Figure 182
If we overlay the values of the set over the circle placing the common elements in the
overlapping area, we see the correspondence.
Figure 183

After applying the intersection operation, all the items that are not in both sets disappear.
Figure 184
In Python, we simply place the ampersand between the two sets. We see that both AC/DC
and Back in Black are in both sets. The result is a new set album_set_3 containing only the
elements that are in both album_set_1 and album_set_2.
Figure 185
The union of two sets is a new set containing every item that appears in either set. We can
find the union of the sets album_set_1 and album_set_2 as follows.
Figure 186

The result is a new set that has all the elements of album_set_1 and album_set_2. This new
set is represented in green.
Figure 187
Consider the new set album_set_3. The set contains the elements AC/DC and Back in
Black.
Figure 188
issubset
We can represent this with a Venn diagram, as all the elements in album_set_3 are in
album_set_1. The circle representing album_set_1 encapsulates the circle representing
album_set_3. We can check if a set is a subset using the issubset method. As album_set_3
is a subset of album_set_1, the result is True.
Figure 189
There is a lot more you can do with sets. Check out the lab for more examples.
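The intersection, union, and subset checks from this video can be sketched as below. The set contents are assumed to match the ones used in the lab, since the figures are not reproduced here; all_albums is an illustrative name.

```python
album_set_1 = {"AC/DC", "Back in Black", "Thriller"}
album_set_2 = {"AC/DC", "Back in Black", "The Dark Side of the Moon"}

album_set_3 = album_set_1 & album_set_2          # intersection
all_albums = album_set_1.union(album_set_2)      # union
is_subset = album_set_3.issubset(album_set_1)
print(sorted(album_set_3), len(all_albums), is_subset)
```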

HANDS-ON LAB: SETS
SETS IN PYTHON
OBJECTIVES
• Work with sets in Python, including operations and logic operations.
TABLE OF CONTENTS
• Sets
o Set Content
o Set Operations
o Sets Logic Operations
• Quiz on Sets
Sets
Set Content
A set is a unique collection of objects in Python. You can denote a set with a pair of curly brackets
{}. Python will automatically remove duplicate items:
# Create a set
set1 = {"pop", "rock", "soul", "hard rock", "rock", "R&B", "rock", "disco"}
set1
{'R&B', 'disco', 'hard rock', 'pop', 'rock', 'soul'}
The process of mapping is illustrated in the figure:
Figure 190
You can also create a set from a list as follows:
# Convert list to set

album_list = ["Michael Jackson", "Thriller", 1982, "00:42:19",
              "Pop, Rock, R&B", 46.0, 65, "30-Nov-82", None, 10.0]
album_set = set(album_list)
album_set
{'00:42:19',
10.0,
1982,
'30-Nov-82', 46.0,
65,
'Michael Jackson', None,

'Pop, Rock, R&B', 'Thriller'}
Now let us create a set of genres:
# Convert list to set

music_genres = set(["pop", "pop", "rock", "folk rock", "hard rock", "soul",
"progressive rock", "soft rock", "R&B", "disco"])
music_genres
{'R&B',
'disco', 'folk rock', 'hard rock', 'pop',
'progressive rock', 'rock',
'soft rock', 'soul'}
Set Operations
Let us go over set operations, as these can be used to change the set. Consider the set A:
# Sample set
A = set(["Thriller", "Back in Black", "AC/DC"])
A
{'AC/DC', 'Back in Black', 'Thriller'}
add
We can add an element to a set using the add() method:
# Add element to set
A.add("NSYNC")

A
{'AC/DC', 'Back in Black', 'NSYNC', 'Thriller'}
If we add the same element twice, nothing will happen as there can be no duplicates in a set:
# Try to add duplicate element to the set
A.add("NSYNC")

A
{'AC/DC', 'Back in Black', 'NSYNC', 'Thriller'}
remove
We can remove an item from a set using the remove method:
# Remove the element from set
A.remove("NSYNC")

A
{'AC/DC', 'Back in Black', 'Thriller'}
Verify (in)
We can verify if an element is in the set using the in command:
# Verify if the element is in the set
"AC/DC" in A
True

Sets Logic Operations
Remember that with sets you can check the difference between sets, as well as the symmetric
difference, intersection, and union:
Consider the following two sets:
# Sample Sets
album_set1 = set(["Thriller", 'AC/DC', 'Back in Black'])
album_set2 = set([ "AC/DC", "Back in Black", "The Dark Side of the Moon"])
Figure 191
# Print two sets

album_set1, album_set2
({'AC/DC', 'Back in Black', 'Thriller'},
{'AC/DC', 'Back in Black', 'The Dark Side of the Moon'})
As both sets contain AC/DC and Back in Black we represent these common elements
with the intersection of two circles.
Figure 192

You can find the intersection of two sets as follows using &:
# Find the intersections

intersection = album_set1 & album_set2
intersection
{'AC/DC', 'Back in Black'}
difference
You can find all the elements that are only contained in album_set1 using the difference
method:
# Find the difference in set1 but not set2

album_set1.difference(album_set2)
{'Thriller'}
You only need to consider elements in album_set1; all the elements in album_set2, including
the intersection, are not included.
Figure 193
album_set2.difference(album_set1)
{'The Dark Side of the Moon'}
Figure 194
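The symmetric difference mentioned at the start of this section is not demonstrated in the lab. As a minimal sketch using the same two sets, it returns the elements that belong to exactly one of them; the variable names are illustrative, and the operator forms in the comments are standard Python equivalents.

```python
album_set1 = {"Thriller", "AC/DC", "Back in Black"}
album_set2 = {"AC/DC", "Back in Black", "The Dark Side of the Moon"}

only_in_set1 = album_set1 - album_set2   # same as album_set1.difference(album_set2)
in_exactly_one = album_set1.symmetric_difference(album_set2)  # or album_set1 ^ album_set2
print(only_in_set1, sorted(in_exactly_one))
```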

intersection
You can also find the intersection of album_set1 and album_set2, using the
intersection method:
# Use intersection method to find the intersection of album_set1 and album_set2

album_set1.intersection(album_set2)
{'AC/DC', 'Back in Black'}
This corresponds to the intersection of the two circles:
Figure 195
union
The union corresponds to all the elements in both sets, which is represented by coloring
both circles:
Figure 196

The union is given by:
# Find the union of two sets

album_set1.union(album_set2)
{'AC/DC', 'Back in Black', 'The Dark Side of the Moon', 'Thriller'}
Issuperset and issubset

And you can check if a set is a superset or subset of another set, respectively, like this:
# Check if superset
set(album_set1).issuperset(album_set2)
False
# Check if subset
set(album_set2).issubset(album_set1)
False
Here is an example where issubset() and issuperset() return true:
# Check if subset
set({"Back in Black", "AC/DC"}).issubset(album_set1)
True
# Check if superset
album_set1.issuperset({"Back in Black", "AC/DC"})
True
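Python also provides comparison operators for the same checks. The sketch below is an addition to the lab, with illustrative variable names, and shows how the operators line up with issubset() and issuperset().

```python
album_set1 = {"Thriller", "AC/DC", "Back in Black"}
small = {"Back in Black", "AC/DC"}

subset_check = small <= album_set1        # same as small.issubset(album_set1)
superset_check = album_set1 >= small      # same as album_set1.issuperset(small)
proper_subset = small < album_set1        # subset and not equal
self_subset = album_set1 <= album_set1    # every set is a subset of itself
self_proper = album_set1 < album_set1     # but never a proper subset of itself
print(subset_check, superset_check, proper_subset, self_subset, self_proper)
```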

QUIZ ON SETS
Convert the list ['rap','house','electronic music', 'rap'] to a set:

set(['rap', 'house', 'electronic music', 'rap'])
{'electronic music', 'house', 'rap'}
Consider the list A = [1, 2, 2, 1] and set B = set([1, 2, 2, 1]), does sum(A) ==
sum(B)?

A = [1,2,2,1]
B = set([1,2,2,1])
print("the sum of A is:", sum(A))
print("the sum of B is:", sum(B))
the sum of A is: 6
the sum of B is: 3
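The sums differ because converting the list to a set discards the repeated values before they are added. A minimal check, reusing A and B from the exercise:

```python
A = [1, 2, 2, 1]
B = set(A)

# The set keeps one copy of each value, so fewer numbers are summed.
print(sorted(B), len(A), len(B))
sums_equal = sum(A) == sum(B)
print(sums_equal)
```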
Create a new set album_set3 that is the union of album_set1 and album_set2:

album_set3 = album_set1.union(album_set2)
album_set3
{'AC/DC', 'Back in Black', 'The Dark Side of the Moon', 'Thriller'}
Find out if album_set1 is a subset of album_set3:

album_set1.issubset(album_set3)
True
The last exercise!


PRACTICE QUIZ: SETS
Question 1
Consider the following set: {"A","A"}, what will the result be when you create the set?
o {}
o {"A"}
o {"A", "B"}
o {"A", "A"}
Correct: Sets in Python do not allow duplicate elements. Consequently, the resulting set will
automatically eliminate the duplicate, resulting in {"A"}.
Question 2
What method do you use to add an element to a set?
o Insert
o Extend
o Add
o Append
Correct: The add method adds elements to a set.
Question 3
What is the result of the following operation? {'a','b'} & {'a'}
o {}
o {'a','b'}
o {'b'}
o {'a'}
Correct: The intersection operation finds the elements that are in both sets.
DISCUSSION PROMPT: PYTHON DATA STRUCTURES

Discussion Prompt:
Provide a use case not given in this course for each of the following Python data structures:
tuple, list, and dictionary. Explain why each use case is appropriate for each data structure as
opposed to the others.

MODULE 2 SUMMARY: PYTHON DATA STRUCTURES
• In Python, we often use tuples to group related data together. Tuples refer to ordered
and immutable collections of elements.
• Tuples are usually written as comma-separated elements in parentheses "()".
• You can include strings, integers, and floats in tuples and access them using both
positive and negative indices.
• You can perform operations such as combining, concatenating, and slicing on tuples.
• Tuples are immutable, so you need to create a new tuple to manipulate it.
• Tuples can contain other tuples or other complex data types; this is called nesting.
• You can access elements in a nested tuple through indexing.
• Lists in Python contain ordered collections of items that can hold elements of different
types and are mutable, allowing for versatile data storage and manipulation.
• List is an ordered sequence, represented with square brackets "[]".
• Lists are mutable, which distinguishes them from tuples.
• A list can contain strings, integers, and floats; you can nest lists within it.
• You can access each element in a list using both positive and negative indexing.
• Appending to or extending a list modifies that same list, whereas concatenating two lists with "+" creates a new list.
• You can perform operations such as adding, deleting, splitting, and so forth on a list.
• You can separate elements in a list using delimiters.
• Aliasing occurs when multiple names refer to the same object.
• You can also clone a list to create another list.
• Dictionaries in Python are key-value pairs that provide a flexible way to store and
retrieve data based on unique keys.
• Dictionaries consist of keys and values. Keys are often strings, but any immutable type is allowed, and values can be of any type.
• You denote dictionaries using curly brackets.
• The keys necessitate immutability and uniqueness.
• The values may be either immutable or mutable, and they allow duplicates.
• You separate each key-value pair with a comma, and you can use color highlighting to
make the key more visible.
• You can assign Dictionaries to a variable.
• You use the key as an argument to retrieve the corresponding value.
• You can make additions and deletions to dictionaries.
• You can perform an operation on a dictionary to check the key, which results in a true
or false output.
• You can apply methods to obtain a list of keys and values in a dictionary.
• Sets in Python are collections of unique elements, useful for tasks such as removing
duplicates and performing set operations like union and intersection. Sets lack order.

• Curly brackets "{}" are helpful for defining elements of a set.
• Sets do not contain duplicate items.
• A list passed through the set function generates a set containing unique elements.
• You use “Set Operations” to perform actions such as adding, removing, and verifying
elements in a set.
• You can combine sets using the ampersand "&" operator to obtain the common
elements from both sets.
• You can use the union() method to combine two sets, including both the common
and unique elements from both sets.
• The issubset() method is used to determine whether one set is a subset of another.
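The distinction between aliasing and cloning a list, noted in the summary above, can be sketched as follows. The list contents and variable names are illustrative assumptions.

```python
A = ["hard rock", 10, 1.2]
alias = A          # aliasing: a second name for the same list object
clone = A[:]       # cloning: a new list with the same elements

A[0] = "banana"
print(alias[0], clone[0])    # banana hard rock

same_object = alias is A
clone_is_separate = clone is not A
```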

CHEAT SHEET: DICTIONARIES AND SETS
CHEAT SHEET: PYTHON DATA STRUCTURE PART-2

DICTIONARIES

Creating a A dictionary is a built-in data type Example:
Dictionary that represents a collection of key- 1. dict_name = {} #Creates an empty
dictionary
value pairs. Dictionaries are enclosed
2. person = { "name": "John", "age": 30,
in curly braces {}. "city": "New York"}
Accessing Values You can access the values in a Syntax:
dictionary using their corresponding 1. Value = dict_name["key_name"]
keys. Example:
1. name = person["name"]
2. age = person["age"]
Add or modify Inserts a new key-value pair into the Syntax:

dictionary. If the key already exists, 1. dict_name[key] = value
the value will be updated; otherwise, Example:
a new entry is created. 1. person["Country"] = "USA" # A new entry
will be created.
2. person["city"] = "Chicago" # Update the
existing value for the same key
del Removes the specified key-value pair Syntax:

from the dictionary. Raises a KeyError 1. del dict_name[key]
if the key does not exist. Example:
1. del person["Country"]
update() The update() method merges the Syntax:

provided dictionary into the existing 1. dict_name.update({key: value})
dictionary, adding or updating key- Example:
value pairs. 1. person.update({"Profession": "Doctor"})
clear() The clear() method empties the Syntax:

dictionary, removing all key-value 1. dict_name.clear()
pairs within it. After this operation, Example:
the dictionary is still accessible and 1. grades.clear()
can be used further.
key existence You can check for the existence of a Example:
key in a dictionary using the in 1. if "name" in person:
keyword 2. print("Name exists in the dictionary.")
copy() Creates a shallow copy of the Syntax:
dictionary. The new dictionary 1. new_dict = dict_name.copy()
contains the same key-value pairs as Example:
the original, but they remain distinct 1. new_person = person.copy()
objects in memory. 2. new_person = dict(person) # another way to create a copy of dictionary
keys() Retrieves all keys from the dictionary Syntax:
and converts them into a list. Useful 1. keys_list = list(dict_name.keys())
for iterating or processing keys using Example:
list methods. 1. person_keys = list(person.keys())
values() Extracts all values from the dictionary Syntax:

and converts them into a list. This list 1. values_list = list(dict_name.values())
can be used for further processing or Example:
analysis. 1. person_values = list(person.values())
items() Retrieves all key-value pairs as tuples Syntax:

and converts them into a list of 1. items_list = list(dict_name.items())
tuples. Each tuple consists of a key Example:
and its corresponding value. 1. info = list(person.items())
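What "shallow" means for copy() can be shown with a nested value. This is a minimal sketch; the "languages" key is a hypothetical addition to the person example and does not appear in the cheat sheet.

```python
person = {"name": "John", "languages": ["Python"]}
new_person = person.copy()

# Top-level changes to the copy do not reach the original ...
new_person["name"] = "Jane"

# ... but nested mutable values are shared between the two dictionaries.
new_person["languages"].append("SQL")
print(person)
```

When a fully independent copy is needed, copy.deepcopy() from the standard library copies nested values as well.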
SETS

add() Elements can be added to a set using Syntax:
the `add()` method. Duplicates are 1. set_name.add(element)
automatically removed, as sets only Example:
store unique values. 1. fruits.add("mango")
clear() The `clear()` method removes all Syntax:

elements from the set, resulting in an 1. set_name.clear()
empty set. It updates the set in-place. Example:
1. fruits.clear()
copy() The `copy()` method creates a shallow Syntax:

copy of the set. Any modifications to 1. new_set = set_name.copy()
the copy won't affect the original set. Example:
1. new_fruits = fruits.copy()
Defining Sets A set is an unordered collection of Example:
unique elements. Sets are enclosed in 1. empty_set = set() #Creating an Empty Set
curly braces `{}`. They are useful for 2. fruits = {"apple", "banana", "orange"}
storing distinct values and
performing set operations.
discard() Use the `discard()` method to remove Syntax:
a specific element from the set. 1. set_name.discard(element)
Ignores if the element is not found. Example:
1. fruits.discard("apple")
issubset() The `issubset()` method checks if the Syntax:

current set is a subset of another set. 1. is_subset = set1.issubset(set2)
It returns True if all elements of the Example:

current set are present in the other 1. is_subset = fruits.issubset(colors)
set, otherwise False.

issuperset() The `issuperset()` method checks if Syntax:
the current set is a superset of is_superset = set1.issuperset(set2)
another set. It returns True if all Example:
elements of the other set are present 1. is_superset = colors.issuperset(fruits)
in the current set, otherwise False.
pop() The `pop()` method removes and Syntax:
returns an arbitrary element from the 1. removed_element = set_name.pop()
set. It raises a `KeyError` if the set is Example:
empty. Use this method to remove 1. removed_fruit = fruits.pop()
elements when the order doesn't
matter.
remove() Use the `remove()` method to remove Syntax:
a specific element from the set. 1. set_name.remove(element)
Raises a `KeyError` if the element is Example:
not found. 1. fruits.remove("banana")
Set Operations Perform various operations on sets: Syntax:

`union`, `intersection`, `difference`, 1. union_set = set1.union(set2)
`symmetric difference`. 2. intersection_set =
set1.intersection(set2)
3. difference_set = set1.difference(set2)
4. sym_diff_set =
set1.symmetric_difference(set2)
Example:
1. combined = fruits.union(colors)
2. common = fruits.intersection(colors)
3. unique_to_fruits =
fruits.difference(colors)
4. sym_diff =
fruits.symmetric_difference(colors)
update() The `update()` method adds elements Syntax:
from another iterable into the set. It 1. set_name.update(iterable)
maintains the uniqueness of Example:
elements. 1. fruits.update(["kiwi", "grape"])
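The set methods summarized above can be tried together in one short script. This is a minimal sketch: the contents of `fruits` and `colors` are sample values chosen only for illustration, and the `try`/`except` block (covered later in the course) is used only to show that `remove()` raises a `KeyError` where `discard()` does not.

```python
# Sample data, chosen only for illustration.
fruits = {"apple", "banana", "orange"}
colors = {"orange", "red", "green"}

fruits.add("mango")               # adds one element
fruits.add("apple")               # duplicate, so the set is unchanged
fruits.discard("pear")            # element is absent, no error is raised
fruits.update(["kiwi", "grape"])  # adds every element of the iterable

common = fruits.intersection(colors)    # elements in both sets
only_fruits = fruits.difference(colors) # elements in fruits but not in colors
combined = fruits.union(colors)         # elements in either set

try:
    fruits.remove("pear")         # element is absent, raises KeyError
    remove_failed = False
except KeyError:
    remove_failed = True

print(sorted(fruits))
print(common)
print({"apple"}.issubset(fruits), fruits.issuperset({"apple", "kiwi"}))
```

Sorting before printing is a convenience here, because a set has no defined order.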
GRADED QUIZ: PYTHON DATA STRUCTURES

READING: GLOSSARY: PYTHON DATA STRUCTURES
Welcome! This alphabetized glossary contains many of the terms in this course. This
comprehensive glossary also includes additional industry-recognized terms not used in course
videos. These terms are important for you to recognize when working in the industry.
Term Definition
Aliasing Aliasing refers to giving another name to a function or a variable.
Ampersand A character typically "&" standing for the word "and."
Compound elements Compound statements contain (groups of) other statements; they affect or
control the execution of those other statements in some way.
Delimiter A delimiter in Python is a character or sequence of characters used to separate or

mark the boundaries between elements or fields within a larger data structure,
such as a string or a file.
Dictionaries A dictionary in Python is a data structure that stores a collection of key-value

pairs, where each key is unique and associated with a specific value.
Function A function is a block of code, defining a set procedure, which is executed only
when it is called.
Immutable Immutable Objects are of in-built datatypes like int, float, bool, string, Unicode,
and tuple. In simple words, an immutable object can't be changed after it is
created.
Intersection The intersection of two given sets is the largest set, which contains all the
elements that are common to both sets.
Keys The keys() method of a Python dictionary returns a view object that displays
all the keys in the dictionary in order of insertion.
Lists A list is any list of data items, separated by commas, inside square brackets.
Logic operations In Python, logic operations refer to the use of logical operators such as "and,"
"or," and "not" to perform logical operations on Boolean values (True or False).
Mutable A mutable object can be changed after it is created. Lists, dictionaries, and sets are
mutable; built-in datatypes like int, float, bool, string, and tuple are immutable.
Nesting A nested function is simply a function within another function and is sometimes
called an "inner function".
Ratings in python Ratings in Python typically refer to a numerical or qualitative measure assigned
to something to indicate its quality, performance, or value.
Set operations Set operations in Python refer to mathematical operations performed on sets,
which are unordered collections of unique elements.
Sets in python A set is an unordered collection of unique elements.

Syntax The rules that define the structure of the Python language are called its
syntax.
Tuples These are used to store multiple items in a single variable.

Type casting In Python, this is converting one data type to another.
Variables In Python, a variable is a symbolic name or identifier used to store and
manipulate data. Variables serve as containers for values, and these values can
be of various data types, including numbers, strings, lists, and more.
Venn diagram A Venn diagram is a graphical representation that uses overlapping circles to
illustrate the relationships and commonalities between sets or groups of items.
Versatile data Versatile data, in a general context, refers to data that can be used in multiple
ways, is adaptable to different applications or purposes, and is not restricted to
a specific use case.

MODULE 3: PYTHON PROGRAMMING FUNDAMENTALS
MODULE INTRODUCTION AND LEARNING OBJECTIVES

This module discusses Python fundamentals and begins with the concepts of conditions and
branching. Continue through the module and learn how to implement loops to iterate over
sequences, create functions to perform a specific task, perform exception handling to catch
errors, and how classes are needed to create objects.
Learning Objectives
In this module you will learn to:
• Classify conditions and branching by identifying structured scenarios with outputs.
• Work with objects and classes.
• Explain objects and classes by identifying data types and creating a class.
• Use exception handling in Python.
• Explain what functions do.
• Build a function using inputs and outputs.
• Explain how for loops and while loops work.
• Work with condition statements in Python, including operators and branching.
• Create and use loop statements in Python.

VIDEO 010: CONDITIONS AND BRANCHING (10:17)
In the video, you will learn about conditions and branching.
Figure 197
COMPARISON OPERATORS
Comparison operations compare values or operands.
Figure 198
Then based on some condition, they produce a Boolean.
Figure 199

Let's say we assign the variable a the value six.
Figure 200
We can use the equality operator, denoted with two equal signs, to determine if two values are
equal. In this case, we test whether a is equal to seven.
Figure 201
In this case, as six is not equal to seven, the result is false.
Figure 202

If we performed an equality test for the value six, the two values would be equal.
Figure 203
As a result, we would get a true.
Figure 204
Consider the following comparison operator, greater than: if the value of the left operand, in this case
the variable i, is greater than the value of the right operand, in this case five, the condition
becomes true; otherwise we get a false.
Let's display some values for i on the left. The values greater than five are shown in green and
the rest in red. If we set i equal to six, we see that six is larger than five and as a result, we get a
true.
Figure 205

We can also apply the same operations to floats. If we modify the operator as follows, if the left
operand i is greater than or equal to the value of the right operand,
in this case five, then the condition becomes true. In this case, we include the value of five in the
number line and the color changes to green accordingly. If we set the value of i equal to five, the
operator will produce a true.
Figure 206
If we set the value of i to two, we would get a false because two is less than five.
Figure 207
We can change the inequality if the value of the left operand, in this case, i is less than the value
of the right operand, in this case, six. Then condition becomes true. Again, we can represent this
with a colored number line. The areas where the inequality is true are marked in green and red
where the inequality is false. If the value for i is set to two, the result is a true. As two is less than
six.
Figure 208

The inequality test uses an exclamation mark preceding the equal sign. If two operands are not
equal, then the condition becomes true. We can use a number line. When the condition is true,
the corresponding numbers are marked in green and red for where the condition is false. If we
set i equal to two, the operator is true as two is not equal to six.
Figure 209
We compare strings as well. Comparing ACDC and Michael Jackson using the equality test, we
get a false, as the strings are not the same. Using the inequality test, we get a true, as the strings
are different.
Figure 210
See the labs for more examples.
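The comparisons described in this part of the video can be reproduced in a few lines. This is a minimal sketch: the values of `a` and `i` follow the narration, while the dictionary named `checks` is only an illustrative way to print each expression next to its Boolean result.

```python
a = 6
i = 2

# Each key is the expression as text; each value is the Boolean it produces.
checks = {
    "a == 7": a == 7,
    "a == 6": a == 6,
    "i > 5": i > 5,
    "i >= 5": i >= 5,
    "i < 6": i < 6,
    "i != 6": i != 6,
    '"ACDC" == "Michael Jackson"': "ACDC" == "Michael Jackson",
    '"ACDC" != "Michael Jackson"': "ACDC" != "Michael Jackson",
}

for expression, result in checks.items():
    print(expression, "->", result)
```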

BRANCHING
Branching allows us to run different statements for a different input.
Figure 211
IF STATEMENT
It's helpful to think of an if statement as a locked room. If this statement is true, you can enter
the room and your program can run some predefined tasks. If the statement is false, your program
will skip the task. For example, consider the blue rectangle representing an ACDC concert. If the
individual is older than 18, they can enter the ACDC concert. If they are 18 or younger, they
cannot enter the concert. An individual proceeds to the concert; their age is 17, so they are
not granted access to the concert and they must move on.
Figure 212
If the individual is 19, the condition is true. They can enter the concert then they can move on.
Figure 213

This is the syntax of the if statement from our previous example. We have the if statement. We
have the expression that can be true or false. The brackets are not necessary. We have a colon.
Within an indent, we have the expression that is run if the condition is true. The statements after
the if statement will run regardless if the condition is true or false. For the case where the age is
17, we set the value of the variable age to 17. We check the if statement, the statement is false.
Figure 214
Therefore, the program will not execute the statement to print, "you will enter". In this case, it
will just print "move on".
Figure 215
For the case where the age is 19, we set the value of the variable age to 19. We check the if
statement. The statement is true. Therefore, the program will execute the statement to print
"you will enter". Then it will just print "move on".
Figure 216

ELSE STATEMENT
The else statement will run a different block of code if the same condition is false.
Let's use the ACDC concert analogy again. If the user is 17, they cannot go to the ACDC concert
but they can go to the Meat Loaf concert represented by the purple square.
Figure 217
If the individual is 19, the condition is true, they can enter the ACDC concert then they can move
on as before.
Figure 218
The syntax of the else statement is similar. We simply append the statement else. We then add
the expression we would like to execute with an indent.
Figure 219
For the case where the age is 17, we set the value of the variable age to 17. We check the if
statement; the statement is false. Therefore, we progress to the else statement. We run the

statement in the indent. This corresponds to the individual attending the Meat Loaf concert. The
program will then continue running.
Figure 220
For the case where the age is 19, we set the value of the variable age to 19. We check the if
statement; the statement is true. Therefore, the program will execute the statement to print "you
will enter". The program skips the expressions in the else statement and continues to run the
rest of the expressions.
Figure 221

ELIF STATEMENT
The elif statement, short for else if, allows us to check additional conditions if the preceding
condition is false. If the condition is true, the alternate expressions will be run. Consider the
concert example, if the individual is 18, they will go to the Pink Floyd concert instead of attending
the ACDC or Meat Loaf concerts. A person who is 18 years of age enters the area. As they are not
older than 18, they cannot see ACDC, but as they are 18 years old, they attend Pink Floyd. After
seeing Pink Floyd, they move on.
Figure 222
The syntax of the elif statement is similar. We simply add the statement elif with the condition.
We, then add the expression we would like to execute if the statement is true with an indent. Let's
illustrate the code on the left. An 18-year-old enters. They are not older than 18 years of age.
Therefore, the condition is false. So, the condition of the elif statement is checked. The condition
is true. So, then we would print "go see Pink Floyd". Then we would move on as before. If the
variable age was 17, the statement "go see Meat Loaf" would print. Similarly, if the age was
greater than 18, the statement "you will enter" would print.
Figure 223
Check the labs for more examples.
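The three branches of the concert example can be sketched as a single if/elif/else chain. The function name `concert_for` is a hypothetical helper introduced only so the same branching can be run for several ages; functions and for loops are covered later in this module.

```python
def concert_for(age):
    """Return the message from the concert example for a given age."""
    if age > 18:
        return "you will enter"
    elif age == 18:
        return "go see Pink Floyd"
    else:
        return "go see Meat Loaf"


# Try one age from each branch; "move on" is printed in every case.
for age in (17, 18, 19):
    print(age, "->", concert_for(age))
    print("move on")
```

Only the first condition that is true is acted on, so an age of 19 never reaches the elif or else branches.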

LOGICAL OPERATORS
Now let's take a look at logic operators.
Figure 224
Logic operations take Boolean values and produce different Boolean values.
Figure 225
Figure 226

NOT OPERATOR
The first operation is the not operator. If the input is true, the result is a false.
Figure 227
Figure 228
Similarly, if the input is false, the result is a true.
Figure 229

Figure 230
OR OPERATOR
Let A and B represent Boolean variables. The OR operator takes in the two values and produces
a new Boolean value.
Figure 231
We can use this table to represent the different values. The first column represents the possible
values of A. The second column represents the possible values of B. The final column represents
the result of applying the OR operation. We see the OR operator only produces a false if all the
Boolean values are false.
Figure 232

The following lines of code will print out: "This album was made in the 70s or 90s", if the variable
album year does not fall in the 80s.
Figure 233
Let's see what happens when we set the album year to 1990. The colored number line is green
when the condition is true and red when the condition is false. In this case, the first condition is false.
Examining the second condition, we see that 1990 is greater than 1989. So, the condition is true.
We can verify by examining the corresponding second number line. In the final number line, the
green region indicates, where the area is true. This region corresponds to where at least one
statement is true. We see that 1990 falls in the area. Therefore, we execute the statement.
Figure 234
AND OPERATOR
Let A and B represent Boolean variables. The AND operator takes in the two values and
produces a new Boolean value.
Figure 235

Figure 236
We can use this table to represent the different values. The first column represents the possible
values of A. The second column represents the possible values of B. The final column represents
the result of applying the AND operation. We see the AND operator only produces a true if all
the Boolean values are true.
Figure 237
The following lines of code will print out "This album was made in the 80s" if the variable album
year is between 1980 and 1989. Let's see what happens when we set the album year to 1983.
As before, we can use the colored number line to examine where the condition is true. In this
case, 1983 is larger than 1980, so, the condition is true. Examining the second condition, we see
that 1990 is greater than 1983. So, this condition is also true. We can verify by examining the
corresponding second number line. In the final number line, the green region indicates where
the area is true. Similarly, this region corresponds to where both statements are true. We see
that 1983 falls in the area. Therefore, we execute the statement.
Figure 238
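The truth tables and the two album examples from the video can be checked with a short script. This is a sketch under assumptions: the exact code in the figures is not reproduced in this text, so the conditions below are inferred from the narration (before 1980 or after 1989 for the OR case, after 1979 and before 1990 for the AND case), and the variable names are illustrative.

```python
# Print the truth table for "or" and "and" over every pair of Boolean values.
print("A     B     A or B  A and B")
for a in (True, False):
    for b in (True, False):
        print(a, b, a or b, a and b)

# OR example: 1990 is not before 1980, but it is after 1989.
album_year = 1990
made_in_70s_or_90s = album_year < 1980 or album_year > 1989
if made_in_70s_or_90s:
    print("This album was made in the 70s or 90s")

# AND example: 1983 is both after 1979 and before 1990.
album_year = 1983
made_in_80s = album_year > 1979 and album_year < 1990
if made_in_80s:
    print("This album was made in the 80s")

# NOT simply flips a Boolean value.
print(not True, not False)
```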
Branching allows us to run different statements for different inputs.

HANDS-ON LAB: CONDITIONS AND BRANCHING
CONDITIONS IN PYTHON

OBJECTIVES
• work with condition statements in Python, including operators, and branching.
TABLE OF CONTENTS
• Condition Statements
o Comparison Operators
o Branching
o Logical operators
• Quiz on Condition Statement
Condition Statements
Comparison Operators
Comparison operations compare some value or operand and based on a condition, produce a
Boolean. When comparing two values you can use these operators:
• equal: ==
• not equal: !=
• greater than: >
• less than: <
• greater than or equal to: >=
• less than or equal to: <=
equal
Let's assign the variable a a value of 5. Use the equality operator, denoted with two equal signs (==),
to determine if two values are equal. The case below compares the variable a with 6.
# Condition Equal
a = 5
a == 6
False
The result is False, as 5 is not equal to 6.

Consider the following comparison operator: i > 5. If the value of the left operand, in
this case the variable i, is greater than the value of the right operand, in this case 5, then the
statement is True. Otherwise, the statement is False. If i is equal to 6, because 6 is larger than 5,
the output is True.
# Greater than Sign
i = 6
i > 5
True
Set i = 2. The statement is False as 2 is not greater than 5:
# Greater than Sign
i = 2
i > 5
False
Let's display some values for i in the figure. Set the values greater than 5 in green and the rest
in red. The green region represents where the condition is True, the red where the statement is
False. If the value of i is 2, we get False as the 2 falls in the red region. Similarly, if the value for i
is 6 we get a True as the condition falls in the green region.
Figure 239
Not equal
The inequality test uses an exclamation mark preceding the equal sign, if two operands are not
equal then the condition becomes True. For example, the following condition will produce True
as long as the value of i is not equal to 6:
# Inequality Sign
i = 2
i != 6
True
When i equals 6 the inequality expression produces False.
# Inequality Sign
i = 6
i != 6
False
See the number line below. When the condition is True, the corresponding numbers are marked
in green and for where the condition is False the corresponding number is marked in red. If we

set i equal to 2 the operator is true, since 2 is in the green region. If we set i equal to 6, we get a
False, since the condition falls in the red region.
Figure 240
We can apply the same methods on strings. For example, we can use an equality operator
on two different strings. As the strings are not equal, we get a False.
# Use Equality sign to compare the strings

"ACDC" == "Michael Jackson"
False
If we use the inequality operator, the output is going to be True as the strings are not equal.
# Use Inequality sign to compare the strings

"ACDC" != "Michael Jackson"
True

The inequality operation is also used to compare the letters/words/symbols according to the
ASCII value of letters. The decimal value shown in the following table represents the order of
the character:
Char. ASCII Char. ASCII Char. ASCII Char. ASCII
A 65 N 78 a 97 n 110
B 66 O 79 b 98 o 111
C 67 P 80 c 99 p 112
D 68 Q 81 d 100 q 113
E 69 R 82 e 101 r 114
F 70 S 83 f 102 s 115
G 71 T 84 g 103 t 116
H 72 U 85 h 104 u 117
I 73 V 86 i 105 v 118
J 74 W 87 j 106 w 119
K 75 X 88 k 107 x 120
L 76 Y 89 l 108 y 121
M 77 Z 90 m 109 z 122
For example, the ASCII code for ! is 33, while the ASCII code for + is 43. Therefore + is larger than
! as 43 is greater than 33.
Similarly, from the table above we see that the value for A is 65, and the value for B is 66,
therefore:
# Compare characters
'B' > 'A'
True
Note: Upper Case Letters have different ASCII code than Lower Case Letters, which means the
comparison between the letters in Python is case-sensitive.
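The ordering in the table can be inspected directly with the built-in `ord()` function, which returns a character's code point (for the characters in the table, this matches the ASCII value). The sketch below uses sample strings chosen only for illustration and shows why an uppercase letter compares as smaller than a lowercase one.

```python
# ord() returns the numeric code of a single character.
print(ord("A"), ord("B"), ord("a"))  # 65 66 97
print(ord("!"), ord("+"))            # 33 43

b_after_a = "B" > "A"                # True, since 66 > 65
lower_after_upper = "a" > "B"        # True, since 97 > 66

# Strings are compared character by character, from left to right.
apple_before_banana = "apple" < "banana"

# Comparison is case-sensitive; convert both sides to one case to ignore it.
same_exact = "Apple" == "apple"
same_ignoring_case = "Apple".lower() == "apple".lower()

print(b_after_a, lower_after_upper, apple_before_banana)
print(same_exact, same_ignoring_case)
```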
Branching
If statement
Branching allows us to run different statements for different inputs. It is helpful to think of an if
statement as a locked room, if the statement is True we can enter the room and your program
will run some predefined tasks, but if the statement is False the program will ignore the task.
For example, consider the blue rectangle representing an ACDC concert. If the individual is older
than 18, they can enter the ACDC concert. If they are 18 or younger, they cannot enter the concert.

We can use the condition statements learned before as the conditions that need to be checked
in the if statement. The syntax is as simple as if condition statement:, which contains a word if,
any condition statement, and a colon at the end. Start your tasks which need to be executed
under this condition in a new line with an indent. The lines of code after the colon and with an
indent will only be executed when the if statement is True. The tasks will end when the line of
code does not contain the indent.
In the case below, the code print("you can enter") is executed only if the variable age
is greater than 18, because this line of code is indented under the if statement. However, the execution
of print("move on") is not influenced by the if statement.
# If statement example
age = 19
#age = 18
#expression that can be true or false
if age > 18:
    #within an indent, we have the expression that is run if the condition is true
    print("you can enter")
#The statements after the if statement will run regardless if the condition is true or false
print("move on")
you can enter
move on
Try uncommenting the age variable:
# If statement example
age = 19
age = 18
#expression that can be true or false
if age > 18:
    #within an indent, we have the expression that is run if the condition is true
    print("you can enter")
#The statements after the if statement will run regardless if the condition is true or false
print("move on")
move on
It is helpful to use the following diagram to illustrate the process. On the left side, we see
what happens when the condition is True. The person enters the ACDC concert representing the
code in the indent being executed; they then move on. On the right side, we see what happens
when the condition is False; the person is not granted access, and the person moves on. In this
case, the segment of code in the indent does not run, but the rest of the statements are run.
Figure 241

else
The else statement runs a block of code if none of the conditions before this
else statement are True. Let's use the ACDC concert analogy again. If the user is 17, they cannot go to the
ACDC concert, but they can go to the Meat Loaf concert. The syntax of the else statement is similar
to the syntax of the if statement: else:. Notice that there is no condition statement for else.
Try changing the values of age to see what happens:
# Else statement example
age = 18
# age = 19
if age > 18:
    print("you can enter")
else:
    print("go see Meat Loaf")
print("move on")
go see Meat Loaf
move on
The process is demonstrated below, where each of the possibilities is illustrated on each side of
the image. On the left is the case where the age is 17: we set the variable age to 17, and this
corresponds to the individual attending the Meat Loaf concert. The right portion shows what
happens when the individual is over 18, in this case 19, and the individual is granted access to
the concert.
Figure 242
elif
The elif statement, short for else if, allows us to check additional conditions if the condition
statements before it are False. If the condition for the elif statement is True, the alternate
expressions will be run. Consider the concert example, where if the individual is 18, they will go
to the Pink Floyd concert instead of attending the ACDC or Meat Loaf concert. A person that is
18 years of age enters the area, and as they are not older than 18 they cannot see ACDC, but
since they are 18 years of age, they attend Pink Floyd. After seeing Pink Floyd, they move
on. The syntax of the elif statement is similar in that we merely change the if in the if
statement to elif.
# Elif statement example
age = 18
if age > 18:
    print("you can enter")
elif age == 18:
    print("go see Pink Floyd")
else:
    print("go see Meat Loaf")
print("move on")
go see Pink Floyd
move on

The three combinations are shown in the figure below. The left-most region shows what
happens when the individual is less than 18 years of age. The central component shows when
the individual is exactly 18. The rightmost shows when the individual is over 18.
Figure 243
Look at the following code:
# Condition statement example
album_year = 1983
album_year = 1970
if album_year > 1980:
    print("Album year is greater than 1980")
print('do something..')
do something..
Feel free to change album_year value to other values -- you'll see that the result changes!
Notice that the code in the above indented block will only be executed if the condition is True.
As before, we can add an else block to the if block. The code in the else block will only be
executed if the result is False.
Syntax:
if (condition):
    # do something
else:
    # do something else

If the condition in the if statement is False, the statements in the else block will execute.
This is demonstrated in the figure:
Figure 244
# Condition statement example
album_year = 1983
#album_year = 1970
if album_year > 1980:
    print("Album year is greater than 1980")
else:
    print("less than 1980")
print('do something..')
Album year is greater than 1980
do something..
Feel free to change the album_year value to other values -- you'll see that the result changes
based on it!
Logical operators
Sometimes you want to check more than one condition at once. For example, you might want to
check if one condition and another condition are both True. Logical operators allow you to
combine or modify conditions.
• and
• or
• not

These operators are summarized for two variables using the following truth tables:
Figure 245
And operator
The and statement is only True when both conditions are true. The or statement is True if one
condition, or both are True. The not statement outputs the opposite truth value.
Let's see how to determine if an album was released after 1979 (1979 is not included) and
before 1990 (1990 is not included). The time periods between 1980 and 1989 satisfy this
condition. This is demonstrated in the figure below. The green on lines a and b represents
periods where the statement is True. The green on line c represents where both conditions
are True, this corresponds to where the green regions overlap.
Figure 246
The block of code to perform this check is given by:
# Condition statement example
album_year = 1980
if (album_year > 1979) and (album_year < 1990):
    print("Album year was in between 1980 and 1989")
print("")
print("Do Stuff..")
Album year was in between 1980 and 1989

Do Stuff..

To determine if an album was released before 1980 (1979 and earlier) or after 1989 (1990 and
onward ), an or statement can be used. Periods before 1980 (1979 and earlier) or after 1989
(1990 and onward) satisfy this condition. This is demonstrated in the following figure, the color
green in a and b represents periods where the statement is true. The color green in c represents
where at least one of the conditions are true.
Figure 247
The block of code to perform this check is given by:

album_year = 1990
if (album_year < 1980) or (album_year > 1989):
    print("Album was not made in the 1980's")
else:
    print("The Album was made in the 1980's ")
Album was not made in the 1980's
Not operator
The not statement checks if the statement is false:

if not (album_year == 1984):
    print("Album year is not 1984")
Album year is not 1984

PRACTISE EXERCISES
1. There are 2 sisters, Annie and Jane, born in 1996 and 1999 respectively. They want to know
who was born in a leap year. Write an if-else statement to determine who was born in a leap
year.
Hint: A leap year is one that is divisible by 4

Annie = 1996
Jane = 1999
if Annie % 4 == 0:
    print("Annie was born in a leap year")
elif Jane % 4 == 0:
    print("Jane was born in a leap year")
else:
    print("None of them were born in a leap year")
Annie was born in a leap year
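The hint above is a simplification that works for 1996 and 1999. In the Gregorian calendar, a year divisible by 100 is a leap year only if it is also divisible by 400. The sketch below applies the full rule and checks each sister separately; the function name `is_leap_year` and the dictionary `birth_years` are illustrative names, not part of the lab.

```python
def is_leap_year(year):
    """Return True if year is a leap year under the full Gregorian rule."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


birth_years = {"Annie": 1996, "Jane": 1999}

# Each sister gets her own if-else, so both results are always reported.
for name, year in birth_years.items():
    if is_leap_year(year):
        print(name, "was born in a leap year")
    else:
        print(name, "was not born in a leap year")
```

Checking each sister independently also avoids a limitation of a single if/elif chain, which reports at most one name even if both were born in leap years.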
2. In a school canteen, children aged 9 or under are only given milk porridge for breakfast.
Children from 10 to 14 are given a sandwich, and children from 15 to 17 are given a burger.
The canteen master asks the age of the student and gives them breakfast accordingly.
Sam's age is 10. Use if-else statement to determine what the canteen master will offer to
him.
Hint: For each range of age, create an if condition and print what the student will
get according to their age
age = 10
if age <= 9:
    print("You will get a bowl of porridge!")
elif age >= 10 and age <= 14:
    print("You will get a sandwich!")
elif age >= 15 and age <= 17:
    print("You will get a burger!")
You will get a sandwich!
Congratulations, you have completed your lab on Conditions and Branching!

READING: CONDITIONS AND BRANCHING
CONDITIONS AND BRANCHING
OBJECTIVE:
In this reading, you'll learn about:
• Comparison operators
• Branching
• Logical operators
Comparison operations
Comparison operations are essential in programming. They help compare values and make
decisions based on the results.
Equality operator (==)
The equality operator checks if two values are equal. For example, in Python:
age = 25
if age == 25:
    print("You are 25 years old.")
Here, the code checks if the variable age is equal to 25 and prints a message accordingly.
Inequality operator
The inequality operator checks if two values are not equal:
if age != 30:
    print("You are not 30 years old.")
Here, the code checks if the variable age is not equal to 30 and prints a message accordingly.
Greater than and less than

You can also compare if one value is greater than or less than another.
if age >= 20:
    print("Yes, the age is greater than or equal to 20")
Here, the code checks if the variable age is greater than or equal to 20 and prints a message
accordingly.

Branching
Branching is like making decisions in your program based on conditions. Think of it as real-life
choices.
The IF statement
Consider a real-life scenario of entering a bar. If you're above a certain age, you can enter;
otherwise, you cannot.
age = 20
if age >= 21:
    print("You can enter the bar.")
else:
    print("Sorry, you cannot enter.")
Here, you are using the if statement to make a decision based on the age variable.
The elif Statement

Sometimes, there are multiple conditions to check. For example, if you're not old enough
for the bar, you can go to a movie instead.
if age >= 21:
    print("You can enter the bar.")
elif age >= 18:
    print("You can watch a movie.")
else:
    print("Sorry, you cannot do either.")
Real-life example: Automated Teller Machine (ATM)

When a user interacts with an ATM, the software in the ATM can use branching to make
decisions based on the user's input. For example, if the user selects "Withdraw Cash" the ATM
can branch into different denominations of bills to dispense based on the amount requested.
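user_choice = "Withdraw Cash"

if user_choice == "Withdraw Cash":
    # input() returns a string, so convert it to an integer before doing arithmetic
    amount = int(input("Enter the amount to withdraw: "))
    if amount % 10 == 0:
        dispense_cash(amount)  # dispense_cash() is assumed to be defined elsewhere
    else:
        print("Please enter a multiple of 10.")
else:
    print("Thank you for using the ATM.")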
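To make the ATM example runnable without a keyboard prompt, the branching can be wrapped in a function that takes the entered text as an argument. This is only a sketch under assumptions: `dispense_cash` and `withdraw` are hypothetical helpers, and the bill denominations (50, 20, 10) are invented for illustration rather than taken from the reading.

```python
def dispense_cash(amount):
    """Hypothetical helper: split amount into bills, largest denomination first."""
    bills = {}
    for denomination in (50, 20, 10):
        count, amount = divmod(amount, denomination)
        if count:
            bills[denomination] = count
    return bills


def withdraw(amount_text):
    """Branch on the entered amount, given as text the way input() returns it."""
    amount = int(amount_text)  # convert the text to an integer first
    if amount > 0 and amount % 10 == 0:
        return dispense_cash(amount)
    else:
        return "Please enter a multiple of 10."


print(withdraw("80"))
print(withdraw("45"))
```

In an interactive session, `withdraw(input("Enter the amount to withdraw: "))` would connect this to the prompt used in the reading.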
Logical operators
Logical operators help combine and manipulate conditions.
The NOT operator
Real-life example: Notification settings

In a smartphone's notification settings, you can use the NOT operator to control when to send
notifications. For example, you might only want to receive notifications when your phone is not
in "Do Not Disturb" mode.
The not operator negates a condition.
is_do_not_disturb = True
if not is_do_not_disturb:
    send_notification("New message received")

The AND operator
Real-life example: Access control

In a secure facility, you can use the AND operator to check multiple conditions for access. To open
a high-security door, a person might need both a valid ID card and a matching fingerprint.
The AND operator checks if all required conditions are true, like needing both keys to open a
safe.
has_valid_id_card = True
has_matching_fingerprint = True
if has_valid_id_card and has_matching_fingerprint:
    open_high_security_door()
The OR operator
Real-life example: Movie night decision

When planning a movie night with friends, you can use the OR operator to decide on a movie
genre. You'll choose a movie if at least one person is interested. The OR operator checks if at least
one condition is true. It's like choosing between different movies to watch.
friend1_likes_comedy = True
friend2_likes_action = False
friend3_likes_drama = False
if friend1_likes_comedy or friend2_likes_action or friend3_likes_drama:
    choose_a_movie()
SUMMARY
In this reading, you delved into the most frequently used operators and the concept of conditional
branching, which encompasses the utilization of if statements and if-else statements.

PRACTICE QUIZ: CONDITIONS AND BRANCHING
Question 1
What is the outcome of the following? 1=2
o ValueError: invalid literal for int()
o SyntaxError: can't assign to literal
o True
o False
Correct: This statement results in a syntax error.
Question 2
What is the output of the following code segment?
i=6
i<5
o False
o True
Correct: 6 is not less than 5.
Question 3
True or False. The equality operator is case-sensitive in the following code segment.
'a'=='A'
o True
o False
Correct: The equality operator is case-sensitive.
Question 4
Which of the following best describes the purpose of the 'elif' statement in a conditional
structure?
o It describes the end of a conditional structure.
o It describes a condition to test for if any one of the conditions has not been met.
o It defines the condition in case the preceding conditions in the if statement are not
fulfilled.
o It describes a condition to test if all other conditions have failed.
Correct! The 'elif' statement is evaluated only when none of the prior
conditions have been met.

VIDEO 011: LOOPS (6:45)
In this video, we will cover loops, in particular for loops and while loops. We will use many visual
examples in the video. See the labs for examples with data.
Figure 248
Before we talk about loops, let's go over the range function. The range function outputs an
ordered sequence. If the input is a positive integer, the output is a sequence that
contains the same number of elements as the input but starts at zero.
Figure 249
For example, if the input is three the output is the sequence zero, one, two.
Figure 250
If the range function has two inputs where the first input is smaller than the second input, the
output is a sequence that starts at the first input. Then the sequence iterates up to but not
including the second number.
Figure 251

For the inputs 10 and 15, we get the sequence 10, 11, 12, 13, 14. See the labs for more capabilities of the
range function. Please note, if you use Python 3, the range function will not generate a list
explicitly like in Python 2.
Figure 252
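As a quick check of the behavior described above, the following sketch converts two range objects to lists so that their contents can be inspected. The variable names are illustrative only.

```python
# range with one input: starts at 0 and stops before the input
one_input = list(range(3))        # [0, 1, 2]

# range with two inputs: starts at the first, stops before the second
two_inputs = list(range(10, 15))  # [10, 11, 12, 13, 14]

print(one_input)
print(two_inputs)
```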
FOR LOOPS
In this section, we will cover for loops. We will focus on lists, but many of the procedures can be
used on tuples.
Figure 253
Loops perform a task over and over. Consider the group of colored squares. Let's say we would
like to replace each colored square with a white square. Let's give each square a number to make
things a little easier and refer to the whole group of squares as squares. If we wanted to tell
someone to replace square zero with a white square, we would say: replace square zero
with a white square.
Figure 254

Or we can say: for square zero in squares, square zero equals white square.
Figure 255
Similarly, for the next square we can say for square one in squares, square one equals white
square.
Figure 256
For the next square we can say for square two in squares, square two equals white square.
Figure 257
We repeat the process for each square.
Figure 258

The only thing that changes is the index of the square we are referring to.
Figure 259
If we're going to perform a similar task in Python we cannot use actual squares.
Figure 260
RANGE FUNCTION
So let's use a list to represent the boxes. Each element in the list is a string representing the
color. We want to change the name of the color in each element to white. Each element in the
list has the following index. This is the syntax to perform a loop in Python. Notice the indent.
Figure 261

The range function generates the sequence of indices. The code will simply repeat everything in the indent five
times. If you were to change the value to six, it would do it six times. The value of i is
incremented by one each time. In this segment, we change the i-th element of the list to the string
white.
Figure 262
The value of i is set to zero. Each iteration of the loop starts at the beginning of the indent. We
then run everything in the indent.
Figure 263
The first element in the list is set to white.
Figure 264

We then go to the start of the indent, we progress down each line. When we reach the line to
change the value of the list, we set the value of index one to white.
Figure 265
The value of i increases by one. We repeat the process for index two.
Figure 266
The process continues for the next index, until we've reached the final element.
Figure 267
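A minimal sketch of the loop walked through above. The figures are not reproduced here, so the five starting colors are an assumption; the loop sets every element to the string white.

```python
squares = ["red", "yellow", "green", "purple", "blue"]  # assumed starting colors

for i in range(5):
    # i takes the values 0, 1, 2, 3, 4 in turn
    squares[i] = "white"

print(squares)
```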

We can also iterate through a list or tuple directly in Python. We do not even need to use indices.
Here is the list squares. On each iteration, we pass one element of the list squares to the
variable square. Let's display the value of the variable square in this section. For the first
iteration, the value of square is red.
Figure 268
We then start the second iteration. For the second iteration, the value of square is yellow. We
then start the third iteration.
Figure 269
For the final iteration, the value of square is green.
Figure 270
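The direct iteration described above might look like the following sketch. The list seen is a hypothetical helper added only to record the value of square on each pass.

```python
squares = ["red", "yellow", "green"]
seen = []  # hypothetical helper: records each value of square

for square in squares:
    seen.append(square)
    print(square)
```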

ENUMERATE
A useful function for iterating data is enumerate. It can be used to obtain the index and the
element in the list. Let's use the box analogy with the numbers representing the index of each
square. This is the syntax to iterate through a list and provide the index of each element. We use
the list squares and use the names of the colors to represent the colored squares. The argument
of the function enumerate is the list. In this case squares.
Figure 271
The variable i is the index and the variable square is the corresponding element in the list.
Figure 272
Let's use the left part of the screen to display the different values of the variables square and i for
the various iterations of the loop.
Figure 273

For the first iteration, the value of the variable square is red, corresponding to the zeroth index, and
the value of i is zero.
Figure 274
For the second iteration, the value of the variable square is yellow, and the value of i
corresponds to its index, i.e., 1.
Figure 275
We repeat the process for the last index.
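A short sketch of the enumerate loop described above, using the same three colors. The list pairs is a hypothetical helper that stores each (index, element) combination for inspection.

```python
squares = ["red", "yellow", "green"]
pairs = []  # hypothetical helper: stores (index, element) tuples

for i, square in enumerate(squares):
    pairs.append((i, square))
    print(i, square)
```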
WHILE LOOPS
While loops are similar to for loops, but instead of executing a statement a set number of times,
a while loop will only run if a condition is met.
Figure 276

Let's say we would like to copy all the orange squares from the list squares to the list
new_squares, but we would like to stop if we encounter a non-orange square.
Figure 277
We don't know the value of the squares beforehand. We would simply continue the process
while the square is orange or see if the square equals orange.
Figure 278
If not, we would stop. For the first example, we would check if the square was orange.
Figure 279

It satisfies the conditions so we would copy the square.
Figure 280
We repeat the process for the second square. The condition is met.
Figure 281
So, we copy the square.
Figure 282

In the next iteration, we encounter a purple square. The condition is not met. So, we stop the
process. This is essentially what a while loop does.
Figure 283
Let's use the figure on the left to represent the code. We will use a list with the names of the
colors to represent the different squares.
Figure 284
We create an empty list, new_squares. In reality, the list is of indeterminate size.
We start the index at zero. The while statement will repeatedly execute the statements
within the indent until the condition inside the brackets is false.
Figure 285

We append the value of the first element of the list squares to the list new_squares. We increase
the value of i by one.
Figure 286
We append the value of the second element of the list squares to the list new_squares.
Figure 287
We increment the value of i. Now the value in the list squares is purple; therefore, the condition
for the while statement is false and we exit the loop.
Figure 288
Check out the labs for more examples of loops, many with real data.
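The while loop walked through above can be sketched as follows. The exact list in the figures is not reproduced here, so the colors after the first purple square are assumed. The bounds check on i is an addition, so that the sketch also stops safely when every square is orange.

```python
squares = ["orange", "orange", "purple", "orange", "blue"]  # assumed contents
new_squares = []
i = 0

# Copy squares until the first non-orange one (or the end of the list)
while i < len(squares) and squares[i] == "orange":
    new_squares.append(squares[i])
    i += 1

print(new_squares)
```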

READING: LOOPS
INTRODUCTION TO LOOPS IN PYTHON
Objectives
1. Understand Python loops.
2. Learn how a loop works.
3. Learn about the need for loops.
4. Utilize Python's range function.
5. Familiarize yourself with Python's enumerate function.
6. Apply while loops for conditional tasks.
7. Distinguish appropriate loop selection.
What is a Loop?
In programming, a loop is like a magic trick that allows a computer to do something over and
over again. Imagine you are a magician's assistant, and your magician friend asks you to pull a
rabbit out of a hat, but not just once - they want you to keep doing it until they tell you to stop.
That is what loops do for computers - they repeat a set of instructions as many times as needed.
How does a loop work?
Here's how it works in Python:
Figure 289

• Start: The for loop begins with the keyword for, followed by a variable that will take on
each value in a sequence.
• Condition: After the variable, you specify the keyword in and a sequence, such as a list
or a range, that the loop will iterate through.
• If Condition True:
1. The loop takes the first value from the sequence and assigns it to the variable.
2. The indented block of code following the loop header is executed using this value.
3. The loop then moves to the next value in the sequence and repeats the process until
all values have been used.
• Statement: Inside the indented block of the loop, you write the statements that you want
to repeat for each value in the sequence.
• Repeat: The loop continues to repeat the block of code for each value in the
sequence until there are no more values left.
• If Condition False:
1. Once all values in the sequence have been processed, the loop terminates
automatically.
2. The loop completes its execution, and the program continues to the next statement
after the loop.
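The steps above can be sketched roughly as follows. This is a simplified model of what a for loop does, written with the built-in iter and next functions; it is meant to illustrate the flow, not to replace the for statement. The names colors and visited are illustrative.

```python
colors = ["red", "green", "blue"]
visited = []

iterator = iter(colors)          # ask the sequence for an iterator
while True:
    try:
        color = next(iterator)   # take the next value, if there is one
    except StopIteration:
        break                    # no values left: the loop terminates
    visited.append(color)        # the loop body runs with this value

print(visited)
```

Writing `for color in colors:` followed by the indented body gives the same result with far less code.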
The Need for Loops
Think about when you need to count from 1 to 10. Doing it manually is easy, but what if you had
to count to a million? Typing all those numbers one by one would be a nightmare! This is where
loops come in handy. They help computers repeat tasks quickly and accurately without getting
tired.
Main Types of Loops
For Loops
For loops are like a superhero's checklist. A for loop in programming is a control structure that
allows the repeated execution of a set of statements for each item in a sequence, such as
elements in a list or numbers in a range, enabling efficient iteration and automation of tasks.
Syntax of for loop
for val in sequence:
    # statement(s) to be executed for each item in the sequence as part of the loop
Here is an example of For loop:

Imagine you're a painter, and you want to paint a beautiful rainbow with seven colors. Instead of
picking up each color one by one and painting the rainbow, you could tell a magical painter's
assistant to do it for you. This is what a basic for loop does in programming.
We have a list of colors.
colors = ["red", "orange", "yellow", "green", "blue", "indigo", "violet"]

Let's print each color's name on a new line using a for loop.
for color in colors:
    print(color)
In this example, the for loop picks each color from the colors list and prints it on the screen. You
don't have to write the same code for each color - the loop does it automatically!
Sometimes you do not want to paint a rainbow, but you want to count the number of steps to
reach your goal. A range-based for loop is like having a friendly step counter that helps you
reach your target.
Here is how you might use a for loop to count from 1 to 10:
for number in range(1, 11):
    print(number)
Here, the range(1, 11) generates a sequence from 1 to 10, and the for loop goes through
each number in that sequence, printing it out. It's like taking 10 steps, and you're guided by the
loop!
Range Function
The range function in Python generates an ordered sequence that can be used in loops. It is
commonly called with one or two arguments:
• If given one argument (e.g., range(11)), it generates a sequence starting from 0 up to (but
not including) the given number.
for number in range(11):
    print(number)
• If given two arguments (e.g., range(1, 11)), it generates a sequence starting from the
first argument up to (but not including) the second argument.
for number in range(1, 11):
    print(number)
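Beyond the one-argument and two-argument forms described above, range also accepts an optional third argument, the step. A short sketch, with illustrative variable names:

```python
# Step of 2: every second number from 0 up to (but not including) 11
evens = list(range(0, 11, 2))      # [0, 2, 4, 6, 8, 10]

# Negative step: count down from 5 to 1
countdown = list(range(5, 0, -1))  # [5, 4, 3, 2, 1]

print(evens)
print(countdown)
```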
The Enumerated For Loop

Have you ever needed to keep track of both the item and its position in a list? An enumerated
for loop comes to your rescue. It's like having a personal assistant who not only hands you the
item but also tells you where to find it.
Consider this example:
fruits = ["apple", "banana", "orange"]

for index, fruit in enumerate(fruits):
    print(f"At position {index}, I found a {fruit}")
With this loop, you not only get the fruit but also its position in the list. It's as if you have a
magical guide pointing out each fruit's location!

While Loops
While loops are like a sleepless night at a friend's sleepover. Imagine you and your friends keep
telling ghost stories until someone decides it's time to sleep. As long as no one says, "Let's sleep"
you keep telling stories. A while loop works similarly - it repeats a task as long as a certain
condition is true. It's like saying, "Hey computer, keep doing this until I say stop!"
Basic syntax of a while loop:
while condition:
    # Code to be executed while the condition is true
    # Indentation is crucial to indicate the scope of the loop
For example, here's how you might use a while loop to count from 1 to 10:
count = 1
while count <= 10:
    print(count)
    count += 1
Here's a breakdown of the above code.

1. There is a variable named count initialized with the value 1.
2. The while loop is used to repeatedly execute a block of code as long as a given
condition is True. In this case, the condition is count <= 10, meaning the loop will
continue as long as count is less than or equal to 10.
3. Inside the loop:
o The print(count) statement outputs the current value of the count variable.
o The count += 1 statement increments the value of count by 1. This step ensures
that the loop will eventually terminate when count becomes greater than 10.
4. The loop will continue executing as long as the condition count <= 10 is satisfied.
5. The loop will print the numbers 1 to 10 in consecutive order since the print statement is
inside the loop block and executed during each iteration.
6. Once count reaches 11, the condition count <= 10 will evaluate to False, and the loop will
terminate.
7. The output of the code will be the numbers 1 to 10, each printed on a separate line.
The Loop Flow
Both for and while loops have their special moves, but they follow a pattern:
• Initialization: You set up things like a starting point or conditions.
• Condition: You decide when the loop should keep going and when it should stop.
• Execution: You do the task inside the loop.
• Update: You make changes to your starting point or conditions to move forward.
• Repeat: The loop goes back to step 2 until the condition is no longer true.
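To make the pattern concrete, here is a small sketch that labels each stage in a while loop adding the numbers 1 to 5. The variable names total and n are illustrative.

```python
total = 0
n = 1              # Initialization: set the starting point
while n <= 5:      # Condition: decide whether to keep going
    total += n     # Execution: do the task
    n += 1         # Update: move the starting point forward
    # Repeat: control returns to the condition check

print(total)       # 15
```

In a for loop, the initialization, condition, and update are handled by the loop itself as it walks through the sequence.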

When to Use Each
For Loops: Use for loops when you know the number of iterations in advance and want to
process each element in a sequence. They are best suited for iterating over collections and
sequences where the length is known.
While Loops: Use while loops when you need to perform a task repeatedly as long as a certain
condition holds true. While loops are particularly useful for situations where the number of
iterations is uncertain or where you're waiting for a specific condition to be met.
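The following sketch contrasts the two cases under made-up data. The for loop walks a collection whose length is known, while the while loop keeps halving a value until a condition stops holding. All names and numbers are assumptions for illustration.

```python
# for loop: the collection, and so the number of iterations, is known
prices = [4.0, 2.5, 3.5]
total = 0.0
for price in prices:
    total += price

# while loop: keep halving until the value drops below 1
value = 100.0
halvings = 0
while value >= 1:
    value /= 2
    halvings += 1

print(total, halvings)
```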
SUMMARY
In this adventure into coding, we explored loops in Python - special tools that help us do things
over and over again without getting tired. We met two types of loops: "for loops" and
"while loops."
For Loops were like helpers that made us repeat tasks in order. We painted colors, counted
numbers, and even got a helper to tell us where things were in a list. For loops made our job
easier and made our code look cleaner.
While Loops were like detectives that kept doing something as long as a rule was true. They
helped us take steps, guess numbers, and work until we were tired. While loops were like smart
assistants that didn't stop until we said so.

HANDS-ON LAB: LOOPS
LOOPS IN PYTHON

OBJECTIVES
• work with the loop statements in Python, including for-loop and while-loop.
LOOPS IN PYTHON
Welcome! This notebook will teach you about the loops in the Python Programming Language.
By the end of this lab, you'll know how to use the loop statements in Python, including for loop,
and while loop.
TABLE OF CONTENTS
• Loops
o Range
o What is for loop?
o What is while loop?
• Quiz on Loops
Loops
Range
Sometimes, you might want to repeat a given operation many times. Repeated executions like
this are performed by loops. We will look at two types of loops, for loops and while
loops.
Before we discuss loops, let's discuss the range object. It is helpful to think of the range object as
an ordered list. For now, let's look at the simplest case. If we would like to generate an object
that contains elements ordered from 0 to 2, we simply use the following command:
# Use the range

range(3)
range(0, 3)
Figure 290
NOTE: While in Python 2.x it returned a list as seen in video lessons, in 3.x it
returns a range object.
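As a small illustration of the note above, a range object in Python 3 can be converted to a list, and it also supports len, indexing, and membership tests directly. The variable names are illustrative.

```python
r = range(3)

as_list = list(r)   # [0, 1, 2]
length = len(r)     # 3
last = r[2]         # 2
has_two = 2 in r    # True

print(r, as_list, length, last, has_two)
```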

What is for loop?
The for loop enables you to execute a code block multiple times. For example, you would use
this if you would like to print out every element in a list. Let's try to use a for loop to
print all the years presented in the list dates:
This can be done as follows:
# For loop example

dates = [1982,1980,1973]
N = len(dates)
for i in range(N):
    print(dates[i])
1982
1980
1973
The code in the indent is executed N times; the value of i is increased by 1 after each
execution. The statement executed is to print out the value in the list at index i as shown here:
Figure 291
In this example we can print out a sequence of numbers from 0 to 7:
# Example of for loop

for i in range(0, 8):
    print(i)
0
1
2
3
4
5
6
7
In Python we can directly access the elements in the list as follows:
# Example of for loop, loop through list

for year in dates:
    print(year)
1982
1980
1973

For each iteration, the value of the variable year behaves like the value of dates[i] in the first
example:
Figure 292
Figure 293
Figure 294
Figure 295

We can change the elements in a list:
# Use for loop to change the elements in list

squares = ['red', 'yellow', 'green', 'purple', 'blue']
for i in range(0, 5):
    print("Before square", i, 'is', squares[i])
    squares[i] = 'white'
    print("After square", i, 'is', squares[i])
Before square 0 is red
After square 0 is white
Before square 1 is yellow
After square 1 is white
Before square 2 is green
After square 2 is white
Before square 3 is purple
After square 3 is white
Before square 4 is blue
After square 4 is white
We can access the index and the elements of a list as follows:
# Loop through the list and iterate on both index and element value
squares=['red', 'yellow', 'green', 'purple', 'blue']
for i, square in enumerate(squares):
    print(i, square)
0 red
1 yellow
2 green
3 purple
4 blue
What is while loop?

As you can see, the for loop is used for a controlled flow of repetition. However, what if we don't
know when we want to stop the loop? What if we want to keep executing a code block until a
certain condition is met? The while loop exists as a tool for repeated execution based on a
condition. The code block will keep being executed until the given logical condition returns a
False boolean value.
Here's how a while loop works:
• First, you specify a condition that the loop will check before each iteration (repetition)
of the code block.
• If the condition is initially true, the code block is executed.
• After executing the code block, the condition is checked again.
• If the condition is still true, the code block is executed again.
• The check and the execution repeat until the condition becomes false.
• Once the condition becomes false, the loop stops, and the program continues with
the next line of code after the loop.
Here's an example of a while loop that prints numbers from 1 to 5:
count = 1
while count <= 5:
    print(count)
    count += 1
1
2
3
4
5

In this example, the condition count <= 5 is checked before each iteration. As long as count is
less than or equal to 5, the code block inside the loop is executed. After each iteration, the value
of count is incremented by 1 using count += 1. Once count reaches 6, the condition becomes false,
and the loop stops.
Let’s say we would like to iterate through list dates and stop at the year 1973, then print out the
number of iterations. This can be done with the following block of code:
# While Loop Example

dates = [1982, 1980, 1973, 2000]
i = 0
year = dates[0]
while(year != 1973):
    print(year)
    i = i + 1
    year = dates[i]
print("It took", i, "repetitions to get out of loop.")

1982
1980
It took 2 repetitions to get out of loop.
A while loop iterates merely until the condition in the argument is not met, as shown in the
following figure:
Figure 296
Key points of the while loop:

1. A while loop repeatedly executes a block of code as long as a given condition is true.
2. It does not have a fixed number of iterations but continues executing until the
condition becomes false.
3. The condition is checked before each iteration, and if it's false initially, the code
block is skipped entirely.
4. The condition is typically based on a variable or expression that can change during the
execution of the loop.
5. It provides more flexibility in terms of controlling the loop's execution based on
dynamic conditions.

Key points of the for loop:
1. A for loop iterates over a sequence (such as a list, string, or range) or any object that
supports iteration.
2. It has a predefined number of iterations based on the length of the sequence or the
number of items to iterate over.
3. It automatically handles the iteration and does not require maintaining a separate
variable for tracking the iteration count.
4. It simplifies the code by encapsulating the iteration logic within the loop itself.
5. It is commonly used when you know the exact number of iterations or need to iterate
over each item in a collection.
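The key points above can be illustrated with a short sketch that performs the same task both ways, then shows a while loop whose condition is false from the start. The helper lists for_years and while_years and the counter runs are hypothetical names.

```python
dates = [1982, 1980, 1973]

# for loop: the iteration is handled automatically
for_years = []
for year in dates:
    for_years.append(year)

# while loop: we maintain the index variable ourselves
while_years = []
i = 0
while i < len(dates):
    while_years.append(dates[i])
    i += 1

# A while loop whose condition is false initially is skipped entirely
runs = 0
while runs > 0:
    runs += 1

print(for_years, while_years, runs)
```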
Practise Exercises on Loops

Write a for loop that prints out all the elements between -5 and 5 using the range function.

for i in range(-4, 5):
    print(i)
-4
-3
-2
-1
0
1
2
3
4
Print the elements of the following list: Genres=[ 'rock', 'R&B', 'Soundtrack',
'R&B', 'soul', 'pop'] Make sure you follow Python conventions.

Genres = ['rock', 'R&B', 'Soundtrack', 'R&B', 'soul', 'pop']
for Genre in Genres:
    print(Genre)
rock
R&B
Soundtrack
R&B
soul
pop
Write a for loop that prints out the following list: squares=['red', 'yellow', 'green',
'purple', 'blue']
squares=['red', 'yellow', 'green', 'purple', 'blue']

for square in squares:
    print(square)
red
yellow
green
purple
blue

Write a while loop to display the values of the Rating of an album playlist stored in the list
PlayListRatings. If the score is less than 6, exit the loop. The list
PlayListRatings is given by: PlayListRatings = [10, 9.5, 10, 8, 7.5, 5, 10,
10]

PlayListRatings = [10, 9.5, 10, 8, 7.5, 5, 10, 10]
i = 0
# Check the index first so the loop cannot read past the end of the list
while i < len(PlayListRatings) and PlayListRatings[i] >= 6:
    print(PlayListRatings[i])
    i = i + 1
10
9.5
10
8
7.5
Write a while loop to copy the strings 'orange' of the list squares to the
list new_squares. Stop and exit the loop if the value on the list is not 'orange':
squares = ['orange', 'orange', 'purple', 'blue ', 'orange']

new_squares = []
i = 0
while(i < len(squares) and squares[i] == 'orange'):
    new_squares.append(squares[i])
    i = i + 1
print(new_squares)
['orange', 'orange']
Some real-life problems!

Your little brother has just learned multiplication tables in school. Today he has learned tables
of 6 and 7. Help him memorise both the tables by printing them using for loop.
Hint: Write two for loops. One to print the multiplication table of 6 and the other for
7
# Write your code here

print("Multiplication table of 6:")
for i in range(10):
    print("6*", i, "=", 6*i)
print("Multiplication table of 7:")
for i in range(10):
    print("7*", i, "=", 7*i)
Multiplication table of 6:
6* 0 = 0
6* 1 = 6
6* 2 = 12
6* 3 = 18
6* 4 = 24
6* 5 = 30
6* 6 = 36
6* 7 = 42
6* 8 = 48
6* 9 = 54
Multiplication table of 7:
7* 0 = 0

7* 1 = 7
7* 2 = 14
7* 3 = 21
7* 4 = 28
7* 5 = 35
7* 6 = 42
7* 7 = 49
7* 8 = 56
7* 9 = 63
The following is a list of animals in a National Zoo:
Animals = ["lion", "giraffe", "gorilla", "parrots", "crocodile", "deer", "swan"]
Your brother needs to write an essay on the animals whose names are made of 7 letters. Help
him find those animals through a while loop and create a separate list of such animals.
Hint: Use a while loop to iterate over the elements of the list. Use an if statement
inside the while loop to check the length of each element and, if the length is 7, add the
element to a new list.
# Write your code here

Animals = ["lion", "giraffe", "gorilla", "parrots", "crocodile","deer", "swan"]
New = []
i = 0
while i < len(Animals):
    j = Animals[i]
    if len(j) == 7:
        New.append(j)
    i = i + 1
print(New)
['giraffe', 'gorilla', 'parrots']
Congratulations, you have completed the lab on loops.

PRACTICE QUIZ: LOOPS
Question 1
What will be the result of the following?
for x in range(0, 3):
    print(x)
o
1
2
3
o
0
1
2
3
o
0
1
o
0
1
2
Question 2
What is the output of the following:
for x in ['A','B','C']:
    print(x+'A')
o AA
BB
CC
o A
B
C
o A
B
C
A
o AA
BA
CA

Question 3
What is the output of the following:
for i,x in enumerate(['A','B','C']):
    print(i,x)
o A0
B1
C2
o AA
BB
CC
o 0 A
1 B
2 C
o 0
1
2

VIDEO 012: FUNCTIONS (13:31)
In this video, we will cover functions. You will learn how to use some of Python's built-in
functions as well as how to build your own functions.
Figure 297
Functions take some input then produce some output or change.
Figure 298
The function is just a piece of code you can reuse. You can implement your own
function, but in many cases, you use other people’s functions.
Figure 299

In this case, you just have to know how the function works and in some cases how to import the
functions.
Figure 300
Let the orange and yellow squares represent similar blocks of code.
Figure 301
We can run the code using some input and get an output. If we define a function to do the task
we just have to call the function. Let the small squares represent the lines
of code used to call the function.
Figure 302

We can replace these long lines of code by just calling the function a few times.
Figure 303
Now we can just call the function.
Figure 304
Our code is much shorter.
Figure 305

The code performs the same task.
Figure 306
You can think of the process like this:
Figure 307
When we call the function f1, we pass an input to the function. These values are passed
to all those lines of code you wrote.
Figure 308

This returns a value. You can use the value.
Figure 309
For example, you can input this value to a new function f2. When we call this new function f2,
the value is passed to another set of lines of code.
Figure 310
The function returns a value.
Figure 311

The process is repeated passing the values to the function you call.
Figure 312
You can save these functions and reuse them or use other people’s functions.
Figure 313

PYTHON BUILT-IN FUNCTIONS
Python has many built-in functions; you don't have to know how those functions work internally,
but simply what task those functions perform.
Figure 314
LEN
The function len takes in an input of type sequence, such as a string or list, or type collection,
such as a dictionary or set, and returns the length of that sequence or collection.
Figure 315
Consider the following list.
Figure 316

The len function takes this list as an argument, and we assign the result to the variable L. The
function determines there are 8 items in the list, then returns the length of the list, in this case,
8.
Figure 317
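A brief sketch of len on the kinds of inputs mentioned above. The figure is not reproduced here, so the eight list values are assumed; the other inputs are made up for illustration.

```python
album_ratings = [10.0, 8.5, 9.5, 7.0, 7.0, 9.5, 9.0, 9.5]  # assumed values

L = len(album_ratings)            # 8 items in the list
name_length = len("Michael")      # 7 characters in the string
num_keys = len({"a": 1, "b": 2})  # 2 keys in the dictionary
num_unique = len({1, 2, 2, 3})    # 3 distinct elements in the set

print(L, name_length, num_keys, num_unique)
```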
SUM
The function sum takes in an iterable like a tuple or list and returns the total of all the elements.
Consider the following list. We pass the list into the sum function and assign the result to the
variable S. The function determines the total of all the elements, then returns it, in this case, the
value is 70.
Figure 318
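A minimal sketch of sum. The list values are assumed, chosen so that the total matches the 70 mentioned above.

```python
album_ratings = [10.0, 8.5, 9.5, 7.0, 7.0, 9.5, 9.0, 9.5]  # assumed values

S = sum(album_ratings)        # total of all the elements
tuple_total = sum((1, 2, 3))  # sum also works on tuples

print(S, tuple_total)
```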
SORTED
There are two ways to sort a list:
1. The first is using the function sorted.
2. We can also use the list method sort. (Methods are similar to functions).
Let's use this as an example to illustrate the difference. The function sorted returns a new sorted
list.
Consider the list album_ratings. We can apply the function sorted to the list
album_ratings and get a new list sorted_album_rating. The result is a new sorted list.

Figure 319
If we look at the list album_ratings, nothing has changed. Generally, functions take an input,
in this case, a list. They produce a new output, in this instance, a sorted list.
Figure 320
SORT
If we use the method sort, the list album_ratings will change and no new list will be
created. Let's use this diagram to help illustrate the process. In this case, the rectangle
represents the list album_ratings.
Figure 321

When we apply the method sort to the list, the list album_ratings changes. Unlike the previous
case, we see that the list album_ratings has changed. In this case, no new list is created.
Figure 322
Figure 323
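The difference between the function sorted and the method sort can be sketched as follows. The rating values are assumed, and original_after_sorted is a hypothetical helper that captures the list before it is sorted in place.

```python
album_ratings = [10.0, 8.5, 9.5, 7.0]  # assumed values

# sorted returns a new list and leaves the original untouched
sorted_album_rating = sorted(album_ratings)
original_after_sorted = list(album_ratings)  # copy: still in the original order

# sort changes the list in place and returns None
result = album_ratings.sort()

print(sorted_album_rating)
print(original_after_sorted)
print(album_ratings, result)
```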
MAKING FUNCTIONS
Now that we have gone over how to use functions in Python, let’s see how to build our own
functions.
Figure 324

BUILDING FUNCTIONS IN PYTHON
We will now get you started on building your own functions in Python. This is an example of a
function in Python that returns its input value + 1. To define a function, we start with the
keyword def.
Figure 325
The name of the function should be descriptive of what it does.
Figure 326
We have the function formal parameter "a" in parentheses, followed by a colon.
Figure 327

We have a code block with an indent, for this case, we add 1 to "a" and assign it to b.
Figure 328
We return or output the value for b.
Figure 329
After we define the function, we can call it. The function will add 1 to 5 and return a
6. We can call the function again; this time we assign the result to the variable c. The value for c is 11.
Figure 330
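Putting the pieces above together, the function might look like the sketch below. The figure is not reproduced here, so the function name add1 and the second input, 10, are assumptions inferred from the results described.

```python
def add1(a):
    # Add 1 to the input and return the result
    b = a + 1
    return b

result = add1(5)   # 6
c = add1(10)       # 11

print(result, c)
```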

UNDERSTANDING FUNCTION CALL PROCESS
Let's explore this further. Let's go over an example when you call a function. It should be noted
that this is a simplified model of Python, and Python does not work like this under the hood. We
call the function giving it an input, 5. It helps to think of the value of 5 as being passed to the
function.
Figure 331
Now the sequences of commands are run, the value of "a" is 5. "b" would be assigned a value of
6. We then return the value of b, in this case, as b was assigned a value of 6, the function returns
a 6.
Figure 332
If we call the function again, the process starts from scratch; we pass in an 8. The subsequent
operations are performed. Everything that happened in the last call will happen again with a
different value of "a". The function returns a value, in this case, 9. Again, this is just a helpful
analogy.
Figure 333

ADD DOCUMENTATION TO FUNCTIONS
Let’s try and make this function more complex. It's customary to document the function on the
first few lines; this tells anyone who uses the function what it does. This documentation is
surrounded in triple quotes. You can use the help command on the function to display the
documentation as follows. This will print out the function name and the documentation. We will
not include the documentation in the rest of the examples.
Figure 334
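A short sketch of a documented function. The function name add1 and the wording of the documentation are assumptions. The sketch reads the documentation through the `__doc__` attribute; calling help(add1) in an interactive session displays the function name together with the same text.

```python
def add1(a):
    """
    Add 1 to a and return the result.
    """
    b = a + 1
    return b

# help(add1) would display the name and this documentation interactively
doc = add1.__doc__
print(doc)
```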
MULTIPLE PARAMETERS
A function can have multiple parameters. The function mult multiplies two numbers; in other
words, it finds their product. If we pass the integers 2 and 3, the result is a new integer. If we
pass the integer 10 and the float 3.14, the result is a float 31.4.
Figure 335
MULTIPLYING INTEGER AND STRING IN FUNCTIONS

If we pass in the integer 2 and the string “Michael Jackson,” the string Michael Jackson is repeated
2 times. This is because the multiplication symbol can also mean repeat a sequence. If you
accidentally multiply an integer with a string instead of 2 integers, you won’t get an error.
Instead, you will get a string, and your program will progress, potentially failing later because
you have a string where you expected an integer. This property will make coding simpler, but you
must test your code more thoroughly.
Figure 336
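The behavior of mult with different argument types can be sketched as follows. The body of mult is assumed from the description above; the trailing space in the string is added only to make the repetition easier to read.

```python
def mult(a, b):
    # Works for numbers, and also repeats a sequence when one argument is an int
    c = a * b
    return c

int_result = mult(2, 3)                 # 6
float_result = mult(10, 3.14)           # a float, approximately 31.4
repeated = mult(2, "Michael Jackson ")  # the string repeated twice

print(int_result, float_result, repeated)
```

If a numeric result is required, one option is to check the argument types before multiplying.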

FUNCTIONS WITHOUT A RETURN STATEMENT
In many cases, a function does not have a return statement. In these cases, Python will return
the special “None” object. Practically speaking, if your function has no return statement, you can
treat it as if the function returns nothing at all. The function MJ simply prints the name "Michael
Jackson". We call the function. The function prints "Michael Jackson."
Figure 337
FUNCTIONS WITH AN EMPTY BODY

Let's define the function “NoWork” that performs no task. Python doesn’t allow a function to
have an empty body, so we can use the keyword pass, which doesn’t do anything, but satisfies
the requirement of a non-empty body. If we call the function and print it out, the function returns
a None. In the background, if the return statement is not called, Python will automatically return
a None. It is helpful to view the function NoWork with the following return statement.
Figure 338
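Both cases described above can be checked with a small sketch. The function bodies are assumed from the description; the variables mj_result and no_work_result are illustrative names for the returned values.

```python
def MJ():
    # No return statement: the call evaluates to None
    print("Michael Jackson")

def NoWork():
    # pass does nothing but satisfies the need for a non-empty body
    pass

mj_result = MJ()
no_work_result = NoWork()

print(mj_result)       # None
print(no_work_result)  # None
```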
FUNCTIONS PERFORMING MULTIPLE TASKS

Usually, functions perform more than one task. This function prints a statement then returns a
value. Let's use this table to represent the different values as the function is called. We call the
function with an input of 2. We find the value of b. The function prints the statement with the
values of a and b. Finally, the function returns the value of b, in this case, 3.
Figure 339

USING LOOPS IN FUNCTIONS
We can use loops in functions. This function prints out the values and indexes of a list or tuple.
We call the function with the list album ratings as an input. Let's display the list on the right with
its corresponding index. The parameter stuff is used as the input to the function enumerate, which
passes the index to i and the value in the list to s on each iteration. The function begins to
iterate through the loop and prints the first index and the first value in the list. On the next
iteration, the values of i and s are updated and the print statement is reached again, so the next
index and value are printed. The process repeats until the final values in the list are printed out.
Figure 340
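The loop described above might look like the following sketch. The function name print_stuff, the printed wording, and the sample ratings are assumptions; only stuff, i, s, and enumerate come from the narration.

```python
def print_stuff(stuff):
    """Print the index and value of every element in a list or tuple."""
    for i, s in enumerate(stuff):
        # enumerate yields (index, value) pairs, unpacked into i and s
        print("Album", i, "rating is", s)

album_ratings = [10.0, 8.5, 9.5]  # assumed sample data
print_stuff(album_ratings)
```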
COLLECTING ARGUMENTS
Variadic parameters allow us to input a variable number of elements. Consider the following
function: the function has an asterisk on the parameter names. When we call the function, three
arguments are packed into the tuple names. We then iterate through the tuple; the values are
printed out accordingly.
Figure 341
If we call the same function with only two arguments as inputs, the tuple names contains only
two elements. As a result, only two values are printed out.
Figure 342
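A minimal sketch of the variadic function described above follows. The function name artist_names and the sample strings are assumptions, and the return statement is added here only so the packed tuple can be inspected.

```python
def artist_names(*names):
    """Accept any number of positional arguments, packed into the tuple names."""
    for name in names:
        print(name)
    return names  # not part of the narrated function; returned for inspection

three = artist_names("Michael Jackson", "AC/DC", "Pink Floyd")
two = artist_names("Michael Jackson", "AC/DC")
print(type(three), len(three), len(two))
```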

SCOPE
Scope: Global Variables
The scope of a variable is the part of the program where that variable is accessible. Variables
that are defined outside of any function are said to be within the global scope, meaning they can
be accessed anywhere after they are defined. Here we have a function that adds the string DC to
the parameter x. When we reach the part where the value of x is set to AC, this is within the
global scope, meaning x is accessible anywhere after it is defined. A variable defined in the
global scope is called a global variable. When we call the function, we enter a new scope, the
scope of AddDC. We pass x as an argument to the AddDC function, in this case, AC. Within the
scope of the function, the value of x is set to ACDC. The function returns this value, which is
assigned to z. Within the global scope, the value of z is set to ACDC.
Figure 343
After the value is returned, the scope of the function is deleted.
Figure 344
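The walkthrough above can be sketched as follows. The body of AddDC is reconstructed from the narration and may differ in detail from the figure.

```python
def AddDC(x):
    # This x is a local name inside AddDC; reassigning it does not touch the global x
    x = x + "DC"
    return x

x = "AC"      # global variable
z = AddDC(x)  # the returned value is assigned to z in the global scope

print(x)  # AC
print(z)  # ACDC
```

The global x keeps the value AC because the assignment inside the function only rebinds the local name.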

Scope: Local Variables
Local variables only exist within the scope of a function. Consider the function Thriller; the local
variable Date is set to 1982. When we call the function, we create a new scope. Within that scope
of the function, the value of the date is set to 1982. The value of date does not exist within the
global scope.
Figure 345
Variables inside the global scope can have the same name as variables in the local scope with no
conflict. Consider the function Thriller; the local variable Date is set to 1982. The global variable
date is set to 2017. When we call the function, we create a new scope. Within that scope, the
value of the date is set to 1982. If we call the function, it returns the value of Date in the local
scope, in this case, 1982. When we print in the global scope, we use the global variable value.
The global value of the variable is 2017. Therefore, the printed value is 2017.
Figure 346
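A short sketch of the same-name case described above, assuming Thriller simply sets and returns its local Date (the figure's code is not shown in the text):

```python
def Thriller():
    Date = 1982  # local variable: exists only while Thriller runs
    return Date

Date = 2017  # global variable with the same name

local_value = Thriller()
print(local_value)  # 1982, the local value returned by the function
print(Date)         # 2017, the global value is unaffected
```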

Scope: Variable
If a variable is not defined within a function, Python will check the global scope. Consider the
function ACDC. The function uses the variable Rating without assigning it a value. If we define
the variable Rating in the global scope, then call the function, Python will see there is no value
for the variable Rating within the function. As a result, Python will leave the local scope and
check if the variable Rating exists in the global scope. It will use this global value of Rating
within the scope of ACDC. The function will print out a 9. The value of z in the global scope will
be 10, as we added 1. The value of Rating will be unchanged within the global scope.
Figure 347
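The lookup described above can be sketched like this. The parameter y and the exact body of ACDC are assumptions inferred from the narration (a 9 is printed and z becomes 10 after adding 1).

```python
def ACDC(y):
    # Rating is not assigned inside the function, so Python looks it up globally
    print(Rating)
    return Rating + y

Rating = 9
z = ACDC(1)

print(z)       # 10
print(Rating)  # 9, unchanged in the global scope
```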
Consider the function PinkFloyd. If we define the variable ClaimedSales with the keyword
global, the variable will be a global variable. We call the function PinkFloyd. The variable
ClaimedSales is set to the string “45 million” in the global scope. When we print the variable,
we get a value of “45 million.”
Figure 348
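A minimal sketch of the global keyword as described above, assuming PinkFloyd does nothing except declare and set ClaimedSales:

```python
def PinkFloyd():
    global ClaimedSales          # bind the name in the global scope, not locally
    ClaimedSales = "45 million"

PinkFloyd()          # the global variable is created when the function runs
print(ClaimedSales)  # 45 million
```

Note that ClaimedSales does not exist until PinkFloyd has been called at least once.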
There is a lot more you can do with functions. Check out the lab for more examples.

READING: EXPLORING PYTHON FUNCTIONS
EXPLORING PYTHON FUNCTIONS
OBJECTIVES:
By the end of this reading, you should be able to:
1. Describe the function concept and the importance of functions in programming
2. Write a function that takes inputs and performs tasks
3. Use built-in functions like len(), sum(), and others effectively
4. Define and use your functions in Python
5. Differentiate between global and local variable scopes
6. Use loops within the function
7. Modify data structures using functions
INTRODUCTION TO FUNCTIONS
A function is a fundamental building block that encapsulates specific actions or computations.
As in mathematics, where functions take inputs and produce outputs, programming functions
perform similarly. They take inputs, execute predefined actions or calculations, and then return
an output.
Purpose of functions
Functions promote code modularity and reusability. Imagine you have a task that needs to be
performed multiple times within a program. Instead of duplicating the same code at various
places, you can define a function once and call it whenever you need that task. This reduces
redundancy and makes the code easier to manage and maintain.
Benefits of using functions
Modularity: Functions break down complex tasks into manageable components

Reusability: Functions can be used multiple times without rewriting code
Readability: Functions with meaningful names enhance code understanding
Debugging: Isolating functions eases troubleshooting and issue fixing
Abstraction: Functions simplify complex processes behind a user-friendly interface
Collaboration: Team members can work on different functions concurrently
Maintenance: Changes made in a function automatically apply wherever it's used

HOW FUNCTIONS TAKE INPUTS, PERFORM TASKS, AND PRODUCE OUTPUTS
Inputs (Parameters)
Functions operate on data, and they can receive data as input. These inputs are known as
parameters or arguments. Parameters provide functions with the necessary information they
need to perform their tasks. Consider parameters as values you pass to a function, allowing it to
work with specific data.
Performing tasks
Once a function receives its input (parameters), it executes predefined actions or computations.
These actions can include calculations, operations on data, or even more complex tasks. The
purpose of a function determines the tasks it performs. For instance, a function could calculate
the sum of numbers, sort a list, format text, or fetch data from a database.
Producing outputs
After performing its tasks, a function can produce an output. This output is the result of the
operations carried out within the function. It's the value that the function “returns” to the code
that called it. Think of the output as the end product of the function's work. You can use this
output in your code, assign it to variables, pass it to other functions, or even print it out for display.
Example:
Consider a function named calculate_total that takes two numbers as input (parameters), adds
them together, and then produces the sum as the output. Here's how it works:
def calculate_total(a, b):  # Parameters: a and b
    total = a + b  # Task: Addition
    return total  # Output: Sum of a and b

result = calculate_total(5, 7)  # Calling the function with inputs 5 and 7
print(result)  # Output: 12
PYTHON'S BUILT-IN FUNCTIONS

Python has a rich set of built-in functions that provide a wide range of functionalities. These
functions are readily available for you to use, and you don't need to be concerned about how
they are implemented internally. Instead, you can focus on understanding what each function
does and how to use it effectively.
Using built-in functions or Pre-defined functions
To use a built-in function, you simply call the function's name followed by parentheses. Any
required arguments or parameters are passed into the function within these parentheses. The
function then performs its predefined task and may return an output you can use in your code.

Here are a few examples of commonly used built-in functions:
len()
len(): Calculates the length of a sequence or collection
string_length = len("Hello, World!") # Output: 13

list_length = len([1, 2, 3, 4, 5]) # Output: 5
sum()
sum(): Adds up the elements in an iterable (list, tuple, and so on)
total = sum([10, 20, 30, 40, 50]) # Output: 150
max()
max(): Returns the maximum value in an iterable
highest = max([5, 12, 8, 23, 16]) # Output: 23
min()
min(): Returns the minimum value in an iterable
lowest = min([5, 12, 8, 23, 16]) # Output: 5
Python's built-in functions offer a wide array of functionalities, from basic operations like len()
and sum() to more specialized tasks.
DEFINING YOUR FUNCTIONS

Defining a function is like creating your mini-program:
1. Use def followed by the function name and parentheses
Here is the syntax to define a function:
def function_name():
    pass
A "pass" statement in a programming function is a placeholder or a no-op (no operation)

statement. Use it when you want to define a function or a code block syntactically but do not
want to specify any functionality or implementation at that moment.
• Placeholder: "pass" acts as a temporary placeholder for future code that you intend to
write within a function or a code block.
• Syntax Requirement: Python does not allow an empty function body or an empty
conditional block, so "pass" is needed wherever a block must exist but has nothing to
do yet. It ensures that the code remains syntactically correct.
• No Operation: "pass" itself doesn't perform any meaningful action. When the
interpreter encounters “pass”, it simply moves on to the next statement without
executing any code.

Function Parameters:
• Parameters are like inputs for functions

• They go inside parentheses when defining the function
• Functions can have multiple parameters
Example:
def greet(name):
    print("Hello, " + name)

greet("Alice")  # Output: Hello, Alice
Docstrings (Documentation Strings)

• Docstrings explain what a function does
• Placed inside triple quotes under the function definition
• Helps other developers understand your function
Example:
def multiply(a, b):
    """
    This function multiplies two numbers.
    Input: a (number), b (number)
    Output: Product of a and b
    """
    print(a * b)

multiply(2, 6)  # Output: 12
Return statement
• Return gives back a value from a function

• Ends the function's execution and sends the result
• A function can return various types of data
Example:
def add(a, b):
    return a + b

sum_result = add(3, 5)  # sum_result gets the value 8

Understanding scopes and variables
Scope is where a variable can be seen and used:

• Global Scope: Variables defined outside functions; accessible everywhere
• Local Scope: Variables inside functions; only usable within that function
Example:
Part 1: Global variable declaration
global_variable = "I'm global"
This line initializes a global variable called global_variable and assigns it the value "I'm
global".
Global variables are accessible throughout the entire program, both inside and
outside functions.
Part 2: Function definition
def example_function():
    local_variable = "I'm local"
    print(global_variable)  # Accessing global variable
    print(local_variable)  # Accessing local variable
Here, you define a function called example_function().

Within this function:
• A local variable named local_variable is declared and initialized with the string value
"I'm local." This variable is local to the function and can only be accessed within the
function's scope.
• The function then prints the values of both the global variable (global_variable) and
the local variable (local_variable). It demonstrates that you can access global and
local variables within a function.
Part 3: Function call
example_function()
In this part, you call the example_function() by invoking it. This results in the function's code
being executed.
As a result of this function call, it will print the values of the global and local variables within the
function.
Part 4: Accessing global variable outside the function
print(global_variable) # Accessible outside the function
After calling the function, you print the value of the global variable global_variable outside
the function. This demonstrates that global variables are accessible inside and outside of
functions.
Part 5: Attempting to access local variable outside the function
# print(local_variable) # Error, local variable not visible here

In this part, you are attempting to print the value of the local variable local_variable outside
of the function. However, this line would result in an error.
Local variables are only visible and accessible within the scope of the function where
they are defined.
Attempting to access them outside of that scope would raise a "NameError".
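To see the NameError without stopping the program, the failing access can be wrapped in try/except. This is a small illustrative sketch; the variable message is a hypothetical name, and the exact error wording may vary between Python versions.

```python
def example_function():
    local_variable = "I'm local"  # exists only inside this function
    return local_variable

example_function()

try:
    print(local_variable)  # not defined in the global scope
except NameError as error:
    message = str(error)
    print("Caught NameError:", message)
```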
USING FUNCTIONS WITH LOOPS
Functions and loops together
1. Functions can contain code with loops

2. This makes complex tasks more organized
3. The loop code becomes a repeatable function
Example:
def print_numbers(limit):
    for i in range(1, limit+1):
        print(i)

print_numbers(5)  # Prints 1 through 5, one number per line
Enhancing code organization and reusability
1. Functions group similar actions for easy understanding

2. Looping within functions keeps code clean
3. You can reuse a function to repeat actions
Example
def greet(name):
    return "Hello, " + name

for _ in range(3):
    print(greet("Alice"))
MODIFYING DATA STRUCTURE USING FUNCTIONS

You'll use Python and a list as the data structure for this illustration. In this example, you will
create functions to add and remove elements from a list.
Part 1: Initialize an empty list
# Define an empty list as the initial data structure

my_list = []
In this part, you start by creating an empty list named my_list. This empty list serves as the data
structure that you will modify throughout the code.
Part 2: Define a function to add elements
# Function to add an element to the list

def add_element(data_structure, element):
    data_structure.append(element)

Here, you define a function called add_element. This function takes two parameters:
• data_structure: This parameter represents the list to which you want to add an
element
• element: This parameter represents the element you want to add to the list
Inside the function, you use the append method to add the provided element to the
data_structure, which is assumed to be a list.
Part 3: Define a function to remove elements
# Function to remove an element from the list

def remove_element(data_structure, element):
    if element in data_structure:
        data_structure.remove(element)
    else:
        print(f"{element} not found in the list.")
In this part, you define another function called remove_element. It also takes two parameters:
• data_structure: The list from which we want to remove an element

• element: The element we want to remove from the list
Inside the function, you use conditional statements to check if the element is present in the
data_structure. If it is, you use the remove method to remove the first occurrence of the
element. If it's not found, you print a message indicating that the element was not found in the
list.
Part 4: Add elements to the list
# Add elements to the list using the add_element function

add_element(my_list, 42)
add_element(my_list, 17)
add_element(my_list, 99)
Here, you use the add_element function to add 3 elements (42, 17, and 99) to the my_list.
These are added one at a time using function calls.
Part 5: Print the current list
# Print the current list

print("Current list:", my_list)
This part simply prints the current state of the my_list to the console, allowing us to see the
elements that have been added so far.
Part 6: Remove elements from the list
# Remove an element from the list using the remove_element function

remove_element(my_list, 17)
remove_element(my_list, 55) # This will print a message since 55 is not in the list
In this part, you use the remove_element function to remove elements from the my_list. First,
you attempt to remove 17 (which is in the list), and then you try to remove 55 (which is not in

the list). The second call to remove_element will print a message indicating that 55 was not
found.
Part 7: Print the updated list
# Print the updated list

print("Updated list:", my_list)
Finally, you print the updated my_list to the console. This allows us to observe the
modifications made to the list by adding and removing elements using the defined functions.
CONCLUSION
Congratulations! You've completed the Reading Instruction Lab on Python functions. You've
gained a solid understanding of functions, their significance, and how to create and use them
effectively. These skills will empower you to write more organized, modular, and powerful code
in your Python projects.

HANDS-ON LAB: FUNCTIONS
FUNCTIONS IN PYTHON
Objectives

• Understand functions and variables
• Work with functions and variables
FUNCTIONS IN PYTHON
Welcome! This notebook will teach you about the functions in the Python Programming
Language. By the end of this lab, you'll know the basic concepts about function, variables, and
how to use functions.
Table of Contents
• Functions
o What is a function?
o Variables
o Functions Make Things Simple
• Pre-defined functions
• Using if/else Statements and Loops in Functions
• Setting default argument values in your custom functions
• Global variables
• Scope of a Variable
• Collections and Functions
• Quiz on Loops
Functions
A function is a reusable block of code that performs the operations specified in it. Functions
let you break down tasks and allow you to reuse your code in different programs.
There are two types of functions:
• Pre-defined functions
• User defined functions
What is a Function?
You can define functions to provide the required functionality. Here are simple rules to define a
function in Python:
• Function blocks begin with def followed by the function name and parentheses ().
• Any input parameters or arguments should be placed within these parentheses.
• There is a body within every function that starts after a colon (:) and is indented.
• You can also place documentation (a docstring) at the start of the body.
• The statement return exits a function, optionally passing back a value.
An example of a function that adds one to the parameter a, then prints and returns the result as
b:
# First function example: Add 1 to a and store as b

def add(a):
    """
    add 1 to a
    """
    b = a + 1
    print(a, "if you add one", b)
    return(b)
The figure below illustrates the terminology:
Figure 349
We can obtain help about a function :
# Get a help on add function

help(add)
Help on function add in module __main__:
add(a)
add 1 to a
We can call the function:
# Call the function add()

add(1)
1 if you add one 2
2
If we call the function with a new input we get a new result:
# Call the function add()

add(2)
2 if you add one 3
3

We can create different functions. For example, we can create a function that multiplies
two numbers. The numbers will be represented by the variables a and b:
# Define a function to multiply two numbers
def Mult(a, b):
    c = a * b
    return(c)
    print('This is not printed')  # Unreachable: the function exits at return

result = Mult(12, 2)
print(result)
24
The same function can be used for different data types. For example, we can multiply two
integers:
# Use Mult() to multiply two integers

Mult(2, 3)
6
Note how the function terminates at the return statement, while passing back a value. This
value can be further assigned to a different variable as desired.
Two floats:
# Use Mult() to multiply two floats

Mult(10.0, 3.14)
31.400000000000002
We can even replicate a string by multiplying with an integer:
# Use Mult() to multiply two values of different types

Mult(2, "Michael Jackson ")
'Michael Jackson Michael Jackson '
Variables
The input to a function is called a formal parameter.

A variable that is declared inside a function is called a local variable. Like a parameter, it only
exists within the function (i.e. between the point where the function starts and where it stops).
A variable that is declared outside a function definition is a global variable, and its value is
accessible throughout the program. To reassign a global variable from inside a function, the
global keyword is needed. We will discuss more about global variables at the end of the lab.
# Function Definition
def square(a):
    # Local variable b
    b = 1
    c = a * a + b
    print(a, "if you square + 1", c)
    return(c)

The labels are displayed in the figure:
Figure 350
We can call the function with an input of 3:
# Initializes Global variable
x = 3
# Makes function call and assigns the returned value to y
y = square(x)
y
3 if you square + 1 10
10
We can call the function with an input of 2 in a different manner:
# Directly enter a number as parameter

square(2)
2 if you square + 1 5
5
If there is no return statement, the function returns None. The following two functions are
equivalent:
# Define functions, one without a return statement and one that explicitly returns None
def MJ():
    print('Michael Jackson')

def MJ1():
    print('Michael Jackson')
    return(None)
# See the output

MJ()
Michael Jackson
# See the output

MJ1()
Michael Jackson

Printing the result of a function call reveals that None is the default return value:
# See what the functions return

print(MJ())
print(MJ1())
Michael Jackson
None
Michael Jackson
None
Create a function con that concatenates two strings using the addition operation:
# Define the function for combining strings

def con(a, b):
    return(a + b)
# Test on the con() function

con("This ", "is")
'This is'
[Tip] How do I learn more about the pre-defined functions in Python?

We will be introducing a variety of pre-defined functions to you as you learn more about Python. There are
just too many functions, so there's no way we can teach them all in one sitting. But if you'd like to take a quick
peek, here's a short reference card for some of the commonly-used pre-defined functions: Reference
Functions Make Things Simple
Consider the two lines of code in Block 1 and Block 2: the procedure for each block is identical.
The only thing that is different is the variable names and values.
Block 1:
# a and b calculation block1

a1 = 4
b1 = 5
c1 = a1 + b1 + 2 * a1 * b1 - 1
if(c1 < 0):
    c1 = 0
else:
    c1 = 5
c1
5
Block 2:
# a and b calculation block2

a2 = 0
b2 = 0
c2 = a2 + b2 + 2 * a2 * b2 - 1
if(c2 < 0):
    c2 = 0
else:
    c2 = 5
c2
0

We can replace the lines of code with a function. A function combines many instructions into a
single line of code. Once a function is defined, it can be used repeatedly. You can invoke the same
function many times in your program. You can save your function and use it in another program
or use someone else’s function. The lines of code in code Block 1 and code Block 2 can be
replaced by the following function:
# Make a Function for the calculation above

def Equation(a,b):
    c = a + b + 2 * a * b - 1
    if(c < 0):
        c = 0
    else:
        c = 5
    return(c)
This function takes two inputs, a and b, then applies several operations to return c. We simply
define the function, replace the instructions with the function, and input the new values of a1,
b1 and a2, b2 as inputs. The entire process is demonstrated in the figure:
Figure 351
Code Blocks 1 and Block 2 can now be replaced with code Block 3 and code Block 4.
Block 3:
a1 = 4
b1 = 5
c1 = Equation(a1, b1)
c1
5
Block 4:
a2 = 0
b2 = 0
c2 = Equation(a2, b2)
c2
0
Pre-defined functions
There are many pre-defined functions in Python, so let's start with the simple ones.

print()
The print() function:
# Built-in function print()

album_ratings = [10.0, 8.5, 9.5, 7.0, 7.0, 9.5, 9.0, 9.5]
print(album_ratings)
[10.0, 8.5, 9.5, 7.0, 7.0, 9.5, 9.0, 9.5]
sum()
The sum() function adds all the elements in a list or tuple:
# Use sum() to add every element in a list or tuple together

sum(album_ratings)
70.0
len()
The len() function returns the length of a list or tuple:
# Show the length of the list or tuple

len(album_ratings)
8
Using if/else Statements and Loops in Functions
The return statement is particularly useful if you have any if statements in the function, when
you want your output to be dependent on some condition:
# Function example
def type_of_album(artist, album, year_released):
    print(artist, album, year_released)
    if year_released > 1980:
        return "Modern"
    else:
        return "Oldie"

x = type_of_album("Michael Jackson", "Thriller", 1980)
print(x)
Michael Jackson Thriller 1980
Oldie
We can use a loop in a function. For example, we can print out each element in a list:
# Print the list using for loop

def PrintList(the_list):
    for element in the_list:
        print(element)

# Implement the printlist function
PrintList(['1', 1, 'the man', "abc"])
1
1
the man
abc

String comparison in Functions
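The relational operators (==, <, >, and so on) compare strings character by character, using the
Unicode values of the characters from index zero to the end of the string, and return a Boolean
value according to the operator used. The in operator, used in the first example below, instead
checks whether one string occurs inside another.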
# Check whether one string occurs in another using the in operator

# add string
string = "Michael Jackson is the best"
# Define a function
def check_string(text):
    # Use an if/else statement and the 'in' operator to check the string
    if text in string:
        return 'String matched'
    else:
        return 'String not matched'

check_string("Michael Jackson is the best")
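The character-by-character comparison mentioned at the start of this section can be illustrated with a few expressions. This is a small sketch with assumed sample strings; the variable names are hypothetical.

```python
# ord() returns the Unicode code point of a single character
code_upper = ord("A")   # 65
code_lower = ord("a")   # 97

# Uppercase "A" has a smaller code point than lowercase "a"
upper_first = "Apple" < "apple"    # True

# Comparison stops at the first differing character: "c" < "d"
first_difference = "abc" < "abd"   # True

# A string that is a prefix of another sorts before the longer one
prefix_first = "abc" < "abcd"      # True

print(code_upper, code_lower, upper_first, first_difference, prefix_first)
```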
This program uses a user-defined function named compareStrings() to compare two strings.
The function receives both strings as its arguments and returns 1 if they are equal according to
the == operator. If they differ, the function ends without a return statement and returns None.
#Compare two strings using == operator and function

def compareStrings(x, y):
    # Use an if statement to compare x and y
    if x==y:
        return 1

# Declare two different variables as string1 and string2 and pass string in it
string1 = "Michael Jackson is the best"
string2 = "Michael Jackson is the best"
# Declare a variable to store result after comparing both the strings

check = compareStrings(string1, string2)
#Use if else statement to compare the string

if check==1:
    print("\nString Matched")
else:
    print("\nString not Matched")
String Matched
Count the Frequency of Words Appearing in a String Using a Dictionary.

In this section, we count how many times each word occurs in a given string and print the result.
Suppose we have a string and need to find the number of occurrences of each word in it using
Python.
First, we define a function and create a list that is initially empty. Next, we add code to convert
the string to a list of words. Python strings have a split() method, which splits a string on a
separator (whitespace by default) and returns a list. Then we declare an empty dictionary.
Next, we use a for loop to iterate over the words, count the frequency of each word in the string,
and store the counts in the dictionary.
Finally, we print the dictionary.

# Python Program to Count words in a String using Dictionary
def freq(string):
    #step1: A list variable is declared and initialized to an empty list.
    words = []
    #step2: Break the string into list of words
    words = string.split() # or string.lower().split()
    #step3: Declare a dictionary
    Dict = {}
    #step4: Use for loop to iterate words and values to the dictionary
    for key in words:
        Dict[key] = words.count(key)
    #step5: Print the dictionary
    print("The Frequency of words is:",Dict)

#step6: Call function and pass string in it
freq("Mary had a little lamb Little lamb, little lamb Mary had a little lamb.Its \
fleece was white as snow And everywhere that Mary went Mary went, Mary went \
Everywhere that Mary went The lamb was sure to go")
The Frequency of words is: {'Mary': 6, 'had': 2, 'a': 2, 'little': 3, 'lamb': 3, 'Little': 1,
'lamb,': 1, 'lamb.Its': 1, 'fleece': 1, 'was': 2, 'white': 1, 'as': 1, 'snow': 1, 'And': 1,
'everywhere': 1, 'that': 2, 'went': 3, 'went,': 1, 'Everywhere': 1, 'The': 1, 'sure': 1, 'to': 1,
'go': 1}
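As an alternative sketch (not the lab's method), the standard library's collections.Counter builds the same kind of word-to-count mapping in a single pass, whereas calling words.count() inside the loop rescans the list for every word. The function name freq_counter and the sample string are assumptions.

```python
from collections import Counter

def freq_counter(string):
    """Return a Counter mapping each whitespace-separated word to its count."""
    return Counter(string.split())

counts = freq_counter("little lamb little lamb Mary")
print(dict(counts))
```

Like the dictionary version, this treats 'lamb' and 'lamb,' as different words because split() only separates on whitespace.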
Setting default argument values in your custom functions
You can set a default value for a parameter in your function; the default is used when the caller
omits that argument. For example, the isGoodRating() function compares a rating against a
threshold of 7 to decide whether an album is good. If no rating is passed, the function uses a
default rating of 4:
# Example for setting param with default value
def isGoodRating(rating=4):
    if(rating < 7):
        print("this album sucks it's rating is",rating)
    else:
        print("this album is good its rating is",rating)

# Test the value with default value and with input
isGoodRating()
isGoodRating(10)
this album sucks it's rating is 4
this album is good its rating is 10
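If the threshold itself should be adjustable, it can also be given a default value. The following is a hypothetical variant, not part of the lab: the name is_good_rating, the threshold parameter, and returning a Boolean instead of printing are all assumptions made for illustration.

```python
def is_good_rating(rating, threshold=7):
    """Return True when rating meets the threshold (7 unless the caller overrides it)."""
    return rating >= threshold

print(is_good_rating(10))              # uses the default threshold of 7
print(is_good_rating(4))               # below the default threshold
print(is_good_rating(4, threshold=3))  # the caller overrides the default
```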

Global variables
So far, we've been creating variables within functions, but we have not discussed variables
outside the function. These are called global variables. Let's try to see what
printer1 prints:
# Example of global variable

artist = "Michael Jackson"
def printer1(artist):
    internal_var1 = artist
    print(artist, "is an artist")

printer1(artist)
# try running the following code
#printer1(internal_var1)
Michael Jackson is an artist
If we uncomment the last line and run it, we get a NameError: name 'internal_var1' is not
defined. Why?
It's because every variable we create inside the function is a local variable, meaning that the
variable assignment does not persist outside the function.
But there is a way to create global variables from within a function as follows:
artist = "Michael Jackson"

def printer(artist):
    global internal_var
    internal_var= "Whitney Houston"
    print(artist,"is an artist")

printer(artist)
printer(internal_var)
Michael Jackson is an artist
Whitney Houston is an artist
Scope of a Variable
The scope of a variable is the part of that program where that variable is accessible. Variables
that are declared outside of all function definitions, such as the myFavouriteBand
variable in the code shown here, are accessible from anywhere within the program. As a result,
such variables are said to have global scope, and are known as global variables.
myFavouriteBand is a global variable, so it is accessible from within the getBandRating function,
and we can use it to determine a band's rating. We can also use it outside of the function, such as
when we pass it to the print function to display it:
# Example of global variable

myFavouriteBand = "AC/DC"
def getBandRating(bandname):
    if bandname == myFavouriteBand:
        return 10.0
    else:
        return 0.0

print("AC/DC's rating is:", getBandRating("AC/DC"))
print("Deep Purple's rating is:",getBandRating("Deep Purple"))
print("My favourite band is:", myFavouriteBand)
AC/DC's rating is: 10.0
Deep Purple's rating is: 0.0
My favourite band is: AC/DC
Take a look at this modified version of our code. Now the myFavouriteBand variable is defined
within the getBandRating function. A variable that is defined within a function is said to be a
local variable of that function. That means that it is only accessible from within the
function in which it is defined. Our getBandRating function will still work, because
myFavouriteBand is still defined within the function. However, we can no
longer print myFavouriteBand outside our function, because it is a local variable of our
getBandRating function; it is only defined within the getBandRating function:
# Deleting the variable "myFavouriteBand" from the previous example to
# demonstrate an example of a local variable
del myFavouriteBand

# Example of local variable
def getBandRating(bandname):
    myFavouriteBand = "AC/DC"
    if bandname == myFavouriteBand:
        return 10.0
    else:
        return 0.0

print("AC/DC's rating is: ", getBandRating("AC/DC"))
print("Deep Purple's rating is: ", getBandRating("Deep Purple"))
print("My favourite band is", myFavouriteBand)

---------------------------------------------------------------------------
Cell In[37], line 16
14 print("AC/DC's rating is: ", getBandRating("AC/DC"))
15 print("Deep Purple's rating is: ", getBandRating("Deep Purple"))
---> 16 print("My favourite band is", myFavouriteBand)
NameError: name 'myFavouriteBand' is not defined
Finally, take a look at this example. We now have two myFavouriteBand variable definitions.
The first one of these has a global scope, and the second of them is a local variable within the
getBandRating function. Within the getBandRating function, the local variable takes
precedence. Deep Purple will receive a rating of 10.0 when passed to the getBandRating
function. However, outside of the getBandRating function, the function's local
variable is not defined, so the myFavouriteBand variable we print is the global variable, which
has a value of AC/DC:
# Example of global variable and local variable with the same name
myFavouriteBand = "AC/DC"

def getBandRating(bandname):
    myFavouriteBand = "Deep Purple"
    if bandname == myFavouriteBand:
        return 10.0
    else:
        return 0.0

print("AC/DC's rating is:",getBandRating("AC/DC"))
print("Deep Purple's rating is: ",getBandRating("Deep Purple"))
print("My favourite band is:",myFavouriteBand)

My favourite band is: AC/DC

Collections and Functions
When the number of arguments to a function is unknown, they can all be packed into a tuple,
as shown:
def printAll(*args):  # All the arguments are 'packed' into args which can be treated like a tuple
    print("No of arguments:", len(args))
    for argument in args:
        print(argument)

#printAll with 3 arguments
printAll('Horsefeather','Adonis','Bone')
#printAll with 4 arguments
printAll('Sidecar','Long Island','Mudslide','Carriage')
No of arguments: 3
Horsefeather
Adonis
Bone
No of arguments: 4
Sidecar
Long Island
Mudslide
Carriage
Similarly, the arguments can also be packed into a dictionary, as shown:
def printDictionary(**args):
    for key in args:
        print(key + " : " + args[key])

printDictionary(Country='Canada',Province='Ontario',City='Toronto')
Country : Canada
Province : Ontario
City : Toronto
Functions can be incredibly powerful and versatile. They can accept (and return) data types,
objects and even other functions as arguments. Consider the example below:
def addItems(list):
    list.append("Three")
    list.append("Four")

myList = ["One","Two"]
addItems(myList)
myList
['One', 'Two', 'Three', 'Four']
Note how the changes made to the list are not limited to the function's scope. This occurs because
it is the list's reference that is passed to the function, so any changes made are on the original
instance of the list. Therefore, one should be cautious when passing mutable objects into functions.
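One way to act on this caution is to have the function work on a copy of the list. The following is only a sketch: the function name addItemsToCopy and the variable extendedList are hypothetical and do not come from the course material.

```python
# Hypothetical helper (not from the course): returns a new list
# instead of modifying the list that the caller passed in.
def addItemsToCopy(items):
    newItems = list(items)      # shallow copy of the incoming list
    newItems.append("Three")
    newItems.append("Four")
    return newItems

myList = ["One", "Two"]
extendedList = addItemsToCopy(myList)

print(myList)         # the original list is unchanged
print(extendedList)   # the copy holds the added items
```

Whether to copy or to modify in place is a design choice; copying costs extra memory but keeps the caller's data untouched.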

Quiz on Functions
Come up with a function that divides the first input by the second input:
def div(a, b):
    return(a/b)
Use the function con for the following question.
def con(a, b):
    return(a + b)
Can the con function we defined before be used to add two integers or strings?
#yes, it can. Example:
con(2, 2)
4
Can the con function we defined before be used to concatenate lists or tuples?

#Yes, for example:
con(['a', 1], ['b', 1])

['a', 1, 'b', 1]
Write a function to find the total count of the word "little" in the given string: "Mary had a
little lamb Little lamb, little lamb Mary had a little lamb.Its fleece
was white as snow And everywhere that Mary went Mary went, Mary went
Everywhere that Mary went The lamb was sure to go"
# Python Program to Count words in a String using Dictionary

def freq(string, passedkey):
    # step1: A list variable is declared and initialized to an empty list.
    words = []
    # step2: Break the string into list of words
    words = string.split()  # or string.lower().split()
    # step3: Declare a dictionary
    Dict = {}
    # step4: Use for loop to iterate words and values to the dictionary
    for key in words:
        if(key == passedkey):
            Dict[key] = words.count(key)
    # step5: Print the dictionary
    print("Total Count:", Dict)

# step6: Call function and pass string in it
freq("Mary had a little lamb Little lamb, little lamb Mary had a little lamb.Its \
fleece was white as snow And everywhere that Mary went Mary went, Mary went \
Everywhere that Mary went The lamb was sure to go", "little")
Total Count: {'little': 3}
The last exercise!


PRACTICE QUIZ: FUNCTIONS
Question 1
What does the following function return: len(['A','B',1]) ?
o 1
o 2
o 4
o 3
Correct: The function returns the number of elements in the list; in this case, the number of
elements is 3.
Question 2
What does the following function return: len([sum([0,0,1])]) ?
o Error
o 0
o 3
o 1
Correct: sum([0,0,1]) returns 1, so the expression becomes len([1]); a list with one element
has length 1.
Question 3
What is the value of list L after the following code segment
is run?
L=[1,3,2]
sorted(L)
o [3,2,1]
o [0,0,0]
o [1,3,2]
o [1,2,3]
Correct: sorted is a function that returns a new list. It does not change the list L.

Question 4
What result does the following code produce?
def print_function(A):
    for a in A:
        print(a + '1')

print_function(['a', 'b', 'c'])
o abc1
o a1
b1
c1
o a1
o a
b
c
Correct: The function concatenates each string in the list with the string '1' and prints each
result on its own line.
Question 5
What is the output of the following lines of code?
def Print(A):
    for a in A:
        print(a+'1')

Print(['a','b','c'])
o a1
o a1 b1 c1
o abc
Correct: The function concatenates the string '1' to each string in the list.

VIDEO 013: EXCEPTION HANDLING 3:49
Hello. Welcome to Exception Handling.
Figure 352
WHAT YOU WILL LEARN

After watching the video, you will be able to:
• Explain Exception Handling,
• Demonstrate the use of exception handling, and
• Understand the basics of exception handling.
Figure 353
INTRODUCTION
Have you ever mistakenly entered a number when you were supposed to enter text in an input
field? Most of us have, either in error or when testing out a program. But do you know why the
program gave an error message instead of terminating? In order for the error
message to appear, an event was triggered in the background. This event was activated because
the program tried to perform a computation on the name entry and realized the entry contained
numbers and not letters.
Because this code was encased in an exception handler, the program knew how to deal with this
type of error: it was able to output the error message and continue running.
This is one of many errors that can happen when asking for user input, so let’s see how
exception handling works.
Figure 354

TRY…EXCEPT
Let’s first explore the try…except statement. This type of statement will first attempt to execute
the code in the “try” block, but if an error occurs it will kick out and begin searching for the
exception that matches the error.
Figure 355
Once it finds the correct exception to handle the error it will then execute that line of code.
Figure 356
For example, let’s say you are writing a program that will open and write a file.
Figure 357

After starting the program an error occurred as the data was not able to be read.
Figure 358
Because of this error the program skipped over the code lines under the “try”
statement and went directly to the exception line.
Figure 359
Since this error fell within the IOError guidelines it printed “Unable to open or read the data in
the file.” to our console.
Figure 360
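A minimal sketch of the kind of program the video describes is given here. The function name read_data and the file path are hypothetical, chosen only for illustration; the error message is the one quoted in the transcript.

```python
# Sketch only: read_data and the path below are made-up examples.
def read_data(path):
    try:
        with open(path, "r") as file:
            return file.read()
    except IOError:
        # In Python 3, IOError is an alias of OSError, so this clause
        # also covers FileNotFoundError and PermissionError.
        print("Unable to open or read the data in the file.")
        return None

# The directory does not exist, so open() fails and the except block runs.
content = read_data("no_such_directory_for_demo/data.txt")
```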
When writing simple programs we can sometimes get away with only one except statement,
but what happens if another error occurs that is not caught by the IOError?

If that happened, we would need to add another except statement. For this except statement,
you will notice that the type of error to catch is not specified. While this may seem a logical
step, so that the program will catch all errors and not terminate, it is not a best practice.
Figure 361
For example, let’s say our small program was just one section of a much larger program that
was over a thousand lines of code. Our task was to debug the program as it kept throwing an
error causing a disruption for our users.
Figure 362
When investigating the program you found this error kept appearing. Because this error had no
details you ended up spending hours trying to pinpoint and fix the error.
Figure 363
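To illustrate why an unspecified except makes debugging harder, the sketch below contrasts it with a handler that names the exception. All function names here are hypothetical stand-ins, not code from the video.

```python
def risky_step():
    # Stand-in for a failure buried deep inside a large program.
    return int("not a number")

def run_with_bare_except():
    try:
        risky_step()
    except:
        # No information about what failed or why.
        return "Something went wrong"

def run_with_named_exception():
    try:
        risky_step()
    except ValueError as error:
        # The exception type and its message point to the cause.
        return f"ValueError: {error}"

print(run_with_bare_except())
print(run_with_named_exception())
```

The second message includes the text Python attached to the exception, which is usually the quickest route to the faulty line.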

TRY…EXCEPT…ELSE
So far in our program we have defined that an error message should print out if an error occurs,
but we don’t receive any messages that the program executed properly. This is where we can
now add an else statement to give us that notification.
Figure 364
By adding this else statement it will provide us a notification to the console that “The file was
written successfully”.
Figure 365
TRY…EXCEPT…ELSE…FINALLY
Now that we have defined what will happen if our program executes properly, or if an error
occurs, there is one last statement to add. For this example, since we are opening a file, the last
thing we need to do is close the file. Adding a finally statement tells the program to close
the file no matter the end result and print “File is now closed” to our console.
Figure 366
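Putting the four parts together, a sketch of the full statement might look like the following. The file location and the messages list are assumptions made for this example; the printed messages follow the wording quoted in the transcript.

```python
import os
import tempfile

# Sketch only: the path is a throwaway location chosen for this example.
path = os.path.join(tempfile.gettempdir(), "exception_demo.txt")
messages = []
file = None

try:
    file = open(path, "w")
    file.write("Hello, exception handling!")
except IOError:
    messages.append("Unable to open or write the data in the file.")
else:
    # Runs only when the try block raised no exception.
    messages.append("The file was written successfully")
finally:
    # Runs whether or not an exception occurred.
    if file is not None:
        file.close()
    messages.append("File is now closed")

print("\n".join(messages))
```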

RECAP
In the video, you learned:
• How to write a try…except statement,

• Why it is important to always define errors when creating exceptions, and
• How to add an else and finally statement.
Figure 367

READING: EXCEPTION HANDLING
EXCEPTION HANDLING IN PYTHON
Estimated time needed: 10 Minutes
Objectives
1. Understanding Exceptions
2. Distinguishing Errors from Exceptions
3. Familiarity with Common Python Exceptions
4. Managing Exceptions Effectively
In the world of programming, errors and unexpected situations are certain. Python, a popular
and versatile programming language, equips developers with a powerful toolset to manage
these unforeseen scenarios through exceptions and error handling.
What are exceptions?
Exceptions are alerts when something unexpected happens while running a program. It could
be a mistake in the code or a situation that was not planned for. Python can
raise these alerts automatically, but we can also trigger them on purpose using the raise
command. The cool part is that we can prevent our program from crashing by handling exceptions.
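As a small illustration of raising an exception on purpose and then handling it, consider the sketch below. The function set_age and its validation rule are hypothetical and serve only as an example.

```python
def set_age(age):
    # Hypothetical validation rule, chosen only for illustration.
    if age < 0:
        raise ValueError("age cannot be negative")
    return age

try:
    set_age(-5)
except ValueError as error:
    message = f"Handled: {error}"
    print(message)

print("The program keeps running.")
```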
Errors vs. Exceptions
Hold on, what is the difference between errors and exceptions? Well, errors are usually big
problems that come from the computer or the system. They often make the program stop
working completely. On the other hand, exceptions are more like issues we can control. They
happen because of something we did in our code and can usually be fixed, so the program keeps
going.
Here is the difference between Errors and exceptions:
| Aspect | Errors | Exceptions |
|---|---|---|
| Origin | Errors are typically caused by the environment, hardware, or operating system. | Exceptions are usually a result of problematic code execution within the program. |
| Nature | Errors are often severe and can lead to program crashes or abnormal termination. | Exceptions are generally less severe and can be caught and handled to prevent program termination. |
| Handling | Errors are not usually caught or handled by the program itself. | Exceptions can be caught using try-except blocks and dealt with gracefully, allowing the program to continue execution. |
| Examples | Examples include “SyntaxError” due to incorrect syntax or “NameError” when a variable is not defined. | Examples include “ZeroDivisionError” when dividing by zero, or “FileNotFoundError” when attempting to open a non-existent file. |
| Categorization | Errors are not classified into categories. | Exceptions are categorized into various classes, such as “ArithmeticError,” “IOError,” “ValueError,” etc., based on their nature. |
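The categorization of exceptions into classes can be checked directly with the built-in issubclass function. The sketch below uses only built-in exception classes; the variable name caught is an assumption for the example.

```python
# Built-in exceptions are classes arranged in a hierarchy.
print(issubclass(ZeroDivisionError, ArithmeticError))
print(issubclass(FileNotFoundError, OSError))

# Catching a parent class also catches its subclasses.
try:
    1 / 0
except ArithmeticError as error:
    caught = type(error).__name__
    print("Caught:", caught)
```

Here the handler names ArithmeticError, yet it still catches the ZeroDivisionError raised by the division.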
Common Exceptions in Python
Here are a few examples of exceptions we often run into and can handle using this tool:
ZeroDivisionError:
This error arises when an attempt is made to divide a number by zero. Division by zero is
undefined in mathematics, causing an arithmetic error. For example:
result = 10 / 0  # Raises ZeroDivisionError
print(result)
ValueError:
This error occurs when an inappropriate value is used within the code, for example when
trying to convert a non-numeric string to an integer:
num = int("abc")
# Raises ValueError
FileNotFoundError:
This exception is encountered when an attempt is made to access a file that does not exist. For
example:
with open("nonexistent_file.txt", "r") as file:  # Raises FileNotFoundError
    content = file.read()

IndexError:
An IndexError occurs when an index is used to access an element in a list that is outside the valid
index range. For example:
my_list = [1, 2, 3]
value = my_list[1]  # No IndexError, within range
missing = my_list[5]  # Raises IndexError
KeyError:
The KeyError arises when an attempt is made to access a nonexistent key in a dictionary.
For example:
my_dict = {"name": "Alice", "age": 30}
value = my_dict.get("city")  # No KeyError, using .get() method
missing = my_dict["city"]  # Raises KeyError
TypeError:
The TypeError occurs when an object is used in an incompatible manner, for example when
trying to concatenate a string and an integer:
result = "hello" + 5 # Raises TypeError
AttributeError:
An AttributeError occurs when an attribute or method is accessed on an object that doesn't
possess that specific attribute or method. For example:
text = "example"
length = len(text)  # No AttributeError, correct method usage
missing = text.some_method()  # Raises AttributeError
ImportError:
This error is encountered when an attempt is made to import a module that is unavailable. For
example:
import non_existent_module
Note: Please remember, the exceptions you will encounter are not limited to just
these. There are many more in Python. However, there is no need to worry. By using
the technique provided below and following the correct syntax, you will be able to
handle any exceptions that come your way.

HANDLING EXCEPTIONS:
Python has a handy tool called try and except that helps us manage exceptions.
Try and Except : You can use the try and except blocks to prevent your program from crashing
due to exceptions.
Here's how they work:
1. The code that may result in an exception is contained in the try block.
2. If an exception occurs, execution jumps directly to the except block.
3. In the except block, you can define how to handle the exception gracefully, like
displaying an error message or taking alternative actions.
4. After the except block, the program continues executing the remaining code.
Example: Attempting to divide by zero
# using Try- except
try:
    # Attempting to divide 10 by 0
    result = 10 / 0
except ZeroDivisionError:
    # Handling the ZeroDivisionError and printing an error message
    print("Error: Cannot divide by zero")
# This line will be executed regardless of whether an exception occurred
print("outside of try and except block")
NEXT STEP
As we finish up this reading, you are ready to move on to the next part where you will practice
handling errors. For better learning, try out different types of data in the lab. This way, you will
encounter various errors and learn how to deal with them effectively. This knowledge will help
you write stronger and more reliable code in the future.

HANDS-ON LAB: EXCEPTION HANDLING
EXCEPTION HANDLING
OBJECTIVES
• Understand exceptions
• Handle the exceptions
TABLE OF CONTENTS
• What is an Exception?
• Exception Handling
WHAT IS AN EXCEPTION?
In this section you will learn about what an exception is and see examples of them.
Definition
An exception is an error that occurs during the execution of code. When such an error occurs,
the code raises an exception and, if it is not prepared to handle it, execution halts.
Examples
Run each piece of code and observe the exception raised:
1/0
---------------------------------------------------------------------------
ZeroDivisionError Traceback (most recent call last)
/tmp/ipykernel_67/2354412189.py in <module>
----> 1 1/0
ZeroDivisionError: division by zero
ZeroDivisionError occurs when you try to divide by zero.
y = a + 5
----> 1 y = a + 5
NameError: name 'a' is not defined
NameError -- in this case, it means that you tried to use the variable a when it was not defined.
a = [1, 2, 3]
a[10]
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
1 a = [1, 2, 3]
----> 2 a[10]
IndexError: list index out of range

IndexError -- in this case, it occurred because you tried to access data from a list using an index
that does not exist for this list.
There are many more exceptions that are built into Python. Here is a list of them:
https://docs.python.org/3/library/exceptions.html
EXCEPTION HANDLING
In this section you will learn how to handle exceptions. You will understand how to make your
program perform specified tasks instead of halting code execution when an exception is
encountered.
Try Except
A try except allows you to execute code that might raise an exception and, if an exception
occurs (any exception or a specific one), to catch it and execute specific code. This
allows the program to continue executing even if there is an exception.
Python tries to execute the code in the try block. If the code in the try block raises any
exception, it is caught and the code in the except block is executed. After that, the code
that comes after the try except is executed.
# potential code before try catch
try:
    # code to try to execute
except:
    # code to execute if there is an exception
# code that will still execute if there is an exception
Try Except Example

In this example we are trying to divide a number given by the user, save the outcome in the
variable a, and then print the result of the operation. When taking user input
and dividing a number by it, there are a couple of exceptions that can be raised, for example if
we divide by zero. Try running the following block of code with b as a number. An exception will
only be raised if b is zero.
a = 1

try:
    b = int(input("Please enter a number to divide a"))
    a = a/b
    print("Success a=",a)
except:
    print("There was an error")
Please enter a number to divide a 0
There was an error
Try Except Specific
A specific try except allows you to catch certain exceptions and also execute certain code
depending on the exception. This is useful if you do not want to deal with some exceptions and
the execution should halt. It can also help you find errors in your code that you might not be
aware of. Furthermore, it can help you differentiate responses to different exceptions. In this
case, the code after the try except might not run depending on the error.

Do not run, just to illustrate:
try:
    # code to try to execute
except (ZeroDivisionError, NameError):
    # code to execute if there is an exception of the given types
# code that will execute if there is no exception or one that we are handling
File "/tmp/ipykernel_425/2752586831.py", line 5
except (ZeroDivisionError, NameError):
^
IndentationError: expected an indented block
try:
    # code to try to execute
except ZeroDivisionError:
    # code to execute if there is a ZeroDivisionError
except NameError:
    # code to execute if there is a NameError
You can also have an empty except at the end to catch an unexpected exception:
Do not run, just to illustrate:
try:
    # code to try to execute
except NameError:
    # code to execute if there is a NameError
except:
    # code to execute if there is any exception
Try Except Specific Example

This is the same example as above, but now we will add differentiated messages depending on
the exception, letting the user know what is wrong with the input.
a = 1

try:
    b = int(input("Please enter a number to divide a"))
    a = a/b
    print("Success a=",a)
except ZeroDivisionError:
    print("The number you provided cant divide 1 because it is 0")
except ValueError:
    print("You did not provide a number")
except:
    print("Something went wrong")
The number you provided cant divide 1 because it is 0
Try Except Else and Finally
else allows one to check if there was no exception when executing the try block. This is useful
when we want to execute something only if there were no errors.
Do not run, just to illustrate
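try:
    # code to try to execute
except NameError:
    # code to execute if there is a NameError
except:
    # code to execute if there is any exception
else:
    # code to execute if there is no exception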
finally allows us to always execute something, whether or not there is an exception. This is
usually used to signify the end of the try except.
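try:
    # code to try to execute
except NameError:
    # code to execute if there is a NameError
except:
    # code to execute if there is any exception
else:
    # code to execute if there is no exception
finally:
    # code to execute at the end of the try except no matter what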

Try Except Else and Finally Example
You might have noticed that even if there is an error the value of a is always printed. Let's use
the else and print the value of a only if there is no error.
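a = 1

try:
    b = int(input("Please enter a number to divide a"))
    a = a/b
except ZeroDivisionError:
    print("The number you provided cant divide 1 because it is 0")
except ValueError:
    print("You did not provide a number")
except:
    print("Something went wrong")
else:
    print("success a=",a)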
success a= 1.0
Now let’s let the user know that we are done processing their answer. Using the finally, let's add
a print.
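a = 1

try:
    b = int(input("Please enter a number to divide a"))
    a = a/b
except ZeroDivisionError:
    print("The number you provided cant divide 1 because it is 0")
except ValueError:
    print("You did not provide a number")
except:
    print("Something went wrong")
else:
    print("success a=",a)
finally:
    print("Processing Complete")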
success a= 1.0
Processing Complete

PRACTICE EXERCISES
Exercise 1: Handling ZeroDivisionError
Imagine you have two numbers and want to determine what happens when you divide one
number by the other. To do this, you need to create a Python function called safe_divide. You
give this function two numbers, a 'numerator' and a 'denominator'. The 'numerator' is
the number you want to divide, and the 'denominator' is the number you want to divide
by. Use the user input method of Python to take the values.
The function should be able to do the division for you and give you the result. But here's the
catch: if you try to divide by zero (which is not allowed in math), the function should be smart
enough to catch that and tell you that it's not possible to divide by zero. Instead of showing an
error, it should print "Error: Cannot divide by zero." and return None, which means 'nothing'
or 'no value'.
Hint:
Follow these:
• Define a function to perform the division and use two arguments.
• Use the try-except block to handle ZeroDivisionError.
• Return the result of the division if no error occurs.
• Return None if division by zero occurs.
• Take user input for numerator and denominator values.
• Call and print your function with the user inputs.
Note: Test with different inputs to validate error handling.
#Type your code here
def safe_divide(numerator, denominator):
    try:
        result = numerator / denominator
        return result
    except ZeroDivisionError:
        print("Error: Cannot divide by zero.")
        return None

# Test case
numerator = int(input("Enter the numerator value:"))
denominator = int(input("Enter the denominator value:"))
print(safe_divide(numerator, denominator))
Enter the numerator value: 2
Enter the denominator value: 0
Error: Cannot divide by zero.
None
Note: Practice handling exceptions by trying different input types like using integers,
strings, zero, negative values, or other data types.

Exercise 2: Handling ValueError
Imagine you have a number and want to calculate its square root. To do this, you need to
create a Python function. You give this function one number, 'number1'.
The function should generate the square root value if you provide a positive integer or float value
as input. However, the function should be clever enough to detect the mistake if you enter a
negative value. It should kindly inform you with a message saying, 'Invalid input! Please
enter a positive integer or a float value.'
Hint:
Follow these:
• Define a function to perform square root of the argument.
• Use try-except block for error handling.
• Use the sqrt() function from the math module to calculate the square root. Catch ValueError
and display the error message.
• Take user input of the value, number1.
• Test with negative numbers to validate error handling.

import math

def perform_calculation(number1):
    try:
        result = math.sqrt(number1)
        print(f"Result: {result}")
    except ValueError:
        print("Error: Invalid input! Please enter a positive integer or a float value.")

# Test case
number1 = float(input("Enter the number:"))
perform_calculation(number1)
Enter the number: 4
Result: 2.0
Note: Practice handling exceptions by trying different input types like using integers,
strings, zero, negative values, or other data types.

Exercise 3: Handling Generic Exceptions
Imagine you have a number and want to perform a complex mathematical task. The calculation
requires dividing the value of the input argument "num" by the difference between "num" and 5,
and the result has to be stored in a variable called "result".
You have to define a function so that it can perform that complex mathematical task. The function
should handle any potential errors that occur during the calculation. To do this, you can use a
try-except block. If any exception arises during the calculation, it should catch the error using the
generic exception class "Exception" as "e". When an exception occurs, the function should
display "An error occurred during calculation."
Follow these:
• Define a function for the complex calculation and pass any argument.
• Use a try-except block for error handling.
• Perform the calculation and store the result in "result."
• Catch any exceptions using Exception as e.
• Display "An error occurred during calculation." when an exception is caught.
• Take user input.
• Call the defined function with the user input.
Note: Test with different inputs to validate error handling.
def complex_calculation(num):
    try:
        result = num / (num - 5)
        print(f"Result: {result}")
    except Exception as e:
        print("An error occurred during calculation.")

# Test case
user_input = float(input("Enter a number: "))
complex_calculation(user_input)
Enter a number: 5
An error occurred during calculation.
Note: Practice handling exceptions by trying different input types like using integers,
strings, zero, negative values, or other data types.

PRACTICE QUIZ: EXCEPTION HANDLING
Question 1
Why do we use exception handlers?
o To terminate a program
o To catch errors within a program
o To write a file
o To read a file
Correct: Exception handlers catch errors in the code.
Question 2
What is the purpose of a try…except statement?
o Executes only when a particular condition is true
o Crash a program when errors occur
o Executes the code block only if a certain condition exists
o Catch and handle exceptions when an error occurs
Correct: It catches and handles exceptions so that errors do not crash the program.
Question 3
Consider the following code:
a = 1
try:
    b = int(input("Please enter a number to divide a"))
    a = a/b
    print("Success a=",a)
except:
    print("There was an error")
If the user enters the value of b as 0, what is expected as the output?
o ZeroDivisionError
o Success a=1/0
o There was an error
o Success a=NaN
Correct: This division will generate an error, leading to the exception part.

VIDEO 014: OBJECTS AND CLASSES (10:52)
In this module, we're going to talk about objects and classes.
Figure 368
BUILT-IN TYPES IN PYTHON

Python has many different kinds of data types:
• integers,
• floats,
• strings,
• lists,
• dictionaries,
• booleans.
In Python, each is an object.
Figure 369
Every object has the following:

• A type,
• An internal representation,
• A set of functions called methods to interact with the data.
An object is an instance of a particular type.
For example, we have two types: type1 and type2.
We can have several objects of type1 as shown in yellow. Each object is an instance of
type1.

We also have several objects of type2 shown in green. Each object is an instance of type2.
Figure 370
OBJECTS: TYPES
Let's do several less abstract examples. Every time we create an integer, we are creating an
instance of type integer, or we are creating an integer object. In this case, we are creating five
instances of type integer or five integer objects.
Figure 371
Similarly, every time we create a list, we are creating an instance of type list, or we are creating
a list object. In this case, we are creating five instances of type list or five list objects.
Figure 372

We could find out the type of an object by using the type command. In this case, we have an
object of type list, we have an object of type integer, we have an object of type string. Finally, we
have an object of type dictionary.
Figure 373
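The type command can be tried on a few objects as follows. This is a minimal sketch; the example values and the names examples and type_names are arbitrary choices, not taken from the video.

```python
# Arbitrary example objects: a list, an integer, a string and a dictionary.
examples = [[1, 2, 3], 1, "hello", {"a": 1}]

for obj in examples:
    print(type(obj))

# The name of each type, collected into a list.
type_names = [type(obj).__name__ for obj in examples]
print(type_names)
```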
METHODS
• A class or type's methods are functions that every instance of that class or type provides.
• It's how you interact with the object. We have been using methods all this time, for
example, on lists.
• Sorting is an example of a method that interacts with the data in the object.
Consider the list ratings: the data is a series of numbers contained within the list. The method
sort will change the data within the object. We call the method by adding a period at the end of
the object's name, followed by the name of the method we would like to call, with parentheses.
Figure 374

We have the ratings list represented in orange. The data contained in the list is a sequence of
numbers. We call the sort method, this changes the data contained in the object. You can say it
changes the state of the object.
Figure 375
We can call the reverse method on the list, changing the list again. We call the method, reversing
the order of the sequence within the object. In many cases, you don't have to know the inner
workings of the class and its methods, you just have to know how to use them. Next, we will
cover how to construct your own classes.
Figure 376
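The sort and reverse calls described above can be sketched as follows. The numbers in ratings are sample values assumed for this example.

```python
ratings = [10, 9, 6, 5, 10, 8, 9, 6, 2]   # sample values assumed for this sketch

ratings.sort()      # sorts the list in place, changing the object's state
print(ratings)

ratings.reverse()   # reverses the order of the same list in place
print(ratings)
```

Both methods modify the existing list rather than returning a new one.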
CLASSES
You can create your own type or class in Python.
Figure 377

In this section, you'll create a class. The class has data attributes. The class has methods. We
then create instances of that class, also called objects. The class data attributes define the
class.
Figure 378
Let's create two classes:

• The first class will be a circle
• The second will be a rectangle.
Let's think about what constitutes a circle: Examining this image, all we need is a radius to define
a circle, and let's add color to make it easier to distinguish between different instances of the
class later. Therefore, our class data attributes are radius and color.
Similarly, examining the image in order to define a rectangle, we need the height and width. We
will also add color to distinguish between instances later. Therefore, the data attributes are
color, height, and width.
Figure 379

CREATE A CLASS: CIRCLE
To create the class circle, you will need to include the class definition. This tells Python you're
creating your own class, and it is followed by the name of the class. For this course, you will
always place the term object in parentheses; this is the parent of the class.
Figure 380
For the class rectangle, we changed the name of the class, but the rest is kept the same.
Figure 381

Defining a Class
Classes are outlines; we have to set the attributes to create objects.
Figure 382
Attributes and Objects
We can create an object that is an instance of type circle. The color data attribute is red, and the
data attribute radius is four. We could also create a second object that is an instance of type
circle. In this case, the color data attribute is green, and the data attribute radius is two.
Figure 383
We can also create an object that is an instance of type rectangle. The color data attribute is
blue, and the data attribute of height and width is two. The second object is also an instance of
type rectangle. In this case, the color data attribute is yellow, and the height is one, and the width
is three.
Figure 384

Instances of a Class: Objects
We now have different objects of class circle or type circle. We also have different objects of
class rectangle or type rectangle.
Figure 385
Let us continue building the circle class in Python. We define our class. We then initialize each
instance of the class with data attributes, radius, and color using the class constructor.
Figure 386
The self parameter refers to the newly created instance of the class. The parameters radius and
color can be used in the constructor's body to access the values passed to the class constructor
when the object is constructed. We can set the value of the radius and color data attributes to
the values passed to the constructor method.
The function __init__ is a constructor. It's a special function that Python calls to initialize
each new instance of the class. There are other special functions in Python to make more complex
classes. The radius and color parameters are used to initialize the radius and color data
attributes of the class instance.
Figure 387
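The class definition and constructor described above might look like the following sketch. The class name Circle and the variable red_circle are assumptions for this example; the values come from the red circle of radius four mentioned earlier.

```python
class Circle(object):
    # __init__ initializes each new instance with its data attributes.
    def __init__(self, radius, color):
        self.radius = radius   # value passed to the constructor
        self.color = color     # value passed to the constructor

# Creating an object calls __init__ with self bound to the new instance.
red_circle = Circle(4, "red")
print(red_circle.radius, red_circle.color)
```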

CREATE A CLASS: RECTANGLE
Similarly, we can define the class rectangle in Python. The name of the class is different.
This time, the class data attributes are color, height, and width.
Figure 388
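A corresponding sketch for the rectangle is shown here. The class name Rectangle, the parameter order and the variable blue_rectangle are assumptions; the attribute values match the blue rectangle described earlier.

```python
class Rectangle(object):
    # The order of the parameters is an assumption made for this sketch.
    def __init__(self, color, height, width):
        self.color = color
        self.height = height
        self.width = width

blue_rectangle = Rectangle("blue", 2, 2)
print(blue_rectangle.color, blue_rectangle.height, blue_rectangle.width)
```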
CREATE AN INSTANCE OF A CLASS: CIRCLE

After we've created the class, in order to create an object of class circle, we introduce a variable.
This will be the name of the object.
Figure 389
We create the object by using the object constructor.
Figure 390

The object constructor consists of the name of the class as well as the parameters. These are
the data attributes.
Figure 391
When we create a circle object, we call the code like a function.
Figure 392
The arguments passed to the circle constructor are used to initialize the data attributes
of the newly created circle instance.
Figure 393

It is helpful to think of self as a box that contains all the data attributes of the object.
Figure 394
Typing the object's name followed by a dot and the data attribute name gives us the data
attribute value, for example, radius. In this case, the radius is 10. We can do the same for color.
Figure 395
We can see the relationship between the self parameter and the object.
Figure 396
In Python, we can also set or change a data attribute directly: type the object's name
followed by a dot and the data attribute name, and set it equal to the new value. We
can then verify that the color data attribute has changed. Usually, however, we change the data
in an object through methods defined in the class.

Figure 397
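Reading and changing data attributes with dot notation might look like the sketch below. The object name c1 is an assumption made for illustration; the radius of 10 matches the value quoted in the text.

```python
class Circle:
    def __init__(self, radius, color):
        self.radius = radius
        self.color = color


c1 = Circle(10, "red")

# Read data attributes with object.attribute
print(c1.radius)  # prints: 10
print(c1.color)   # prints: red

# Change a data attribute by assigning to it directly
c1.color = "blue"
print(c1.color)   # prints: blue
```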

METHODS
Let's discuss methods.
Figure 398
We have seen how data attributes consist of the data defining the objects. Methods are functions
that interact with the object, using or changing its data attributes.
Figure 399
Let's say we would like to change the size of a circle. This involves changing the radius attribute.
Figure 400

We add a method, add_radius, to the class circle. A method is a function that requires self as
its first parameter and may take other parameters as well. In this case, we are going to add a
value to the radius; we denote that value as r. We add r to the data attribute radius.
Figure 401
Let's see how this part of the code works when we create an object and call the add_radius
method.
Figure 402
As before, we create an object with the object constructor. We pass two arguments to the
constructor. The radius is set to two and the color is set to red. In the constructor's body, the data
attributes are set.
Figure 403
We can use the box analogy to see the current state of the object. We call the method by adding
a dot followed by the method name and parentheses. In this case, the argument of the function
is the amount we would like to add. We do not need to worry about the self parameter when
calling the method. Just like with the constructor, Python will take care of that for us. In many
cases, there may not be any parameters other than self specified in the method's definition, so
we don't pass any arguments when calling the method.
Figure 404
Internally, the method is called with a value of 8, and the proper self object.
Figure 405
The method assigns a new value to self.radius. This changes the object, in particular, the
radius data attribute.
Figure 406

When we call the add_radius method, this changes the object by changing the value of the
radius data attribute.
Figure 407
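The walkthrough above (a circle created with radius 2, then add_radius called with 8) can be sketched as follows. The object name c1 is an assumption; the values come from the text.

```python
class Circle:
    def __init__(self, radius, color):
        self.radius = radius
        self.color = color

    def add_radius(self, r):
        # Python supplies self automatically; the caller passes only r
        self.radius = self.radius + r


c1 = Circle(2, "red")
c1.add_radius(8)   # internally called with self=c1 and r=8
print(c1.radius)   # prints: 10
```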
We can add default values to the parameters of a class's constructor.
Figure 408
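A short sketch of constructor defaults follows. The default values (radius 3, color 'blue') are borrowed from the lab's Circle class and are an assumption about what the figure shows; the variable names a, b, and c are illustrative.

```python
class Circle:
    # Parameters with defaults may be omitted when creating the object
    def __init__(self, radius=3, color="blue"):
        self.radius = radius
        self.color = color


a = Circle()              # uses both defaults
b = Circle(radius=10)     # overrides radius only
c = Circle(5, "green")    # overrides both, by position

print(a.radius, a.color)  # prints: 3 blue
print(b.radius, b.color)  # prints: 10 blue
print(c.radius, c.color)  # prints: 5 green
```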
In the labs, we also create the method called drawCircle. See the lab for the implementation
of drawCircle.
Figure 409

In the labs, we can create a new object of type circle using the constructor. The color will be red
and the radius will be 3. We can access the data attribute radius. We can access the attribute
color. Finally, we can use the method drawCircle to draw the circle.
Figure 410
Similarly, we can create a new object of type circle. We can access the data attribute of radius.
We can access the data attribute color. We can use the method drawCircle to draw the circle.
Figure 411
In summary, (1) we have created an object of class circle called RedCircle with a radius attribute
of 3, and a color attribute of red. (2) We also created an object of class circle called BlueCircle,
with a radius attribute of 10 and a color attribute of blue.
Figure 412

In the lab, we have a similar class for rectangle. We can create a new object of type rectangle using
the constructor. We can access a data attribute of height. We can also access the data attribute
of width. We could do the same for the data attribute of color. We can use the method
drawRectangle to draw the rectangle.
Figure 413
So, we have a class, an object that is a realization or instantiation of that class. For example, we
can create two objects of class Circle, or two objects of class Rectangle.
Figure 414
The dir function is useful for obtaining the list of data attributes and methods associated with an
object. The object you're interested in is passed as an argument. The return value is a list of the
names of the object's data attributes and methods. The attributes surrounded by underscores are for
internal use, and you shouldn't have to worry about them. The regular-looking attributes are the
ones you should concern yourself with. These are the object's methods and data attributes. There is
a lot more you can do with objects in Python. Check Python.org for more info.
Figure 415
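One way to focus on the regular-looking names is to filter out everything that starts with an underscore. This is a small sketch rather than part of the course material; the name public_names is hypothetical.

```python
class Circle:
    def __init__(self, radius, color):
        self.radius = radius
        self.color = color

    def add_radius(self, r):
        self.radius = self.radius + r


c1 = Circle(3, "red")

# dir() returns a sorted list of names; skip the underscore-prefixed ones
public_names = [name for name in dir(c1) if not name.startswith("_")]
print(public_names)  # prints: ['add_radius', 'color', 'radius']
```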

READING: OBJECTS AND CLASSES
PYTHON OBJECTS AND CLASSES
OBJECTIVES
In this reading, you will learn about:
• Fundamental concepts of Python objects and classes.
• Structure of classes and object code.
• Real-world examples related to objects and classes.
INTRODUCTION TO CLASSES AND OBJECTS

Python is an object-oriented programming (OOP) language that uses a paradigm centered
around objects and classes. Let's look at these fundamental concepts.
CLASSES
A class is a blueprint or template for creating objects. It defines the structure and behavior that
its objects will have.
Think of a class as a cookie cutter and objects as the cookies cut from that template. In Python,
you can create classes using the class keyword.
CREATING CLASSES
When you create a class, you specify the attributes (data) and methods (functions) that objects
of that class will have.
Attributes are defined as variables within the class, and methods are defined as functions.
For example, you can design a "Car" class with attributes such as "color" and "speed," along with
methods like "accelerate."
OBJECTS
An object is a fundamental unit in Python that represents a real-world entity or concept.
Objects can be tangible (like a car) or abstract (like a student's grade). Every object has two main
characteristics:
State
The attributes or data that describe the object. For your "Car" object, this might include attributes
like "color", "speed", and "fuel level".
Behavior
The actions or methods that the object can perform. In Python, methods are functions that
belong to objects and can change the object's state or perform specific operations.

INSTANTIATING OBJECTS
• Once you've defined a class, you can create individual objects (instances) based on that
class.
• Each object is independent and has its own set of attributes and methods.
• To create an object, you use the class name followed by parentheses, so: "my_car =
Car()"
INTERACTING WITH OBJECTS
You interact with objects by calling their methods or accessing their attributes using dot notation.
For example, if you have a Car object named my_car, you can set its color with
my_car.color = "blue" and accelerate it with my_car.accelerate() if there's an accelerate method
defined in the class.
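The interaction described above can be sketched with a deliberately tiny Car class. The default values and the fixed step of 10 inside accelerate are assumptions made only for this illustration.

```python
class Car:
    def __init__(self, color="white", speed=0):
        self.color = color
        self.speed = speed

    def accelerate(self):
        # Hypothetical behavior: each call adds 10 to the speed
        self.speed += 10


my_car = Car()           # instantiate with the class name and parentheses
my_car.color = "blue"    # set an attribute with dot notation
my_car.accelerate()      # call a method with dot notation
print(my_car.color, my_car.speed)  # prints: blue 10
```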
STRUCTURE OF CLASSES AND OBJECT CODE

Please don't directly copy and use this code because it is a template for explanation and not for
specific results.
Class declaration (class ClassName)
• The class keyword is used to declare a class in Python.

• ClassName is the name of the class, typically following CamelCase naming
conventions.
class ClassName:
Class attributes (class_attribute = value)
• Class attributes are variables shared among all class instances (objects).
• They are defined within the class but outside of any methods.
class ClassName:
    # Class attributes (shared by all instances)
    class_attribute = value
Constructor method (def __init__(self, attribute1, attribute2, …):)
• The __init__ method is a special method known as the constructor. It initializes the
instance attributes (also called instance variables) when an object is created.
• The self parameter is the first parameter of the constructor, referring to the instance
being created.
• attribute1, attribute2, and so on are parameters passed to the constructor when
creating an object.
• Inside the constructor, self.attribute1, self.attribute2, and so on are used to assign
values to instance attributes.

class ClassName:
    # Constructor method (initialize instance attributes)
    def __init__(self, attribute1, attribute2, ...):
        pass  # ...
Instance attributes (self.attribute1 = attribute1)
• Instance attributes are variables that store data specific to each class instance.
• They are initialized within the __init__ method using the self keyword followed by
the attribute name.
• These attributes hold unique data for each object created from the class.
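class ClassName:
    # Class attributes (shared by all instances)
    class_attribute = value
    # Constructor method (initialize instance attributes)
    def __init__(self, attribute1, attribute2, ...):
        self.attribute1 = attribute1
        self.attribute2 = attribute2
        # ...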
Instance methods (def method1(self, parameter1, parameter2, …):)
• Instance methods are functions defined within the class.

• They operate on the instance's data (instance attributes) and can perform actions
specific to instances.
• The self parameter is required in instance methods, allowing them to access instance
attributes and call other methods within the class.
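class ClassName:
    # Class attributes (shared by all instances)
    class_attribute = value
    # Constructor method (initialize instance attributes)
    def __init__(self, attribute1, attribute2, ...):
        self.attribute1 = attribute1
        self.attribute2 = attribute2
        # ...
    # Instance methods (functions)
    def method1(self, parameter1, parameter2, ...):
        # Method logic
        pass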
Using the same steps you can define multiple instance methods.
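class ClassName:
    # Class attributes (shared by all instances)
    class_attribute = value
    # Constructor method (initialize instance attributes)
    def __init__(self, attribute1, attribute2, ...):
        self.attribute1 = attribute1
        self.attribute2 = attribute2
        # ...
    # Instance methods (functions)
    def method1(self, parameter1, parameter2, ...):
        # Method logic
        pass
    def method2(self, parameter1, parameter2, ...):
        # Method logic
        pass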
Note: Now, you have successfully created a dummy class.

Creating objects (Instances)
• To create objects (instances) of the class, you call the class like a function and provide
arguments the constructor requires.
• Each object is a distinct instance of the class, with its own instance attributes and the
ability to call methods defined in the class.
# Create objects (instances) of the class
object1 = ClassName(arg1, arg2, ...)
object2 = ClassName(arg1, arg2, ...)
Calling methods on objects
• In this section, you will call methods on objects, specifically object1 and object2.
• The methods method1 and method2 are defined in the ClassName class, and you're
calling them on object1 and object2 respectively.
• You pass values param1_value and param2_value as arguments to these methods.
These arguments are used within the method's logic.
Method 1: Using dot notation

• This is the most straightforward way to call an object's method. In this, use the dot
notation (object.method()) to invoke the method on the object directly.
• For example, result1 = object1.method1(param1_value, param2_value, ...)
calls method1 on object1.
# Calling methods on objects

# Method 1: Using dot notation
result1 = object1.method1(param1_value, param2_value, ...)
result2 = object2.method2(param1_value, param2_value, ...)
Method 2: Assigning object methods to variables

• Here's an alternative way to call an object's method by assigning the method reference
to a variable.
• method_reference = object1.method1 assigns the method method1 of object1 to
the variable method_reference.
• Later, call the method using the variable like this:
result3 = method_reference(param1_value, param2_value, …).
# Method 2: Assigning object methods to variables

method_reference = object1.method1  # Assign the method to a variable
result3 = method_reference(param1_value, param2_value, ...)
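As a concrete, runnable illustration of the two calling styles, the sketch below uses a small Circle class. The class and the names c1 and grow are assumptions for illustration and are not part of the template.

```python
class Circle:
    def __init__(self, radius):
        self.radius = radius

    def add_radius(self, r):
        self.radius += r
        return self.radius


c1 = Circle(2)

# Method 1: dot notation
result1 = c1.add_radius(3)
print(result1)  # prints: 5

# Method 2: store the bound method (no parentheses), then call it later
grow = c1.add_radius
result2 = grow(3)          # same effect as c1.add_radius(3)
print(result2, c1.radius)  # prints: 8 8
```

The stored reference stays bound to c1, so calling grow changes c1 and no other object.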
Accessing object attributes

• Here, you are accessing an object's attribute using dot notation.
• attribute_value = object1.attribute1 retrieves the value of the attribute attribute1
from object1 and assigns it to the variable attribute_value.

# Accessing object attributes
attribute_value = object1.attribute1 # Access the attribute using dot notation
Modifying object attributes

• You will modify an object's attribute using dot notation.
• object1.attribute2 = new_value sets the attribute attribute2 of object1 to the new value
new_value.
# Modifying object attributes

object1.attribute2 = new_value # Change the value of an attribute using dot notation
Accessing class attributes (shared by all instances)

• Finally, access a class attribute shared by all class instances.
• class_attr_value = ClassName.class_attribute accesses the class attribute
class_attribute from the ClassName class and assigns its value to the variable
class_attr_value.
# Accessing class attributes (shared by all instances)
class_attr_value = ClassName.class_attribute
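The difference between a class attribute and an instance attribute is easiest to see by changing each in turn. The following is a minimal sketch with a hypothetical Car class and hypothetical object names car_a and car_b.

```python
class Car:
    max_speed = 120  # class attribute, shared by all instances

    def __init__(self, make):
        self.make = make  # instance attribute, unique to each object


car_a = Car("Toyota")
car_b = Car("Honda")

# Instances fall back to the class attribute when they have none of their own
print(Car.max_speed, car_a.max_speed, car_b.max_speed)  # prints: 120 120 120

# Changing the class attribute is visible through every instance
Car.max_speed = 150
print(car_a.max_speed, car_b.max_speed)  # prints: 150 150

# Assigning through an instance creates an instance attribute on that object only
car_a.max_speed = 100
print(car_a.max_speed, car_b.max_speed, Car.max_speed)  # prints: 100 150 150
```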
Real-world example
Let's write a Python program that simulates a simple car class, allowing you to create car
instances, accelerate them, and display their current speeds.
1. Let's start by defining a Car class that includes the following attributes and methods:
o Class attribute max_speed, which is set to 120 km/h.
o Constructor method __init__ that takes parameters for the car's make, model,
color, and an optional speed (defaulting to 0). This method initializes instance
attributes for make, model, color, and speed.
o Method accelerate(self, acceleration) that allows the car to accelerate. If the
current speed plus the acceleration does not exceed max_speed, add the acceleration
to the car's speed attribute. Otherwise, set the speed to max_speed.
o Method get_speed(self) that returns the current speed of the car.
class Car:
    # Class attribute (shared by all instances)
    max_speed = 120  # Maximum speed in km/h

    # Constructor method (initialize instance attributes)
    def __init__(self, make, model, color, speed=0):
        self.make = make
        self.model = model
        self.color = color
        self.speed = speed  # Initial speed defaults to 0

    # Method for accelerating the car
    def accelerate(self, acceleration):
        if self.speed + acceleration <= Car.max_speed:
            self.speed += acceleration
        else:
            self.speed = Car.max_speed

    # Method to get the current speed of the car
    def get_speed(self):
        return self.speed

2. Now, you will instantiate two objects of the Car class, each with the following
characteristics:
• car1: Make = "Toyota", Model = "Camry", Color = "Blue"
• car2: Make = "Honda", Model = "Civic", Color = "Red"
# Create objects (instances) of the Car class
car1 = Car("Toyota", "Camry", "Blue")
car2 = Car("Honda", "Civic", "Red")
3. Using the accelerate method, you will increase the speed of car1 by 30 km/h and car2 by 20
km/h.
# Accelerate the cars
car1.accelerate(30)
car2.accelerate(20)
4. Lastly, you will display the current speed of each car by utilizing the get_speed method.
# Print the current speeds of the cars

print(f"{car1.make} {car1.model} is currently at {car1.get_speed()} km/h.")
print(f"{car2.make} {car2.model} is currently at {car2.get_speed()} km/h.")
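The four steps can be combined into one self-contained script, sketched below. Two things here are additions for illustration and not part of the reading: accelerate uses min() as a compact equivalent of the if/else check, and the final accelerate(100) call is added to show the max_speed cap taking effect.

```python
class Car:
    max_speed = 120  # class attribute, in km/h

    def __init__(self, make, model, color, speed=0):
        self.make = make
        self.model = model
        self.color = color
        self.speed = speed

    def accelerate(self, acceleration):
        # Clamp the result so the speed never exceeds max_speed
        self.speed = min(self.speed + acceleration, Car.max_speed)

    def get_speed(self):
        return self.speed


car1 = Car("Toyota", "Camry", "Blue")
car2 = Car("Honda", "Civic", "Red")
car1.accelerate(30)
car2.accelerate(20)
print(car1.get_speed(), car2.get_speed())  # prints: 30 20

car1.accelerate(100)     # 30 + 100 would exceed 120, so the speed is capped
print(car1.get_speed())  # prints: 120
```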
Next steps
In conclusion, this reading provides a fundamental understanding of objects and classes in

Python, essential concepts in object-oriented programming. Classes serve as blueprints for
creating objects, encapsulating data attributes and methods. Objects represent real-world
entities and possess their unique state and behavior. The structured code example presented in
the reading outlines the key elements of a class, including class attributes, the constructor
method for initializing instance attributes, and instance methods for defining object-specific
functionality.
In the upcoming laboratory session, you can apply the concepts of objects and classes to gain
hands-on experience.

HANDS-ON LAB: OBJECTS AND CLASSES
CLASSES AND OBJECTS IN PYTHON
OBJECTIVES
• Work with classes and objects
• Identify and define attributes and methods
TABLE OF CONTENTS
• Introduction to Classes and Objects
• Creating a class
• Instances of a Class: Objects and Attributes
• Methods
• Creating a class
• Creating an instance of a class Circle
• The Rectangle Class
Introduction to Classes and Objects
Creating a Class
The first step in creating a class is giving it a name. In this notebook, we will create two classes:
Circle and Rectangle. We need to determine all the data that make up that class, which we call
attributes. Think about this step as creating a blueprint that we will use to create objects. In
figure 1 we see two classes, Circle and Rectangle. Each has its own attributes, which are variables.
The class Circle has the attributes radius and color, while the Rectangle class has the attributes
height and width. Let’s use the visual examples of these shapes before we get to the code, as this
will help you get accustomed to the vocabulary.
Figure 1: Classes circle and rectangle, and each has their own attributes. The class Circle has the attribute radius and color,
the class Rectangle has the attributes height and width.

Instances of a Class: Objects and Attributes
An instance of a class is an object, the realization of that class, and in Figure 2 we see three
instances of the class circle. We give each object a name: red circle, yellow circle, and green
circle. Each object has different attributes, so let's focus on the color attribute for each object.
Figure 2: Three instances of the class Circle, or three objects of type Circle.
The color attribute for the red Circle is the color red, for the green Circle object the color attribute
is green, and for the yellow Circle the color attribute is yellow.
Methods
Methods give you a way to change or interact with the object; they are functions that interact
with objects. For example, let’s say we would like to increase the radius of a circle by a specified
amount. We can create a method called add_radius(r) that increases the radius by r. This is
shown in figure 3, where after applying the method to the "orange circle object", the radius of
the object increases accordingly. The “dot” notation means to apply the method to the object,
which is essentially applying a function to the information in the object.
Figure 3: Applying the method “add_radius” to the object orange circle object.

Creating a Class
Now we are going to create a class Circle, but first, we are going to import a library to draw the
objects:
# Import the library
import matplotlib.pyplot as plt

%matplotlib inline
1. The first step in creating your own class is to use the class keyword, then the name
of the class as shown in Figure 4. In this course the class parent will always be object:
Figure 4: Creating a class Circle.
2. The next step is a special method called a constructor, __init__, which is used to initialize the
object. The inputs are data attributes. The term self refers to the object itself and gives access
to all of its attributes. For example, self.color gives the value of the attribute color and
self.radius gives the radius of the object. We also have the method add_radius() with the
parameter r; the method adds the value of r to the attribute radius. To access the radius we use
the syntax self.radius. The labeled syntax is summarized in Figure 5:
Figure 5: Labeled syntax of the object circle.
The actual object is shown below. We include the method drawCircle to display the image of a
circle. We set the default radius to 3 and the default color to blue:
# Create a class Circle

class Circle(object):

    # Constructor
    def __init__(self, radius=3, color='blue'):
        self.radius = radius
        self.color = color

    # Method
    def add_radius(self, r):
        self.radius = self.radius + r
        return self.radius

    # Method
    def drawCircle(self):
        plt.gca().add_patch(plt.Circle((0, 0), radius=self.radius, fc=self.color))
        plt.axis('scaled')
        plt.show()
Creating an instance of a class Circle
Let’s create the object RedCircle of type Circle to do the following:
# Create an object RedCircle

RedCircle = Circle(10, 'red')
We can use the dir command to get a list of the object's methods. Many of them are default
Python methods.
# Find out the methods that can be used on the object RedCircle

dir(RedCircle)
['__class__',
'__delattr__',
'__dict__',
'__dir__',
'__doc__',
'__eq__',
'__format__',
'__ge__',
'__getattribute__',
'__getstate__',
'__gt__',
'__hash__',
'__init__',
'__init_subclass__',
'__le__',
'__lt__',
'__module__',
'__ne__',
'__new__',
'__reduce__',
'__reduce_ex__',
'__repr__',
'__setattr__',
'__sizeof__',
'__str__',
'__subclasshook__',
'__weakref__',
'add_radius',
'color',
'drawCircle',
'radius']
We can look at the data attributes of the object:
# Print the object attribute radius

RedCircle.radius
10
# Print the object attribute color

RedCircle.color
'red'

We can change the object's data attributes:
# Set the object attribute radius

RedCircle.radius = 1
RedCircle.radius
1
We can draw the object by using the method drawCircle():
# Call the method drawCircle

RedCircle.drawCircle()
Figure 416
We can increase the radius of the circle by applying the method add_radius(). Let's increase the
radius by 2 and then by 5:
# Use method to change the object attribute radius
print('Radius of object:', RedCircle.radius)
RedCircle.add_radius(2)
print('Radius of object after applying the method add_radius(2):', RedCircle.radius)
RedCircle.add_radius(5)
print('Radius of object after applying the method add_radius(5):', RedCircle.radius)
Radius of object: 1
Radius of object after applying the method add_radius(2): 3
Radius of object after applying the method add_radius(5): 8
Let’s create a blue circle. As the default color is blue, all we have to do is specify what the radius
is:
# Create a blue circle with a given radius

BlueCircle = Circle(radius=100)
As before, we can access the attributes of the instance of the class by using the dot notation:
# Print the object attribute radius

BlueCircle.radius
100

BlueCircle.color
'blue'

We can draw the object by using the method drawCircle():
# Call the method drawCircle

BlueCircle.drawCircle()
Figure 417
Compare the x and y axis of the figure to the figure for RedCircle; they are different.
The Rectangle Class
Let's create a class rectangle with the attributes of height, width, and color. We will only add the
method to draw the rectangle object:
# Create a new Rectangle class for creating a rectangle object
class Rectangle(object):

    # Constructor
    def __init__(self, width=2, height=3, color='r'):
        self.height = height
        self.width = width
        self.color = color

    # Method
    def drawRectangle(self):
        plt.gca().add_patch(plt.Rectangle((0, 0), self.width, self.height, fc=self.color))
        plt.axis('scaled')
        plt.show()
Let’s create the object SkinnyBlueRectangle of type Rectangle. Its width will be 2 and height will
be 3, and the color will be blue:
# Create a new object rectangle

SkinnyBlueRectangle = Rectangle(2, 3, 'blue')
As before we can access the attributes of the instance of the class by using the dot notation:
# Print the object attribute height

SkinnyBlueRectangle.height
# Print the object attribute width

SkinnyBlueRectangle.width

SkinnyBlueRectangle.color
'blue'
We can draw the object:
# Use the drawRectangle method to draw the shape
SkinnyBlueRectangle.drawRectangle()
Figure 418
Let’s create the object FatYellowRectangle of type Rectangle:
# Create a new object rectangle

FatYellowRectangle = Rectangle(20, 5, 'yellow')
We can access the attributes of the instance of the class by using the dot notation:
# Print the object attribute height

FatYellowRectangle.height
5
# Print the object attribute width

FatYellowRectangle.width
20

FatYellowRectangle.color
'yellow'
We can draw the object:
# Use the drawRectangle method to draw the shape
FatYellowRectangle.drawRectangle()
Figure 419

Scenario: Car dealership's inventory management system
You are working on a Python program to simulate a car dealership's inventory management
system. The system aims to model cars and their attributes accurately.
Task-1. You are tasked with creating a Python program to represent vehicles using a class. Each
car should have attributes for maximum speed and mileage.

class Vehicle:
    def __init__(self, max_speed, mileage):
        self.max_speed = max_speed
        self.mileage = mileage
Task-2. Update the class with the default color for all vehicles, "white".

class Vehicle:
    color = "white"
Task-3. Additionally, you need to create methods in the Vehicle class to assign seating capacity
to a vehicle.

class Vehicle:
    color = "white"

    def __init__(self, max_speed, mileage):
        self.max_speed = max_speed
        self.mileage = mileage
        self.seating_capacity = None

    def assign_seating_capacity(self, seating_capacity):
        self.seating_capacity = seating_capacity
Task-4. Create a method to display all the properties of an object of the class.

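class Vehicle:
    color = "white"

    def __init__(self, max_speed, mileage):
        self.max_speed = max_speed
        self.mileage = mileage
        self.seating_capacity = None

    def assign_seating_capacity(self, seating_capacity):
        self.seating_capacity = seating_capacity

    def display_properties(self):
        print("Properties of the Vehicle:")
        print("Color:", self.color)
        print("Maximum Speed:", self.max_speed)
        print("Mileage:", self.mileage)
        print("Seating Capacity:", self.seating_capacity)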
Task-5. Additionally, you need to create two objects of the Vehicle class. One object should have
a max speed of 200kph and mileage of 50000kmpl with a seating capacity of five, and the other
should have a max speed of 180kph and mileage of 75000kmpl with a seating capacity of four.
class Vehicle:
    color = "white"

    def __init__(self, max_speed, mileage):
        self.max_speed = max_speed
        self.mileage = mileage
        self.seating_capacity = None

    def assign_seating_capacity(self, seating_capacity):
        self.seating_capacity = seating_capacity

    def display_properties(self):
        print("Properties of the Vehicle:")
        print("Color:", self.color)
        print("Maximum Speed:", self.max_speed)
        print("Mileage:", self.mileage)
        print("Seating Capacity:", self.seating_capacity)

# Creating objects of the Vehicle class
vehicle1 = Vehicle(200, 50000)
vehicle1.assign_seating_capacity(5)
vehicle1.display_properties()

vehicle2 = Vehicle(180, 75000)
vehicle2.assign_seating_capacity(4)
vehicle2.display_properties()
The last exercise!

PRACTICE QUIZ: OBJECTS AND CLASSES
Question 1
Which of the following statements will create an object ‘C1’ for the class that uses radius as 4 and
color as ‘yellow’?
# Constructor
self.color = color
# Method
return self.radius
o C1 = Circle(‘yellow’,4)
o C1.radius = Circle.radius(4)
o C1.color = Circle.color(‘yellow’)
o C1 = Circle()
o C1 = Circle(4, ‘yellow’)
Correct! C1 = Circle(4, ‘yellow’) correctly creates an instance of the Circle class with C1 having
a radius of 4 and its color set to ‘yellow.’
Question 2
Consider the execution of the following lines of code.
CircleObject = Circle()
CircleObject.radius = 10
What are the values of the radius and color attributes for the CircleObject after their execution?
# Constructor
self.color = color
# Method
return self.radius
o 3, 'red'
o 10, ‘red’
o 10, 'blue'
o 3, ‘blue’
Correct! The radius attribute is updated to 10 while the color attribute is kept as
default ‘blue.’

Question 3
What is the color attribute of the object V1?
class Vehicle:
color = "white"


V1 = Vehicle(150, 25)
o Error in creation of object

o 'white'
o 25
o 150
Correct! The default setting for the ‘color’ attribute is ‘white,’ eliminating the need to
pass it while creating the object.
Question 4
Which of the following options would create an appropriate object that points to a red, 5-
seater vehicle with a maximum speed of 200kmph and a mileage of 20kmpl?
class Vehicle:
color = "white"


V1 = Vehicle(150, 25)
o Car = Vehicle(200,20)
Car.assign_seating_capacity(5)
Car.color = ‘red’
Car.assign_seating_capacity(5)
Car.color = ‘red’
o Car = Vehicle(200, 20)
Correct! All attributes are correctly assigned here.

Question 5
What is the value printed upon execution of the code shown below?
class Graph():
    def __init__(self, id):
        self.id = id
        self.id = 80

val = Graph(200)
print(val.id)
o 0
o 200
o Invalid Syntax
o 80
Correct! The value of the attribute is overwritten to 80 every time the object is created,
irrespective of the value of the attribute passed.

PRACTICE LAB: TEXT ANALYSIS
SCENARIO: TEXT ANALYSIS
WHAT IS TEXT ANALYSIS?

Text analysis, also known as text mining or text analytics, refers to the process of extracting
meaningful information and insights from textual data.
OBJECTIVES
After completing this lab, you will be able to use Python commands to perform text analysis. This
includes converting the text to lowercase and then finding and counting the frequency of all
unique words, as well as a specified word.
SETUP
For this lab, we will be using the following data types:
• List
• Strings
• Classes and objects
Let's consider a real-life scenario where you are analyzing customer feedback for a product. You
have a large dataset of customer reviews in the form of strings, and you want to extract useful
information from them using the three identified tasks:
Task 1. String in lower case: You want to pre-process the customer feedback by converting all
the text to lowercase. This step helps standardize the text. Lowercasing the text allows you to
focus on the content rather than the specific letter casing.
Task 2. Frequency of all words in a given string: After converting the text to lowercase, you
want to determine the frequency of each word in the customer feedback. This information will
help you identify which words are used more frequently, indicating the key aspects or topics that
customers are mentioning in their reviews. By analyzing the word frequencies, you can gain
insights into the most common issues raised by customers.
Task 3. Frequency of a specific word: In addition to analyzing the overall word frequencies, you
want to specifically track the frequency of a particular word that is relevant to your analysis. For
example, you might be interested in monitoring how often the word "reliable" appears in the
customer reviews to gauge customer sentiment about the product's reliability. By focusing on
the frequency of a specific word, you can gain a deeper understanding of customer opinions or
preferences related to that particular aspect.

By performing these tasks on the customer feedback dataset, you can gain valuable insights into
customer sentiment.
PART-A
Note:- In Part-A, you won't be getting any output as we are just storing the string and
creating a class.
Step-1 Define a string.
"Lorem ipsum dolor! diam amet, consetetur Lorem magna. sed diam nonumy eirmod tempor. diam et
labore? et diam magna. et diam amet."
Hint:- Use a variable and store the above string.
#Press Shift+Enter to run the code

givenstring = "Lorem ipsum dolor! diam amet, consetetur Lorem magna. sed diam nonumy eirmod tempor. diam et labore? et diam magna. et diam amet."
To achieve the tasks mentioned in the scenario, we need to create a class with 3 different
methods.
Step-2 Define the class and its attributes:
1. Create a class named TextAnalyzer.

2. Define the constructor __init__ method that takes a text argument.
# Please do not run this code cell as it is incomplete and will produce an error.
# Let's create a class called TextAnalyzer to analyze text.

class TextAnalyzer(object):
    # The __init__ method initializes the class with a 'text' parameter.
    # We will store the provided 'text' as an instance variable.
    def __init__(self, text):
Step-3 Implement code to format the text in lowercase:
1. Inside the constructor, we will convert the text argument to lowercase using the
lower() method.
2. Then, we will remove punctuation marks (periods, exclamation marks, commas,
and question marks) from the text using the replace() method.
3. Finally, we will assign the formatted text to a new attribute called fmtText.
Here we will update the above TextAnalyzer class with the points mentioned above.
# Press Shift+Enter to run the code.

class TextAnalyzer(object):
    def __init__(self, text):
        # remove punctuation
        formattedText = text.replace('.', '').replace('!', '').replace('?', '').replace(',', '')
        # make text lowercase
        formattedText = formattedText.lower()
        # store the formatted text as an attribute
        self.fmtText = formattedText

Step-4 Implement code to count the frequency of all unique words:
• In this step, we will implement the freqAll() method with the following steps:
1. Split the fmtText attribute into individual words using the split() method.
2. Create an empty dictionary to store the word frequency.
3. Iterate over the unique words and update the frequency dictionary accordingly.
4. Use the count() method to count the occurrences of each word.
5. Return the frequency dictionary.
Update the above TextAnalyzer class with the points mentioned above.
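# Press Shift+Enter to run the code.

class TextAnalyzer(object):
    def __init__(self, text):
        # remove punctuation
        formattedText = text.replace('.', '').replace('!', '').replace('?', '').replace(',', '')
        # make text lowercase
        formattedText = formattedText.lower()
        self.fmtText = formattedText

    def freqAll(self):
        # split text into words
        wordList = self.fmtText.split(' ')
        # create dictionary
        freqMap = {}
        for word in set(wordList):  # use set to remove duplicates in list
            freqMap[word] = wordList.count(word)
        return freqMap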
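As an aside, the same word counts can be produced with collections.Counter from the standard library. The sketch below is not part of the lab: the function name word_frequencies and the sample string are hypothetical, and it calls split() with no argument so that repeated spaces do not produce empty-string entries.

```python
from collections import Counter

def word_frequencies(text):
    """Return a dict mapping each word in text to its count (hypothetical helper)."""
    # Same cleanup as the lab: drop . ! ? , and lowercase the text
    for mark in ".!?,":
        text = text.replace(mark, "")
    words = text.lower().split()  # split() with no argument collapses runs of whitespace
    return dict(Counter(words))

sample = "Lorem ipsum dolor! Lorem, ipsum? lorem."
freq = word_frequencies(sample)
print(freq)
```

Counter makes a single pass over the word list, whereas calling wordList.count(word) once per unique word rescans the list each time. For a short string like the one in this lab the difference is negligible.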
Step-5 Implement code to count the frequency of a specific word:
• In Step-5, we will implement the freqOf(word) method, which takes a word argument:
1. Create the method and pass it the word to be found.
2. Call the freqAll() method to get the frequency dictionary and check whether the word is in it.
3. Return the count, or 0 if the word is not present.
Update the TextAnalyzer class above with the points just listed.
# Press Shift+Enter to run the code.

class TextAnalyzer(object):
    def __init__(self, text):
        # remove punctuation
        formattedText = text.replace('.', '').replace('!', '').replace('?', '').replace(',', '')
        # make text lowercase
        formattedText = formattedText.lower()
        self.fmtText = formattedText

    def freqAll(self):
        # split text into words
        wordList = self.fmtText.split(' ')
        # create dictionary
        freqMap = {}
        for word in set(wordList):  # use set to remove duplicates in list
            freqMap[word] = wordList.count(word)
        return freqMap

    def freqOf(self, word):
        # get frequency map
        freqDict = self.freqAll()
        if word in freqDict:
            return freqDict[word]
        else:
            return 0
Now, we have successfully created a class with 3 methods.
PART-B
In Part B, we will be calling the functions created in Part A, allowing the functions to execute and
generate output.
Step-1 Create an instance of TextAnalyzer Class.
Instantiate the TextAnalyzer class by passing the given string as an argument.
# type your code here

analyzed = TextAnalyzer(givenstring)
Step-2 Call the function that converts the data into lowercase
# Press Shift+Enter to run the code.
print("Formatted Text:", analyzed.fmtText)

Formatted Text: lorem ipsum dolor diam amet consetetur lorem magna sed diam nonumy eirmod tempor
diam et labore et diam magna et diam amet
We have successfully converted the string into lowercase.
Step-3 Call the function that counts the frequency of all unique words from
the data.
# Press Shift+Enter to run the code.

freqMap = analyzed.freqAll()
print(freqMap)
{'consetetur': 1, 'tempor': 1, 'et': 3, 'eirmod': 1, 'dolor': 1, 'lorem': 2, 'amet': 2, 'magna':
2, 'sed': 1, 'diam': 5, 'nonumy': 1, 'labore': 1, 'ipsum': 1}
We have successfully calculated the frequency of all unique words in the string.
Step-4 Call the function that counts the frequency of a specific word.
Here, we will call the function that counts the frequency of the word "lorem" and
print the output.
# type your code here
word = "lorem"

frequency = analyzed.freqOf(word)

print("The word",word,"appears",frequency,"times.")
The word lorem appears 2 times.
We have successfully calculated the frequency of the specified word.

MODULE 3 SUMMARY: PYTHON PROGRAMMING FUNDAMENTALS
• Python conditions use “if” statements to execute code based on true/false
conditions created by comparisons and Boolean expressions.
• Comparison operations use comparison operators such as equal to "==", greater
than ">", and less than "<".
• The not-equal operator "!=" (an exclamation mark followed by an equals sign) tests whether
two values are different.
• You can compare integers, strings, and floats.
• Python branching directs program flow by using conditional statements (for example,
if, else, elif) to execute different code blocks based on conditions or tests.
• You can use the "if" statement with conditions to define actions if true.
• To perform actions based on true or false output, you can use the "else" statement with
conditions.
• The elif statement allows for additional checks only if the initial condition is false.
• To execute various operations on Boolean values, we use Boolean logic operators.
• Python loops are control structures that automate repetitive tasks and iterate over data
structures like lists or dictionaries.
• The range() function generates a sequence of numbers with a specified start, stop, and
step value for loops in Python.
• A for loop in Python iterates over a sequence, such as a list, tuple, or string, and
executes a block of code for each item in the sequence.
• A while loop in Python executes a block of code as long as a specified condition
remains true.
• Python functions are reusable code blocks that perform specific tasks, take input
parameters, and often return results, enhancing code modularity and reusability.
• The code inside the functions you use may have been written by you or by someone else.
• Python has a set of built-in functions such as "len" to find the length of a sequence or
"sum" to find the total sum of a sequence.
• The "sorted" function creates a new sorted list, while "sort" sorts items in the original
list.
• You can also create your own functions in Python.
• To ensure clarity and organization and facilitate understanding and maintenance of the
code, developers must document functions using a documentation string enclosed in
three quotes.
• The help command will return the documentation defined for a particular function.
• A function can have multiple parameters.
• A function with no return statement returns None.
• A function that does no work still needs a non-empty body. You can use the "pass" keyword to
meet that requirement.
• A function will usually perform more than one task.
• In Python, the scope of a variable determines where you can access or modify that
variable. Global scope allows access from anywhere, while local scope restricts it to a
block or function.
• In Python, a programmer defines a local variable within a specific block or
function, which can only be accessed or modified within that block or function.
• In Python, a global variable is a variable defined at the top level of a program that any
part of the code can access or modify.
• Exception handling in Python is a mechanism for managing and responding
to errors and exceptions that may occur during program execution, preventing them
from crashing the program.
• In Python, you use the "try-except" statement to attempt a block of code and specify
alternative actions to execute if an error occurs, allowing you to handle exceptions.
• In Python, you use the "try-except-else" statement to attempt a block of code, handle
exceptions in the "except" block, and execute code in the "else" block when no
exceptions occur.
• Python developers use the "try-except-else-finally" statement to attempt a block
of code, catch exceptions in the "except" block, execute code in the "else" block when
no exceptions occur, and ensure that the "finally" block always runs, regardless of
whether an exception is raised or not.
• In Python, objects are instances of classes that encapsulate data and
behavior, serving as the foundation for creating and working with various data types
and custom data structures.
• To determine the type of an object in Python, you can use the type() command.
• Any changes made within a method of the object may change the object's state, that is,
the values of its data attributes.
• Classes in Python are blueprints for creating objects, defining their attributes and
methods, enabling code organization, and object-oriented programming.
• The special method "__init__" is used to initialize data attributes.
• We can create instances of a class in Python.
• Data attributes consist of the data defining the objects.
• Methods are functions that interact with and change the data attributes.
• A method is a function that takes self as well as any other parameters.
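Two of the points above can be illustrated with a short sketch: the difference between sorted() and sort(), and the order in which the blocks of a try-except-else-finally statement run. The list values and the helper name safe_divide are hypothetical and chosen only for illustration.

```python
# sorted() returns a new list; list.sort() reorders the list in place and returns None
ratings = [7, 3, 9, 1]
ordered = sorted(ratings)      # ratings itself is unchanged at this point
unchanged = list(ratings)      # keep a copy to show that
ratings.sort()                 # now ratings itself is sorted

def safe_divide(a, b):
    """Divide a by b and record which blocks ran (hypothetical helper)."""
    log = []
    try:
        result = a / b
    except ZeroDivisionError:
        log.append("except")   # runs only when the division fails
        result = None
    else:
        log.append("else")     # runs only when no exception was raised
    finally:
        log.append("finally")  # runs in both cases
    return result, log

print(unchanged, ordered, ratings)
print(safe_divide(6, 3))
print(safe_divide(6, 0))
```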

READING: CHEAT SHEET - PYTHON PROGRAMMING FUNDAMENTALS
Package/Method Description Syntax and Code Example
AND
Description: Returns `True` if both statement1 and statement2 are `True`. Otherwise, returns `False`.
Syntax:
statement1 and statement2
Example:
marks = 90
attendance_percentage = 87
if marks >= 80 and attendance_percentage >= 85:
    print("qualify for honors")
else:
    print("Not qualified for honors")
# Output = qualify for honors
Class Definition
Description: Defines a blueprint for creating objects and defining their attributes and behaviors.
Syntax:
class ClassName:
    # Class attributes and methods
Example:
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age
Define Function
Description: A `function` is a reusable block of code that performs a specific task or set of tasks
when called.
Syntax:
def function_name(parameters):
    # Function body
Example:
def greet(name):
    print("Hello,", name)

Equal(==)
Description: Checks if two values are equal.
Syntax:
variable1 == variable2
Example 1:
5 == 5
returns True
Example 2:
age = 25
age == 30
returns False
For Loop
Description: A `for` loop repeatedly executes a block of code for a specified number of iterations
or over a sequence of elements (list, range, string, etc.).
Syntax:
for variable in sequence:
    # Code to repeat
Example 1:
for num in range(1, 10):
    print(num)
Example 2:
fruits = ["apple", "banana", "orange", "grape", "kiwi"]
for fruit in fruits:
    print(fruit)
Function Call
Description: A function call is the act of executing the code within the function using the provided
arguments.
Syntax:
function_name(arguments)
Example:
greet("Alice")
Greater Than or Equal To(>=)
Description: Checks if the value of variable1 is greater than or equal to variable2.
Syntax:
variable1 >= variable2
Example 1:
5 >= 5 and 9 >= 5
returns True
Example 2:
quantity = 105
minimum = 100
quantity >= minimum
returns True
Greater Than(>)
Description: Checks if the value of variable1 is greater than variable2.
Syntax:
variable1 > variable2
Example 1:
9 > 6
returns True
Example 2:
age = 20
max_age = 25
age > max_age
returns False
If Statement
Description: Executes code block `if` the condition is `True`.
Syntax:
if condition:
    # Code block for if statement
Example:
if temperature > 30:
    print("It's a hot day!")
If-Elif-Else
Description: Executes the first code block if condition1 is `True`, otherwise checks condition2,
and so on. If no condition is `True`, the else block is executed.
Syntax:
if condition1:
    # Code if condition1 is True
elif condition2:
    # Code if condition2 is True
else:
    # Code if no condition is True
Example:
score = 85  # Example score
if score >= 90:
    print("You got an A!")
elif score >= 80:
    print("You got a B.")
else:
    print("You need to work harder.")
# Output = You got a B.
If-Else Statement
Description: Executes the first code block if the condition is `True`, otherwise the second block.
Syntax:
if condition:
    # Code, if condition is True
else:
    # Code, if condition is False
Example:
if age >= 18:
    print("You're an adult.")
else:
    print("You're not an adult yet.")
Less Than or Equal To(<=)
Description: Checks if the value of variable1 is less than or equal to variable2.
Syntax:
variable1 <= variable2
Example 1:
5 <= 5 and 3 <= 5
returns True
Example 2:
size = 38
max_size = 40
size <= max_size
returns True

Less Than(<)
Description: Checks if the value of variable1 is less than variable2.
Syntax:
variable1 < variable2
Example 1:
4 < 6
returns True
Example 2:
score = 60
passing_score = 65
score < passing_score
returns True
Loop Controls
Description: `break` exits the loop prematurely. `continue` skips the rest of the current iteration
and moves to the next iteration.
Syntax:
for item in sequence:
    # Code to repeat
    if condition:  # Boolean statement
        break
for item in sequence:
    # Code to repeat
    if condition:  # Boolean statement
        continue
Example 1:
for num in range(1, 6):
    if num == 3:
        break
    print(num)
Example 2:
for num in range(1, 6):
    if num == 3:
        continue
    print(num)
NOT
Description: Returns `True` if variable is `False`, and vice versa.
Syntax:
not variable
Example:
not isLocked
returns True if the variable is False (i.e., unlocked).
Not Equal(!=)
Description: Checks if two values are not equal.
Syntax:
variable1 != variable2
Example 1:
a = 10
b = 20
a != b
returns True
Example 2:
count = 0
count != 0
returns False
Object Creation
Description: Creates an instance of a class (object) using the class constructor.
Syntax:
object_name = ClassName(arguments)
Example:
person1 = Person("Alice", 25)
OR
Description: Returns `True` if either statement1 or statement2 (or both) are `True`. Otherwise,
returns `False`.
Syntax:
statement1 or statement2
Example:
# Farewell Party Invitation
grade = 12
grade == 11 or grade == 12
returns True
range()
Description: Generates a sequence of numbers within a specified range.
Syntax:
range(stop)
range(start, stop)
range(start, stop, step)
Example:
range(5)  # generates a sequence of integers from 0 to 4.
range(2, 10)  # generates a sequence of integers from 2 to 9.
range(1, 11, 2)  # generates odd integers from 1 to 9.
Return Statement
Description: `return` is a keyword used to send a value back from a function to its caller.
Syntax:
return value
Example:
def add(a, b):
    return a + b
result = add(3, 5)
Try-Except Block
Description: Tries to execute the code in the try block. If an exception of the specified type
occurs, the code in the except block is executed.
Syntax:
try:
    # Code that might raise an exception
except ExceptionType:
    # Code to handle the exception
Example:
try:
    num = int(input("Enter a number: "))
except ValueError:
    print("Invalid input. Please enter a valid number.")
Try-Except with Else Block
Description: Code in the `else` block is executed if no exception occurs in the try block.
Syntax:
try:
    # Code that might raise an exception
except ExceptionType:
    # Code to handle the exception
else:
    # Code to execute if no exception occurs
Example:
try:
    num = int(input("Enter a number: "))
except ValueError:
    print("Invalid input. Please enter a valid number")
else:
    print("You entered:", num)
Try-Except with Finally Block
Description: Code in the `finally` block always executes, regardless of whether an exception occurred.
Syntax:
try:
    # Code that might raise an exception
except ExceptionType:
    # Code to handle the exception
finally:
    # Code that always executes
Example:
file = None
try:
    file = open("data.txt", "r")
    data = file.read()
except FileNotFoundError:
    print("File not found.")
finally:
    # Close the file only if it was actually opened
    if file is not None:
        file.close()
While Loop
Description: A `while` loop repeatedly executes a block of code as long as a specified condition
remains `True`.
Syntax:
while condition:
    # Code to repeat
Example:
count = 0
while count < 5:
    print(count)
    count += 1

READING: GLOSSARY: PYTHON PROGRAMMING FUNDAMENTALS
This comprehensive glossary also includes additional industry-recognized terms not used in
Term: Definition
Analogy: Refers to a concept or comparison outside the scope of the programming language itself,
used to explain or relate one concept to another in a more understandable way.
Attributes: Attributes in Python refer to the characteristics or properties of an object, and they
can be accessed using dot notation.
Branching: Branching in Python is a process of altering the flow of a program based on conditions,
typically using if, elif, and else statements.
Comparison operators: Comparison operators in Python are used to compare values and return Boolean
results (True or False), including operators like == (equal), != (not equal), < (less than),
> (greater than), <= (less than or equal to), and >= (greater than or equal to).
Conditions: Conditions in Python are used to make decisions in code, executing specific blocks of
code based on whether a given expression evaluates to True or False.
Enumerate: In Python, "enumerate" is a built-in function that adds a counter to an iterable,
allowing you to loop through both the elements and their corresponding indices.
Exception handling: Exception handling in Python is a mechanism for gracefully managing and
responding to errors or exceptional conditions that may occur during program execution.

Explicitly: In Python, the term "explicitly" refers to performing an action or specifying something
in a clear, unambiguous, and direct manner.
For loops: For loops in Python are used for iterating over a sequence (such as a list, tuple, or
string) or other iterable objects, executing a set of statements for each item in the sequence.
Global variable: Global variables in Python are variables defined outside of any function or block
and can be accessed and modified from any part of the code.
Incremented: "Incremented" in Python means to increase the value of a variable by a specified
amount, typically done using the += operator or by adding a fixed value.
Indent: In Python, "indent" refers to the use of whitespace at the beginning of a line to signify
the structure and scope of code blocks, such as loops and functions.
Indices: In Python, "indices" refer to the position or location of elements in a sequence, like a
string, list, or tuple, starting with 0 for the first element.
Iterate: In Python, "iterate" means to repeatedly perform a set of operations or steps on each item
in a collection, such as a list, tuple, or dictionary, typically using loops or iterators.
Local variables: Local variables in Python are variables defined within a specific function or
block of code and are only accessible within that function or block.
Logic operators: Logic operators in Python are used to perform logical operations on Boolean
values, including operators like and (logical AND), or (logical OR), and not (logical NOT).

Loops: Loops in Python are constructs for repeating a block of code, enabling the execution of the
same code multiple times.
Parameters: Parameters in Python are placeholders in a function definition, used to accept and work
with values provided to the function when it is called.
Programming Fundamentals: Programming fundamentals in Python involve variables, control structures,
functions, data structures, input/output, and error handling for building software.
Range function: The range function in Python generates a sequence of numbers that can be used for
iterating in a loop and is typically used as range(start, stop, step), where it creates numbers
from start to stop-1 with the given step increment.
Scope of function: The "scope of a function" in Python refers to the region of code where a
variable defined within that function is accessible or visible.
Sequences: Sequences in Python are ordered collections of items that can include data types like
strings, lists, and tuples, allowing for indexing and iteration.
Syntax: Syntax in Python refers to the set of rules that define how code must be written and
structured so that the interpreter can understand it.
While loops: While loops in Python are used to repeatedly execute a block of code as long as a
specified condition is true.
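The "Enumerate" and "Range function" entries can be seen side by side in a short sketch. The list contents below are hypothetical sample data.

```python
fruits = ["apple", "banana", "orange"]

# enumerate() pairs each element with its index, starting at 0 by default
pairs = list(enumerate(fruits))

# range(start, stop, step) stops before reaching stop
odds = list(range(1, 11, 2))

for index, fruit in enumerate(fruits):
    print(index, fruit)
print(odds)
```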

MODULE 4: WORKING WITH DATA IN PYTHON
This module explains the basics of working with data in Python, beginning with how to read and
write files. It then introduces Python libraries that aid in data manipulation and mathematical
operations.
LEARNING OBJECTIVES
In this module you will learn to:
• Explain how Pandas uses data frames.
• Use Pandas library for data analysis.
• Read text files using Python's built-in "open" function and the "with" statement.
• Utilize NumPy to create one-dimensional and two-dimensional arrays.
• Write and save files in Python.

VIDEO 015: READING FILES WITH OPEN (3:43)
In this section, we will use Python's built-in open function to create a file object,
and obtain the data from a "txt" file.
Figure 420
We will use Python's open function to get a file object. We can apply a method to that object to
read data from the file.
Figure 421
We can open the file, Example1.txt, as follows.

We use the open function.
Figure 422

The first argument is the file path.
Figure 423
This is made up of the file name, and the file directory.
Figure 424
Figure 425
At 0:40 the voice-over says "r" for reading, but the slide shows "w" for writing.

The second parameter is the mode. Common values used include 'r' for reading, 'w' for writing,
and 'a' for appending. We will use 'r' for reading.
Figure 426
Finally, we have the file object.
Figure 427
We can now use the file object to obtain information about the file.
Figure 428

We can use the data attribute name to get the name of the file. The result is a string that contains
the name of the file. We can see what mode the object is in using the data attribute mode, and
'r' is shown representing read.
You should always close the file object using the method close. This may get tedious sometimes,
so let's use the "with" statement.
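The steps described so far can be sketched as follows. To keep the example self-contained it first writes a small sample file into a temporary folder; the file contents and the variable names path, file_name, and file_mode are assumptions made for illustration.

```python
import os
import tempfile

# Create a small sample file so the sketch is self-contained (contents are an assumption)
path = os.path.join(tempfile.mkdtemp(), "Example1.txt")
with open(path, "w") as f:
    f.write("This is line 1 \nThis is line 2\nThis is line 3")

file1 = open(path, "r")   # second argument is the mode: 'r' for reading
file_name = file1.name    # the path that was passed to open
file_mode = file1.mode    # 'r'
file1.close()             # release the file when finished

print(file_name, file_mode, file1.closed)
```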
Figure 429
Using a "with" statement to open a file is better practice because it automatically closes the file.
Figure 430
The code will run everything in the indented block, then close the file. This code reads the file
Example1.txt. We can use the file object, "File1."
Figure 431

The code will perform all operations in the indent block then close the file at the end of the
indent.
Figure 432
Figure 433
The method "read" stores the values of the file in the variable "file_stuff" as a string.
Figure 434

You can print the file content.
Figure 435
You can check whether the file is closed, but you cannot read from it outside the indented block.
Figure 436
But you can print the file content outside the indent as well.
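A minimal sketch of this behavior, assuming a small sample file created on the spot (its contents are hypothetical): the string returned by read() remains available after the block, while the file object itself is closed and can no longer be read.

```python
import os
import tempfile

# Sample file for the sketch (contents are an assumption)
path = os.path.join(tempfile.mkdtemp(), "Example1.txt")
with open(path, "w") as f:
    f.write("This is line 1 \nThis is line 2\nThis is line 3")

with open(path, "r") as file1:
    file_stuff = file1.read()          # the whole file as one string
    open_inside = not file1.closed     # still open inside the block

print(file1.closed)   # the file was closed when the block ended
print(file_stuff)     # the string is still available outside the block

try:
    file1.read()
except ValueError as err:              # reading a closed file raises ValueError
    read_error = str(err)
    print("Cannot read:", read_error)
```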
Figure 437

We can print the file content. We will see the following. When we examine the raw string, we
will see "\n" (backslash-n). This is so Python knows to start a new line.
Figure 438
We can output every line as an element in a list using the method "readlines." The first line
corresponds to the first element in the list. The second line corresponds to the second element
in the list, and so on. We can use the method "readline" to read the first line of the file.
Figure 439
If we run this command, it will store the first line in the variable "file_stuff" then print the first
line.
Figure 440

We can use the method "readline" twice. The first time it's called, it will save the first line in the
variable "file_stuff," and then print the first line. The second time it's called, it will save the
second line in the variable "file_stuff," and then print the second line.
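The two methods can be compared in a short sketch. The sample file is created inside the example, and its contents are an assumption made so the code runs on its own.

```python
import os
import tempfile

# Sample file for the sketch (contents are an assumption)
path = os.path.join(tempfile.mkdtemp(), "Example1.txt")
with open(path, "w") as f:
    f.write("This is line 1 \nThis is line 2\nThis is line 3")

# readlines() returns every line as one element of a list
with open(path, "r") as file1:
    all_lines = file1.readlines()

# readline() returns one line per call and moves on to the next line
with open(path, "r") as file1:
    first = file1.readline()
    second = file1.readline()

print(all_lines)
print(first, end="")
print(second, end="")
```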
Figure 441
We can use a loop to print out each line individually as follows.
Let's represent every character in a string as a grid. We can specify the number of characters
we would like to read from the string as an argument to the method "readline."
Figure 442
When we use a four as an argument in the method "readline," we print out the first four
characters in the file.
Figure 443
At 3:06 in the video, the output is wrongly attributed to the function readlines(). The output shown corresponds to the function read().

Each time we call the method, we will progress through the text. If we call the method with the
argument 16, the first 16 characters are printed out, and then the new line. If we call the method
a second time with the argument five, the next five characters are printed out. Finally, if we call
the method the last time with the argument nine, the next nine characters are printed out.
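A sketch of this sequence of calls, assuming a sample file whose first line is 16 characters long including the newline (the contents below are an assumption chosen to match that description):

```python
import os
import tempfile

# Sample file for the sketch (contents are an assumption)
path = os.path.join(tempfile.mkdtemp(), "Example1.txt")
with open(path, "w") as f:
    f.write("This is line 1 \nThis is line 2\nThis is line 3")

chunks = []
with open(path, "r") as file1:
    chunks.append(file1.readline(16))  # the whole first line, newline included
    chunks.append(file1.readline(5))   # the next five characters
    chunks.append(file1.readline(9))   # the next nine characters

for chunk in chunks:
    print(repr(chunk))
```

Each call picks up where the previous one stopped, and readline never reads past the end of the current line, whatever size is requested.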
Figure 444
Check out the labs for more examples of methods and other file types.

HANDS-ON LAB: READING FILES WITH OPEN
READING FILES PYTHON
Objectives

• Read text files using Python libraries
Table of Contents
• Download Data
• Reading Text Files
• A Better Way to Open a File
Download Data
## Uncomment these if working locally, else let the following code cell run.
# import urllib.request
# url = 'https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-PY0101EN-SkillsNetwork/labs/Module%204/data/example1.txt'
# filename = 'Example1.txt'
# urllib.request.urlretrieve(url, filename)
## Download Example file

# !wget -O /resources/data/Example1.txt https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-PY0101EN-SkillsNetwork/labs/Module%204/data/example1.txt
from pyodide.http import pyfetch
import pandas as pd

filename = "https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-PY0101EN-SkillsNetwork/labs/Module%204/data/example1.txt"

async def download(url, filename):
    response = await pyfetch(url)
    if response.status == 200:
        with open(filename, "wb") as f:
            f.write(await response.bytes())

await download(filename, "example1.txt")
print("done")

Reading Text Files
One way to read or write a file in Python is to use the built-in open function. The open function
provides a File object that contains the methods and attributes you need in order to read, save,
and manipulate the file. In this notebook, we will only cover .txt files. The first parameter you
need is the file path and the file name. An example is shown as follows:
The mode argument is optional and the default value is r. In this notebook we only cover two
modes:
• r: Read mode for reading files
• w: Write mode for writing files
For the next example, we will use the text file Example1.txt. The file is shown as follows:
We read the file:
# Read the Example1.txt

example1 = "example1.txt"
file1 = open(example1, "r")
We can view the attributes of the file.

The name of the file:
# Print the path of file
file1.name
'example1.txt'
The mode the file object is in:
# Print the mode of file, either 'r' or 'w'
file1.mode
'r'
We can read the file and assign it to a variable :
# Read the file
FileContent = file1.read()
FileContent
'This is line 1 \nThis is line 2\nThis is line 3'

The \n means that there is a new line. We can print the file:
# Print the file with '\n' as a new line
print(FileContent)
This is line 1
This is line 2
This is line 3
The file is of type string:
# Type of file content
type(FileContent)
str
It is very important that the file is closed in the end. This frees up resources and ensures
consistency across different python versions.
A Better Way to Open a File
Using the with statement is better practice because it automatically closes the file, even if the
code encounters an exception. The code will run everything in the indented block and then close
the file object.
# Open file using with
with open(example1, "r") as file1:
    FileContent = file1.read()
    print(FileContent)
This is line 1
This is line 2
This is line 3
The file object is closed. You can verify this by running the following cell:
# Verify if the file is closed
file1.closed
True
We can see the info in the file:
# See the content of file
print(FileContent)
This is line 1
This is line 2
This is line 3

The syntax is a little confusing because the file object comes after the as keyword. We also don’t
explicitly close the file. Therefore, we summarize the steps in a figure:
Figure 445
We don’t have to read the entire file. For example, we can read the first 4 characters by passing
4 as an argument to the method .read():
# Read first four characters
with open(example1, "r") as file1:
    print(file1.read(4))
This
Once the method .read(4) is called, the first 4 characters are read. If we call the method
again, the next 4 characters are read. The output for the following cell will demonstrate the
process for different inputs to the method read():
# Read certain amount of characters

This
is
line 1
This is line 2
The process is illustrated in the below figure, and each color represents the part of the file read
after the method read() is called:
Figure 446
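The cell that produced the output above is not reproduced in this text. The sketch below is one sequence of calls consistent with that output, assuming read sizes of 4, 4, 7, and 15 and a sample file with the contents shown earlier in this lab. The temporary path and the list name parts are hypothetical.

```python
import os
import tempfile

# Sample file for the sketch (contents taken from the FileContent string shown earlier)
path = os.path.join(tempfile.mkdtemp(), "example1.txt")
with open(path, "w") as f:
    f.write("This is line 1 \nThis is line 2\nThis is line 3")

parts = []
with open(path, "r") as file1:
    parts.append(file1.read(4))    # first 4 characters
    parts.append(file1.read(4))    # next 4 characters
    parts.append(file1.read(7))    # next 7 characters
    parts.append(file1.read(15))   # next 15 characters, which cross into line 2

for part in parts:
    print(part)
```

Unlike readline(), read() continues across line boundaries, which is why the last call returns the newline at the end of line 1 followed by the second line.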

Here is an example using the same file, but instead we read 16, 5, and then 9 characters at a
time:
# Read certain amount of characters

This is line 1
This
is line 2
We can also read one line of the file at a time using the method readline():
# Read one line
first line: This is line 1
We can also pass an argument to readline() to specify the number of characters we want to
read. However, unlike read(), readline() can only read one line at most.

with open(example1, "r") as file1:
    print(file1.readline(20))  # does not read past the end of line
    print(file1.read(20))  # Returns the next 20 chars
This is line 1
This is line 2
This
We can use a loop to iterate through each line:
# Iterate through the lines
with open(example1, "r") as file1:
    i = 0
    for line in file1:
        print("Iteration", str(i), ": ", line)
        i = i + 1
Iteration 0 : This is line 1
We can use the method readlines() to save the text file to a list:
# Read all lines and save as a list
with open(example1, "r") as file1:
    FileasList = file1.readlines()
Each element of the list corresponds to a line of text:
# Print the first line
FileasList[0]
'This is line 1 \n'

# Print the second line
FileasList[1]
'This is line 2\n'
# Print the third line
FileasList[2]
'This is line 3'


VIDEO 016: WRITING FILES WITH OPEN (2:54)
We can also write to files using the open function.
Figure 447
We will use Python's open function to get a file object to create a text file.
Figure 448
We can apply method 'write' to write data to that file. As a result, text will be written to the file.
Figure 449
We can create the file Example2.txt as follows:

(1) We use the 'open' function.
Figure 450
The first argument is the file path.
Figure 451
This is made up of the file name – if you have that file in your directory, it will be overwritten –
and the file directory.
Figure 452

Figure 453
(2) We set the mode parameter to 'w' for writing.
Figure 454
Finally, we have the file object.
Figure 455

As before, we use the 'with' statement. The code will run everything in the indent block, then
close the file.
Figure 456
(1) We create the file object, File1.
Figure 457
(2) We use the open function.
Figure 458

This creates a file Example2.txt in your directory.
Figure 459
(3) We use the method write, to write data into the file.
Figure 460
The argument is the text we would like to write into the file.
Figure 461

If we use the write method successively, each time it is called, it will write to the file.
The first time it is called, we will write, "This is line A" with a slash-n to represent a new line.
The second time we call the method, it will write, "This is line B" then it will close the file.
Figure 462
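Successive calls to write can be sketched as follows. The temporary path and the variable name written are hypothetical, and a newline is added after both lines here so the result is easy to check.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "Example2.txt")  # hypothetical location

with open(path, "w") as file1:
    file1.write("This is line A\n")   # first call writes the first line
    file1.write("This is line B\n")   # second call continues where the first stopped

# Read the file back to confirm what was written
with open(path, "r") as check:
    written = check.read()
print(written)
```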
We can write each element in a list to a file.
Figure 463
As before, we use a 'with' command and the open function to create a file.
Figure 464

The list Lines has three elements consisting of text.
Figure 465
We use a 'for' loop to read each element of the list Lines and pass it to the variable line.
Figure 466
The first iteration of the loop writes the first element of the list to the file Example2.
Figure 467

The second iteration writes the second element of the list and so on.
Figure 468
Figure 469
At the end of the loop, the file will be closed.
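The loop described above can be sketched like this. The temporary path and the name result are hypothetical; the list Lines mirrors the three text elements mentioned in the video.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "Example2.txt")  # hypothetical location
Lines = ["This is line A\n", "This is line B\n", "This is line C\n"]

with open(path, "w") as file1:
    for line in Lines:
        file1.write(line)     # each iteration writes one element of the list

# The file is closed once the with block ends; read it back to check
with open(path, "r") as check:
    result = check.readlines()
print(result)
```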

We can set the mode to append using a lowercase 'a'. This will not overwrite the file; new text
is added to the end of the existing file. (If the file does not exist, it is created.)
Figure 470

If we call the method write, it will add "This is line C" to the end of the existing file, then
close the file.
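A minimal sketch of append mode, assuming a file that already holds two lines (the path, contents, and the name contents are hypothetical):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "Example2.txt")  # hypothetical location

# Start with an existing file that holds two lines
with open(path, "w") as file1:
    file1.write("This is line A\n")
    file1.write("This is line B\n")

# Mode 'a' keeps the existing text and writes at the end of the file
with open(path, "a") as file1:
    file1.write("This is line C\n")

with open(path, "r") as check:
    contents = check.read()
print(contents)
```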
Figure 471
We can copy one file to a new file as follows.
Figure 472
First, we read the file Example1 and interact with it via the file object, readfile.
Figure 473
Then we create a new file Example3 and use the file object writefile to interact with it.
The for loop takes a line from the file object readfile and stores it in the file Example3 using the
file object writefile.
Figure 474
Figure 475
The first iteration copies the first line.
Figure 476

The second iteration copies the second line, and so on until the end of the file is reached. Then
both files are closed.
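The copy can be sketched as below. The source file is created inside the example so the code runs on its own; the folder, the file contents, and the name copied are assumptions.

```python
import os
import tempfile

folder = tempfile.mkdtemp()
source = os.path.join(folder, "Example1.txt")   # hypothetical sample file
target = os.path.join(folder, "Example3.txt")
with open(source, "w") as f:
    f.write("This is line 1 \nThis is line 2\nThis is line 3")

with open(source, "r") as readfile:
    with open(target, "w") as writefile:
        for line in readfile:          # take one line at a time from the source
            writefile.write(line)      # and store it in the new file

# Both files are closed here; compare their contents
with open(target, "r") as check:
    copied = check.read()
print(copied)
```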
Figure 477
Figure 478
Check out the labs for more examples.

HANDS-ON LAB: WRITING FILES WITH OPEN
WRITE AND SAVE FILES IN PYTHON
OBJECTIVES
• Write to files using Python libraries
TABLE OF CONTENTS
• Writing Files
• Appending Files
• Additional File modes
• Copy a File
WRITING FILES
We can open a file object and use the method write() to save text to a file. To write to a file,
the mode argument must be set to w. Let’s write a file Example2.txt with the line: “This is
line A”
# Write line to file

exmp2 = 'Example2.txt'
with open(exmp2, 'w') as writefile:
    writefile.write("This is line A")
We can read the file to see if it worked:
# Read file
with open(exmp2, 'r') as testwritefile:
    print(testwritefile.read())
This is line A
We can write multiple lines:
# Write lines to file
with open(exmp2, 'w') as writefile:
    writefile.write("This is line A\n")
    writefile.write("This is line B\n")
The method .write() works similarly to the method .readline(), except that instead of reading a new
line it writes a new line. The process is illustrated in the figure. The different color coding of
the grid represents a new line added to the file after each method call.
Figure 479

You can check the file to see if your results are correct:
# Check whether write to file
with open(exmp2, 'r') as testwritefile:
    print(testwritefile.read())
This is line A
This is line B
We write a list to a .txt file as follows:
# Sample list of text
Lines = ["This is line A\n", "This is line B\n", "This is line C\n"]
Lines
['This is line A\n', 'This is line B\n', 'This is line C\n']
# Write the strings in the list to text file
with open('Example2.txt', 'w') as writefile:
    for line in Lines:
        print(line)
        writefile.write(line)
This is line A
This is line B
This is line C
We can verify the file is written by reading it and printing out the values:
# Verify if writing to file is successfully executed
with open('Example2.txt', 'r') as testwritefile:
    print(testwritefile.read())
However, note that setting the mode to w overwrites all the existing data in the file.
with open('Example2.txt', 'w') as writefile:
    writefile.write("Overwrite\n")
Overwrite
ne A
This is line B
This is line C
APPENDING FILES
We can write to files without losing any of the existing data by setting the mode argument to
append: a. You can append new lines as follows:
# Write a new line to text file
with open('Example2.txt', 'a') as testwritefile:
    testwritefile.write("This is line C\n")
    testwritefile.write("This is line D\n")
    testwritefile.write("This is line E\n")

You can verify the file has changed by running the following cell:
# Verify if the new line is in the text file
with open('Example2.txt', 'r') as testwritefile:
    print(testwritefile.read())
Overwrite
ne A
This is line B
This is line C
This is line C
This is line D
This is line E
ADDITIONAL MODES
It's fairly inefficient to open the file in a or w and then reopen it in r to read any lines.
Luckily, we can access the file in the following modes:
• r+ : Reading and writing. Does not truncate the file on opening.
• w+ : Writing and reading. Truncates the file.
• a+ : Appending and reading. Creates a new file if none exists.
You don't have to dwell on the specifics of each mode for this lab.
Let's try out the a+ mode:
with open('Example2.txt', 'a+') as testwritefile:
    testwritefile.write("This is line E\n")
    print(testwritefile.read())
There were no errors but read() also did not output anything. This is because of our location in
the file.
Most of the file methods we've looked at work in a certain location in the file. .write() writes at
a certain location in the file. .read() reads at a certain location in the file and so on. You can think
of this as moving your pointer around in the notepad to make changes at specific location.
Opening the file in w is akin to opening the .txt file, moving your cursor to the beginning of the
text file, writing new text and deleting everything that follows. Whereas opening the file in a is
similar to opening the .txt file, moving your cursor to the very end and then adding the new
pieces of text.
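The difference between w and a can be checked with a short sketch. It assumes a throwaway file called demo.txt (a hypothetical name) inside a temporary directory, so no existing file is touched:

```python
import os
import tempfile

# Work in a temporary directory so that nothing on disk is overwritten.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.txt")  # hypothetical file name

    # 'w' creates the file (or empties an existing one) before writing.
    with open(path, "w") as f:
        f.write("This is line A\n")
        f.write("This is line B\n")

    # Opening in 'w' a second time discards both lines written above.
    with open(path, "w") as f:
        f.write("Overwrite\n")
    with open(path, "r") as f:
        after_w = f.read()

    # 'a' keeps the existing content and writes at the end of the file.
    with open(path, "a") as f:
        f.write("This is line C\n")
    with open(path, "r") as f:
        after_a = f.read()

print(repr(after_w))
print(repr(after_a))
```

After the second w, only the new text remains in the file; after a, the new line follows the existing content.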
It is often very useful to know where the 'cursor' is in a file and be able to control it. The following
methods allow us to do precisely this -
• .tell() - returns the current position in bytes

• .seek(offset,from) - changes the position by 'offset' bytes with respect to 'from'. From
can take the value of 0,1,2 corresponding to beginning, relative to current position and
end
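A minimal sketch of .tell() and .seek() with all three values of 'from'. It assumes a throwaway file named cursor_demo.bin (a hypothetical name) and opens it in binary mode, because for files opened in text mode Python only supports seeking relative to the beginning (apart from jumping to the very end with seek(0, 2)):

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "cursor_demo.bin")  # hypothetical file name

    # Write ten known bytes so that positions are easy to follow.
    with open(path, "wb") as f:
        f.write(b"0123456789")

    with open(path, "rb") as f:
        start = f.tell()          # position right after opening
        f.seek(4, 0)              # 4 bytes from the beginning
        from_start = f.read(2)    # reads two bytes, cursor moves to 6
        f.seek(2, 1)              # 2 bytes forward from the current position
        from_current = f.read(1)  # reads one byte, cursor moves to 9
        f.seek(-3, 2)             # 3 bytes before the end
        from_end = f.read()       # reads the rest of the file
        end = f.tell()            # position after reading to the end

print(start, from_start, from_current, from_end, end)
```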

Now let's revisit a+:
with open('Example2.txt', 'a+') as testwritefile:
    print("Initial Location: {}".format(testwritefile.tell()))
    data = testwritefile.read()
    if (not data): #empty strings return false in python
        print('Read nothing')
    else:
        print(data)
    testwritefile.seek(0,0) # move 0 bytes from beginning.
    print("\nNew Location : {}".format(testwritefile.tell()))
    data = testwritefile.read()
    if (not data):
        print('Read nothing')
    else:
        print(data)
    print("Location after read: {}".format(testwritefile.tell()) )

Initial Location: 150
Read nothing
New Location : 0
Overwrite
ne A
This is line B
This is line C
This is line C
This is line D
This is line E
This is line C
This is line D
This is line E
This is line E
Location after read: 150
Finally, a note on the difference between w+ and r+. Both of these modes allow access to read
and write methods, however, opening a file in w+ overwrites it and deletes all pre-existing data.
To work with a file on existing data, use r+ and a+. While using r+, it can be useful to add a
.truncate() method at the end of your data. This will reduce the file to your data and delete
everything that follows.
In the following code block, run the code as it is first and then run it with the .truncate().
with open('Example2.txt', 'r+') as testwritefile:
    data = testwritefile.readlines()
    testwritefile.seek(0,0) #write at beginning of file
    testwritefile.write("Line 1" + "\n")
    testwritefile.write("Line 2" + "\n")
    testwritefile.write("Line 3" + "\n")
    testwritefile.write("finished\n")
    #Uncomment the line below
    #testwritefile.truncate()
    testwritefile.seek(0,0)
    print(testwritefile.read())
Line 1
Line 2
Line 3
finished
This is line C
This is line C
This is line D
This is line E
This is line C
This is line D

This is line E
This is line E
COPY A FILE
Let's copy the file Example2.txt to the file Example3.txt:
# Copy file to another
with open('Example2.txt','r') as readfile:
    with open('Example3.txt','w') as writefile:
        for line in readfile:
            writefile.write(line)
We can read the file to see if everything works:
# Verify if the copy is successfully executed
with open('Example3.txt','r') as testwritefile:
    print(testwritefile.read())
Line 1
Line 2
Line 3
finished
This is line C
This is line C
This is line D
This is line E
This is line C
This is line D
This is line E
This is line E
After reading files, we can also write data into files and save them in different file formats like
.txt, .csv, .xls (for Excel files), etc. You will come across these in further examples.
NOTE: If you wish to open and view the Example3.txt file, download this lab and
run it locally on your machine. Then go to the working directory to ensure the
Example3.txt file exists and contains the data that we wrote.

EXERCISE
Your local university's Raptors fan club maintains a register of its active members on a .txt
document. Every month they update the file by removing the members who are not active. You
have been tasked with automating this with your Python skills.
Given the file currentMem, remove each member with a 'no' in their Active column. Keep track
of each of the removed members and append them to the exMem file. Make sure that the format
of the original files is preserved. (Hint: Do this by reading/writing whole lines and ensuring the
header remains.)
Run the code block below prior to starting the exercise. The skeleton code has been provided
for you. Edit only the cleanFiles function.
#Run this prior to starting the exercise
from random import randint as rnd

memReg = 'members.txt'
exReg = 'inactive.txt'
fee =('yes','no')

def genFiles(current,old):
    with open(current,'w+') as writefile:
        writefile.write('Membership No Date Joined Active \n')
        data = "{:^13} {:<11} {:<6}\n"
        for rowno in range(20):
            date = str(rnd(2015,2020))+ '-' + str(rnd(1,12))+'-'+str(rnd(1,25))
            writefile.write(data.format(rnd(10000,99999),date,fee[rnd(0,1)]))
    with open(old,'w+') as writefile:
        writefile.write('Membership No Date Joined Active \n')
        data = "{:^13} {:<11} {:<6}\n"
        for rowno in range(3):
            date = str(rnd(2015,2020))+ '-' + str(rnd(1,12))+'-'+str(rnd(1,25))
            writefile.write(data.format(rnd(10000,99999),date,fee[1]))

genFiles(memReg,exReg)
Now that you've run the prerequisite code cell above, which prepared the files for this exercise,
you are ready to move on to the implementation.
Exercise: Implement the cleanFiles function in the code cell below.
'''
The two arguments for this function are the files:
    - currentMem: File containing list of current members
    - exMem: File containing list of old members
This function should remove all rows from currentMem containing 'no'
in the 'Active' column and append them to exMem.
'''
def cleanFiles(currentMem, exMem):
    # TODO: Open the currentMem file in r+ mode
    # TODO: Open the exMem file in a+ mode
    # TODO: Read each member in the currentMem (1 member per row) file into a list.
    # Hint: Recall that the first line in the file is the header.
    # TODO: Iterate through the members and create a new list of the inactive members
    # Go to the beginning of the currentMem file
    # TODO: Iterate through the members list.
    # If a member is inactive, add them to exMem, otherwise write them into currentMem
    pass # Remove this line when done implementation
# The code below is to help you view the files.
# Do not modify this code for this exercise.
cleanFiles(memReg,exReg)
headers = "Membership No Date Joined Active \n"
with open(memReg,'r') as readFile:
    print("Active Members: \n\n")
    print(readFile.read())
with open(exReg,'r') as readFile:
    print("Inactive Members: \n\n")
    print(readFile.read())
The code cell below is to verify your solution. Please do not modify the code and run it to test
your implementation of cleanFiles.
def testMsg(passed):
    if passed:
        return 'Test Passed'
    else :
        return 'Test Failed'

testWrite = "testWrite.txt"
testAppend = "testAppend.txt"
passed = True

genFiles(testWrite,testAppend)

with open(testWrite,'r') as file:
    ogWrite = file.readlines()
with open(testAppend,'r') as file:
    ogAppend = file.readlines()

try:
    cleanFiles(testWrite,testAppend)
except:
    print('Error')

with open(testWrite,'r') as file:
    clWrite = file.readlines()
with open(testAppend,'r') as file:
    clAppend = file.readlines()

# checking if total no of rows is same, including headers
if (len(ogWrite) + len(ogAppend) != len(clWrite) + len(clAppend)):
    print("The number of rows do not add up. Make sure your final files have the same header and format.")
    passed = False

for line in clWrite:
    if 'no' in line:
        passed = False
        print("Inactive members in file")
        break
    else:
        if line not in ogWrite:
            print("Data in file does not match original file")
            passed = False
print ("{}".format(testMsg(passed)))
Implementation of cleanFiles:
def cleanFiles(currentMem,exMem):
    with open(currentMem,'r+') as writeFile:
        with open(exMem,'a+') as appendFile:
            #get the data
            writeFile.seek(0)
            members = writeFile.readlines()
            #remove header
            header = members[0]
            members.pop(0)
            inactive = [member for member in members if ('no' in member)]
            '''
            The above is the same as
            for member in members:
                if 'no' in member:
                    inactive.append(member)
            '''
            #go to the beginning of the write file
            writeFile.seek(0)
            writeFile.write(header)
            for member in members:
                if (member in inactive):
                    appendFile.write(member)
                else:
                    writeFile.write(member)
            writeFile.truncate()

cleanFiles(memReg,exReg)
# code to help you see the files
headers = "Membership No Date Joined Active \n"
with open(memReg,'r') as readFile:
    print("Active Members: \n\n")
    print(readFile.read())
with open(exReg,'r') as readFile:
    print("Inactive Members: \n\n")
    print(readFile.read())
The last exercise!
Congratulations, you have completed this lesson and hands-on lab in Python.

PRACTICE QUIZ: READING AND WRITING FILES WITH OPEN
Question 1
What are the most common modes used when opening a file?
o (a)ppend, (c)lose, (w)rite
o (s)ave, (r)ead, (w)rite
o (a)ppend, (r)edline, (w)rite
o (a)ppend, (r)ead, (w)rite
Correct: (a)ppend, (r)ead, (w)rite are the three modes of operation.
Question 2
Which data attribute retrieves the file's title?
o file1.open()
o file1.close()
o file1.mode
o file1.name
Correct: The name attribute returns the filename.
Question 3
Which command instructs Python to initiate a new line?
o \q
o \b
o \e
o \n
Correct: In Python, \n instructs the code to begin a new line.
Question 4
Which attribute facilitates the input of data into a file?
o file1.close()
o file1.read()
o file1.open()
o file1.write()
Correct: The “write” method writes data into a file.

VIDEO 017: PANDAS: LOADING DATA (4:51)
Dependencies or libraries are prewritten code to help solve problems. In the video, we will
introduce pandas, a popular library for data analysis.
Figure 480
IMPORTING PANDAS
You can import the library or dependency like pandas using the following command.
Start with the import command followed by the name of the library. You now have access to a
large number of pre-built classes and functions. This assumes the library is installed.
Figure 481
In our lab environment, all the necessary libraries are installed. Let us say we would like to load
a CSV file using the pandas built-in function read_csv. A CSV is a typical file type used to store
data. Simply type the word pandas, then a dot and the name of the function with all the inputs.
Figure 482

Typing pandas every time is tedious, so you can use the as keyword to shorten the name of the
library. In this case, use the standard abbreviation pd. Now type pd, then a dot followed by the
name of the function you would like to use. In this case, read_csv.
Figure 483
You are not limited to the abbreviation pd. In this case, we use the term banana. However, we
will stick with pd for the rest of the video. Let us examine this code in more detail.
Figure 484
DATAFRAMES
One way pandas allows you to work with data is in a dataframe. Let us review the process of
going from a CSV file to a dataframe.
Figure 485

This variable stores the path of the CSV.
Figure 486
It is used as an argument to the read_csv function.
Figure 487
The result is stored to the variable df. This is short for dataframe. Now that you have the data
in a dataframe, you can work with it.
Figure 488

You can use the method head to examine the first five rows of a dataframe.
Figure 489
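The steps above (path, read_csv, df, head) can be sketched as follows. To keep the example self-contained, it assumes a small CSV held in a string (csv_text, with illustrative rows) instead of a file on disk; read_csv accepts a file-like object as well as a path:

```python
import io

import pandas as pd

# Illustrative CSV content standing in for a file on disk (an assumption).
csv_text = """Artist,Album,Released
Michael Jackson,Thriller,1982
AC/DC,Back in Black,1980
Pink Floyd,The Dark Side of the Moon,1973
"""

# read_csv accepts a path, a URL, or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# head() shows the first rows of the dataframe (five by default).
print(df.head())
print(df.shape)
```

With a real file, the io.StringIO(...) argument would simply be replaced by the path of the CSV.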
The process for loading an Excel file is similar. Use the path of the Excel file, the function
read_excel.
Figure 490
Figure 491

The result is a dataframe. A dataframe is comprised of rows and columns.
Figure 492
You can create a dataframe out of a dictionary.
Figure 493
The keys correspond to the column labels.
Figure 494

The values are lists; each list holds the entries of one column, one entry per row.
Figure 495
You can then cast the dictionary to a dataframe using the function DataFrame.
Figure 496
Notice the direct correspondence between the dictionary and the table.
Figure 497

The keys correspond to the table headers.
Figure 498
The values are lists; each list holds the entries of one column, one entry per row.
Figure 499
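As a minimal sketch of the dictionary-to-dataframe step, assuming a small illustrative dictionary named songs (not the one shown in the figures):

```python
import pandas as pd

# Keys become the column labels; each list holds that column's values.
songs = {
    "Album": ["Thriller", "Back in Black", "The Dark Side of the Moon"],
    "Released": [1982, 1980, 1973],
}

# Cast the dictionary to a dataframe.
songs_frame = pd.DataFrame(songs)
print(songs_frame)
```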
USING DATAFRAMES
Create a new dataframe consisting of one column.
Figure 500

Enclose the dataframe name, in this case df, and the name of the column header in double
brackets.
Figure 501
The result is a new dataframe comprised of the original column.
Figure 502
You can do the same thing for multiple columns.
Figure 503

Enclose the dataframe name, in this case df, and the name of multiple column headers in double
brackets.
Figure 504
The result is a new dataframe comprised of the specified columns.
Figure 505
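A short sketch of column selection with double brackets, assuming a small illustrative dataframe df. For comparison, it also shows single brackets, which return a Series rather than a dataframe:

```python
import pandas as pd

# Illustrative data (an assumption, not the full course dataset).
df = pd.DataFrame({
    "Artist": ["Michael Jackson", "AC/DC", "Pink Floyd"],
    "Album": ["Thriller", "Back in Black", "The Dark Side of the Moon"],
    "Released": [1982, 1980, 1973],
})

one_column = df[["Artist"]]               # dataframe with one column
two_columns = df[["Artist", "Released"]]  # dataframe with two columns
as_series = df["Artist"]                  # single brackets give a Series

print(type(one_column))
print(type(as_series))
print(two_columns)
```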
ILOC
One way to access individual elements is with the iloc indexer, which is used with square
brackets. The first input is an integer representing the row position, and the second is an
integer representing the column position. Both start at 0.
Figure 506

You can access the first row and first column as follows.
Figure 507
You can access the second row and first column as follows.
Figure 508
You can access the first row, third column as follows
Figure 509

and you can access the second row, third column as follows.
Figure 510
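The four accesses above can be sketched as follows, assuming an illustrative dataframe df whose columns are ordered Artist, Album, Released:

```python
import pandas as pd

# Illustrative data (an assumption, not the full course dataset).
df = pd.DataFrame({
    "Artist": ["Michael Jackson", "AC/DC", "Pink Floyd", "Whitney Houston"],
    "Album": ["Thriller", "Back in Black", "The Dark Side of the Moon", "The Bodyguard"],
    "Released": [1982, 1980, 1973, 1992],
})

# iloc takes integer positions: [row, column], both counted from 0.
first_row_first_col = df.iloc[0, 0]
second_row_first_col = df.iloc[1, 0]
first_row_third_col = df.iloc[0, 2]
second_row_third_col = df.iloc[1, 2]

print(first_row_first_col, second_row_first_col)
print(first_row_third_col, second_row_third_col)
```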
You can use the name of the row index and the column as well.
Figure 511
LOC
You can access the first row of the column named artist as follows.
Similarly, you can access the second row of the column named artist.
Figure 512

You can do the same for the column released.
Figure 513
WORKING WITH DATAFRAMES

Loc can also be used if the index is not an integer. We create a new dataframe called df_new.
Figure 514
We replace the index 0, 1, 2 and so on with a, b, c.
Figure 515

You can access the index a, that is the first row of the column named artist as follows.
Figure 516
Similarly, you can access the index b or the second row of the column named artist.
Figure 517
You can do the same for the column released.
Figure 518
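A minimal sketch of loc with a non-integer index, assuming an illustrative dataframe. Here df_new is made with .copy() so that the original df keeps its integer index; that is a choice made for this sketch, not something the video prescribes:

```python
import pandas as pd

# Illustrative data (an assumption, not the full course dataset).
df = pd.DataFrame({
    "Artist": ["Michael Jackson", "AC/DC", "Pink Floyd", "Whitney Houston"],
    "Released": [1982, 1980, 1973, 1992],
})

# Copy the dataframe and replace the default index 0, 1, 2, 3 with letters.
df_new = df.copy()
df_new.index = ["a", "b", "c", "d"]

# loc takes labels: [row label, column label].
first_artist = df_new.loc["a", "Artist"]
second_artist = df_new.loc["b", "Artist"]
first_released = df_new.loc["a", "Released"]

print(first_artist, second_artist, first_released)
```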

You can also slice dataframes and assign the values to a new dataframe.
Figure 519
Assign the first two rows and the first three columns to the variable z.
Figure 520
The result is a dataframe comprised of the selected rows and columns.
Figure 521

SLICE
You can also slice dataframes and assign the values to a new dataframe using loc.
Figure 522
The code assigns the first 3 rows and all columns in between the columns named artist and
released.
Figure 523
The result is a new dataframe, z, with the corresponding values.
Figure 524
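The two kinds of slicing can be compared in a short sketch, assuming an illustrative dataframe df with the default integer index. Note that the iloc slice stops before its end position, while the loc slice includes its end label:

```python
import pandas as pd

# Illustrative data (an assumption, not the full course dataset).
df = pd.DataFrame({
    "Artist": ["Michael Jackson", "AC/DC", "Pink Floyd", "Whitney Houston"],
    "Album": ["Thriller", "Back in Black", "The Dark Side of the Moon", "The Bodyguard"],
    "Released": [1982, 1980, 1973, 1992],
})

# Positions 0 and 1 for rows, 0 to 2 for columns (end position excluded).
z_iloc = df.iloc[0:2, 0:3]

# Labels 0 through 2 for rows, Artist through Released for columns (end label included).
z_loc = df.loc[0:2, "Artist":"Released"]

print(z_iloc)
print(z_loc)
```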
Check out the labs for more examples.

VIDEO 018: PANDAS: WORKING WITH AND SAVING DATA (2:06)
When we have a data frame we can work with the data and save the results in other formats.
Figure 525
LIST UNIQUE VALUES

Consider the stack of 13 blocks of different colors.
Figure 526
We can see there are 3 unique colors. Let's say you would like to find out how many unique
elements are in a column of a data frame. This may be much more difficult because instead of
13 elements, you may have millions. Pandas has the method unique to determine the unique
elements in a column of a data frame.
Figure 527

EXTRACTING
Lets say we would like to determine the unique year of the albums in the data set. We enter the
name of the data frame, then enter the name of the column Released within brackets.
Figure 528
Then we apply the method unique.
Figure 529
The result is all of the unique elements in the column Released.
Figure 530
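A minimal sketch of unique, assuming an illustrative Released column in which some years repeat:

```python
import pandas as pd

# Illustrative release years with repeats (an assumption for this sketch).
df = pd.DataFrame({
    "Released": [1982, 1980, 1973, 1992, 1977, 1976, 1977, 1977],
})

# unique() returns the distinct values in order of first appearance.
unique_years = df["Released"].unique()

print(unique_years)
print(len(unique_years))
```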

CREATING A DATABASE
Let's say we would like to create a new data frame consisting of albums from the 1980s and after.
Figure 531
We can look at the column Released for albums made after 1979,
Figure 532
then select the corresponding rows. We can accomplish this within one line of code in Pandas.
But let's break up the steps.
Figure 533

APPLYING INEQUALITY OPERATORS
We can use the inequality operators for the entire data frame in Pandas. The result is a series of
Boolean values.
Figure 534
For our case, we simply specify the column Released and the inequality for the albums after
1979.
Figure 535
The result is a series of Boolean values.
Figure 536
The result is true when the condition is true and false otherwise.
Figure 537
We can select the matching rows in one line.
Figure 538
We simply write the data frame's name and, inside square brackets, place the previously mentioned
inequality. We assign the result to the variable df1.
Figure 539

We now have a new data frame, where each album was released after 1979.
Figure 540
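The two steps (build the Boolean series, then select the rows) can be sketched as follows, assuming an illustrative dataframe df:

```python
import pandas as pd

# Illustrative data (an assumption, not the full course dataset).
df = pd.DataFrame({
    "Artist": ["Michael Jackson", "AC/DC", "Pink Floyd", "Whitney Houston"],
    "Released": [1982, 1980, 1973, 1992],
})

# Step 1: the inequality yields a series of Boolean values, one per row.
released_after_1979 = df["Released"] >= 1980

# Step 2: placing that series in square brackets keeps only the True rows.
df1 = df[released_after_1979]

print(released_after_1979)
print(df1)
```

Both steps can also be written on one line as df[df["Released"] >= 1980].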
SAVE AS CSV
We can save the new data frame using the method to_csv.
Figure 541
The argument is the name of the csv file. Make sure you include a .csv extension.
Figure 542
There are other functions to save the data frame in other formats.
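A minimal sketch of saving and reloading, assuming an illustrative dataframe df1 and a hypothetical file name new_songs.csv inside a temporary directory. The index=False argument is a choice made here so that the row index is not written as an extra column:

```python
import os
import tempfile

import pandas as pd

# Illustrative data (an assumption for this sketch).
df1 = pd.DataFrame({
    "Artist": ["Michael Jackson", "AC/DC", "Whitney Houston"],
    "Released": [1982, 1980, 1992],
})

with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = os.path.join(tmp_dir, "new_songs.csv")  # hypothetical file name

    # Save the dataframe; note the .csv extension in the file name.
    df1.to_csv(out_path, index=False)

    # Read the file back to confirm what was written.
    round_trip = pd.read_csv(out_path)

print(round_trip)
```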

PRACTICE LAB: SELECTING DATA IN A DATAFRAME
OBJECTIVES
• Use Pandas Library to create DataFrame and Series
• Locate data in the DataFrame using loc() and iloc() functions
• Use slicing
Exercise 1: Pandas: DataFrame and Series
Pandas is a popular library for data analysis built on top of the Python programming language.
Pandas provides two main data structures for manipulating data:
• DataFrame
• Series
A DataFrame is a two-dimensional data structure, i.e., data is aligned in a tabular fashion in rows
and columns.
• A Pandas DataFrame will be created by loading the datasets from existing storage.
• Storage can be SQL Database, CSV file, an Excel file, etc.
• It can also be created from the lists, dictionary, and from a list of dictionaries.
Series represents a one-dimensional array of indexed data. It has two main components :
1. An array of actual data.
2. An associated array of indexes or data labels.
The index is used to access individual data values. You can also get a column of a dataframe as
a Series. You can think of a Pandas series as a 1-D dataframe.
# let us import the Pandas Library

import pandas as pd
Once you’ve imported pandas, you can then use the functions built in it to create and analyze
data.
In this practice lab, we will learn how to create a DataFrame out of a dictionary.
Let us consider a dictionary 'x' with keys and values as shown below.
We then create a dataframe from the dictionary using the function pd.DataFrame(dict)
#Define a dictionary 'x'
x = {'Name': ['Rose','John', 'Jane', 'Mary'], 'ID': [1, 2, 3, 4],
     'Department': ['Architect Group', 'Software Group', 'Design Team', 'Infrastructure'],
     'Salary':[100000, 80000, 50000, 60000]}
#casting the dictionary to a DataFrame
df = pd.DataFrame(x)
#display the result df
df
   Name  ID       Department  Salary
0  Rose   1  Architect Group  100000
1  John   2   Software Group   80000
2  Jane   3      Design Team   50000
3  Mary   4   Infrastructure   60000
We can see the direct correspondence between the dictionary and the table. The keys correspond to
the column labels, and each list of values holds the entries of that column, one per row.
Column Selection:
To select a column in a Pandas DataFrame, we can access it by its column name.
Let's retrieve the data present in the ID column.
#Retrieving the "ID" column and assigning it to a variable x

x = df[['ID']]
x
ID
0 1
1 2
2 3
3 4
Let's use the type() function and check the type of the variable.
#check the type of x

type(x)
pandas.core.frame.DataFrame
The output shows us that the type of the variable is a DataFrame object.
Access to multiple columns

Let us retrieve the data for Department, Salary and ID columns
#Retrieving the Department, Salary and ID columns and assigning it to a variable z
z = df[['Department','Salary','ID']]
z
Department Salary ID
0 Architect Group 100000 1
1 Software Group 80000 2
2 Design Team 50000 3
3 Infrastructure 60000 4

Try it yourself
Problem 1: Create a dataframe to display the result as below:
Figure 543
#write your code here

a = {'Student':['David', 'Samuel', 'Terry', 'Evan'],
'Age':['27', '24', '22', '32'],
'Country':['UK', 'Canada', 'China', 'USA'],
'Course':['Python','Data Structures','Machine Learning','Web Development'],
'Marks':['85','72','89','76']}
df1 = pd.DataFrame(a)
df1
  Student Age Country            Course Marks
0   David  27      UK            Python    85
1  Samuel  24  Canada   Data Structures    72
2   Terry  22   China  Machine Learning    89
3    Evan  32     USA   Web Development    76
Problem 2: Retrieve the Marks column and assign it to a variable b

b = df1[['Marks']]
b
Marks
0 85
1 72
2 89
3 76
Problem 3: Retrieve the Country and Course columns and assign it to a variable c

c = df1[['Country','Course']]
c
  Country            Course
0      UK            Python
1  Canada   Data Structures
2   China  Machine Learning
3     USA   Web Development

To view the column as a series, just use one bracket:
# Get the Name column as a series Object
x = df['Name']
x
0 Rose
1 John
2 Jane
3 Mary
Name: Name, dtype: object
#check the type of x

type(x)
pandas.core.series.Series
The output shows us that the type of the variable is a Series object.
Exercise 2: loc() and iloc() functions
loc is a label-based data selection indexer, which means that we have to pass the name of the
row or column that we want to select. Although this lab writes it as loc(), it is used with square
brackets. A loc slice includes the last element of the range passed to it.
Simple syntax for your understanding:
loc[row_label, column_label]
iloc is an integer-position-based selection indexer, which means that we have to pass integer
positions to select a specific row/column. It is also used with square brackets. An iloc slice does
not include the last element of the range passed to it.
Simple syntax for your understanding:
iloc[row_index, column_index]
Examples
Let us see some examples on the same.
# Access the value on the first row and the first column
df.iloc[0, 0]
'Rose'
# Access the value on the first row and the third column
df.iloc[0,2]
'Architect Group'
# Access the column using the name

df.loc[0, 'Salary']
100000

Let us create a new dataframe called 'df2' and assign 'df' to it. Now, let us set the "Name" column
as an index column using the method set_index().
df2=df
df2=df2.set_index("Name")
#To display the first 5 rows of new dataframe
df2.head()
      ID       Department  Salary
Name
Rose   1  Architect Group  100000
John   2   Software Group   80000
Jane   3      Design Team   50000
Mary   4   Infrastructure   60000
#Now, let us access the column using the name

df2.loc['Jane', 'Salary']
50000
Try it yourself
Use the loc() function to get the Department of Jane in the newly created dataframe df2.

df2.loc['Jane', 'Department']
'Design Team'
Use the iloc() function to get the Salary of Mary in the newly created dataframe df2.

df2.iloc[3,2]
60000
Exercise 3: Slicing
Slicing uses the [] operator to select a set of rows and/or columns from a DataFrame.
To slice out a set of rows, you use this syntax: data[start:stop],
here the start represents the index from where to consider, and stop represents the index one
step BEYOND the row you want to select. You can perform slicing using both the index and the
name of the column.
NOTE: When slicing in pandas, the start bound is included in the output.
So if you want to select rows 0, 1, and 2 your code would look like this: df.iloc[0:3].
It means you are telling Python to start at index 0 and select rows 0, 1, 2 up to but not including
3.
NOTE: Labels must be found in the DataFrame or you will get a KeyError.
Indexing by labels(i.e. using loc()) differs from indexing by integers (i.e. using iloc()). With loc(),
both the start bound and the stop bound are inclusive. When using loc(), integers can be used,
but the integers refer to the index label and not the position.

For example, using loc() and select 1:4 will get a different result than using iloc() to select rows
1:4.
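The difference between the two can be seen in a short sketch. It assumes a small illustrative dataframe named staff whose index labels are the integers 1 to 5 rather than the default 0 to 4, so that labels and positions do not coincide (the fifth name is made up for this example):

```python
import pandas as pd

# Index labels start at 1, so label 1 sits at position 0.
staff = pd.DataFrame(
    {"Name": ["Rose", "John", "Jane", "Mary", "Alex"]},
    index=[1, 2, 3, 4, 5],
)

# loc slices by label and includes both bounds: labels 1, 2, 3, 4.
by_label = staff.loc[1:4]

# iloc slices by position and excludes the stop: positions 1, 2, 3.
by_position = staff.iloc[1:4]

# Asking loc for a single label that does not exist raises a KeyError.
try:
    staff.loc[99, "Name"]
    missing_label_raises = False
except KeyError:
    missing_label_raises = True

print(by_label)
print(by_position)
print(missing_label_raises)
```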
We can also select a specific data value using a row and column location within the
DataFrame and iloc indexing.
# let us do the slicing using old dataframe df
df.iloc[0:2, 0:3]
Name ID Department
0 Rose 1 Architect Group
1 John 2 Software Group
#let us do the slicing using loc() function on old dataframe df where index column is
#having labels as 0,1,2
df.loc[0:2,'ID':'Department']
ID Department
0 1 Architect Group
1 2 Software Group
2 3 Design Team
#let us do the slicing using loc() function on new dataframe df2 where index column
#is Name having labels: Rose, John and Jane
df2.loc['Rose':'Jane', 'ID':'Department']
      ID       Department
Name
Rose   1  Architect Group
John   2   Software Group
Jane   3      Design Team
Try it yourself
Using the loc() function, do slicing on the old dataframe df to retrieve the Name, ID and
Department of the rows having index labels 2 and 3.

df.loc[2:3,'Name':'Department']
   Name  ID      Department
2  Jane   3     Design Team
3  Mary   4  Infrastructure
Congratulations, you have completed this lesson and the practice lab on Pandas.

HANDS ON LAB: PANDAS
INTRODUCTION TO PANDAS IN PYTHON
OBJECTIVES
• Use Pandas to access and view data
TABLE OF CONTENTS
• Introduction of Pandas
• Viewing Data and Accessing Data
• Quiz on DataFrame
ABOUT THE DATASET

The table has one row for each album and several columns.
• artist: Name of the artist
• album: Name of the album
• released_year: Year the album was released
• length_min_sec: Length of the album (hours,minutes,seconds)
• genre: Genre of the album
• music_recording_sales_millions: Music recording sales (millions in USD) on
[SONG://DATABASE]
• claimed_sales_millions: Album's claimed sales (millions in USD) on
[SONG://DATABASE]
• date_released: Date on which the album was released
• soundtrack: Indicates if the album is the movie soundtrack (Y) or (N)
• rating_of_friends: Indicates the rating from your friends from 1 to 10
You can see the dataset here:
Figure 544

INTRODUCTION OF PANDAS
# Dependencies needed to read Excel files
!pip install xlrd
!pip install openpyxl
# Import required library
import pandas as pd
After the import command, we now have access to a large number of pre-built classes and
functions. This assumes the library is installed; in our lab environment all the necessary libraries
are installed. One way pandas allows you to work with data is with a dataframe. Let's go through
the process of going from a comma-separated values (.csv) file to a dataframe. The
variable csv_path stores the path of the .csv, which is used as an argument to
the read_csv function. The result is stored in the object df, a common short form used for
a variable referring to a Pandas dataframe.
# Read data from CSV file
csv_path = 'https://cf-courses-data.s3.us.cloud-object-SkillsNetwork/labs/Module%204/data/TopSellingAlbums.csv'
df = pd.read_csv(csv_path)
We can use the method head() to examine the first five rows of a dataframe:
# Print first five rows of the dataframe
df.head()
Figure 545

We use the path of the excel file and the function read_excel. The result is a data frame as
before:
# Read data from Excel File and print the first five rows
xlsx_path = 'https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/PY0101EN/Chapter%204/Datasets/TopSellingAlbums.xlsx'
df = pd.read_excel(xlsx_path)
df.head()
Figure 546
We can access the column Length and assign it to a new dataframe x:
The process is shown in the figure:
Figure 547

VIEWING DATA AND ACCESSING DATA
You can also get a column as a series. You can think of a Pandas series as a 1-D dataframe. Just
use one bracket:
# Get the column as a series
x = df['Length']
x
0 00:42:19
1 00:42:11
2 00:42:49
3 00:57:44
4 00:46:33
5 00:43:08
6 01:15:54
7 00:40:01
Name: Length, dtype: object
You can also get a column as a dataframe. For example, we can assign the column Artist:
# Get the column as a dataframe
x = df[['Artist']]
type(x)
You can do the same thing for multiple columns; we just put the dataframe name, in this case, df,
and the name of the multiple column headers enclosed in double brackets. The result is a new
dataframe comprised of the specified columns:
# Access to multiple columns
y = df[['Artist','Length','Genre']]
y
Figure 548

The process is shown in the figure:
Figure 549
One way to access individual elements is the iloc indexer, where you can access the 1st row and
the 1st column as follows:
# Access the value on the first row and the first column
df.iloc[0, 0]
'Michael Jackson'
You can access the 2nd row and the 1st column as follows:
# Access the value on the second row and the first column
df.iloc[1,0]
'AC/DC'
You can access the 1st row and the 3rd column as follows:
# Access the value on the first row and the third column
df.iloc[0,2]
1982
# Access the value on the second row and the third column
df.iloc[1,2]
1980
This is shown in the following image
Figure 550

You can access the column using the name as well, the following are the same as above:

df.loc[0, 'Artist']
'Michael Jackson'

df.loc[1, 'Artist']
'AC/DC'

df.loc[0, 'Released']
1982

df.loc[1, 'Released']
1980
Figure 551
You can perform slicing using both the index and the name of the column:
# Slicing the dataframe

df.iloc[0:2, 0:3]
Figure 552

Figure 553
# Slicing the dataframe using name
df.loc[0:2, 'Artist':'Released']
Figure 554

QUIZ ON DATAFRAME
Use a variable q to store the column Rating as a dataframe
q = df[['Rating']]
q
Assign the variable q to the dataframe that is made up of the column Released and Artist:
q = df[['Released', 'Artist']]
q
Access the 2nd row and the 3rd column of df:

df.iloc[1, 2]
1980
Use the following list to convert the dataframe index df to characters and assign it to df_new;
find the element corresponding to the row index a and column 'Artist'. Then select the
rows a through d for the column 'Artist'
new_index=['a','b','c','d','e','f','g','h']
df_new=df
df_new.index=new_index
df_new.loc['a', 'Artist']
df_new.loc['a':'d', 'Artist']
a Michael Jackson
b AC/DC
c Pink Floyd
d Whitney Houston
Name: Artist, dtype: object


PRACTICE QUIZ: PANDAS
Question 1
What Python object do you cast to a data frame?
o list
o tuple
o Dictionary
o set
Correct: Dictionary can be cast to a data frame in Python.
Question 2
How would you access the first row and first column in the DataFrame df?
o df.ix[1,1]
o df.ix[1,0]
o df.ix[0,1]
o df.ix[0,0]
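Correct: df.ix[0,0]. As Python uses zero-based indexing, index 0 refers to the first row and the
first column. Note that .ix has been removed from current versions of pandas; there, the equivalent
is df.iloc[0, 0].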
Question 3
What is the proper way to load a CSV file using pandas?
o pandas.import_csv('data.csv')
o pandas.load_csv('data.csv')
o pandas.read_csv('data.csv')
o pandas.from_csv('data.csv')
Correct: pandas.read_csv('data.csv') loads CSV files using Pandas.
Question 4
Assume that you have a data frame containing details of various musical artists, their famous
albums, their genres, and various other parameters. Here, 'Genre' is the fifth column in the
sequence and there is an entry of “Disco” in the 7th row of the data. How would you select the
Genre disco?
o df.iloc[7, 'Genre']
o df.iloc[7, 5]
o df.iloc[6, 4]
o df.iloc[6, 'genre']
Correct: df.iloc[6, 4] will return the genre "Disco."

Question 5
Assume that you have a data frame containing details of various musical artists, their famous
albums, their genres, and various other parameters. Here, 'Album' is the second column. How
do we retrieve records from row 3 through row 6?
o df.loc[2:5, 'Album']
o df.iloc[2:6, 3]
o df.loc[2:5, 1]
o df.loc[2, 'Album']
Correct: df.loc[2:5, 'Album'] will return the desired result.

VIDEO 019: ONE DIMENSIONAL NUMPY (11:23)
In the video we will be covering Numpy in 1D, in particular ND arrays.
Numpy is a library for scientific computing. It has many useful functions. There are many other
advantages like speed and memory. Numpy is also the basis for pandas. So, check out our
pandas video.
Figure 555
OBJECTIVES
In the video we will be covering
• the basics and array creation;
• indexing and slicing;
• basic operations;
• universal functions.
Figure 556

THE BASICS AND ARRAY CREATION
Let's go over how to create a Numpy array.
Figure 557
A Python list is a container that allows you to store and access data. Each element is associated
with an index. We can access each element using a square bracket as follows.
Figure 558
A Numpy array or ND array is similar to a list. It's usually fixed in size and each element is of the
same type, in this case integers.
Figure 559

We can cast a list to a Numpy array by first importing Numpy.
Figure 560
We then cast the list as follows:
Figure 561
We can access the data via an index.
Figure 562

As with the list, we can access each element with an integer and a square bracket.
Figure 563
The value of a is stored as follows.
Figure 564
If we check the type of the array we get, numpy.ndarray.
Figure 565

As Numpy arrays contain data of the same type, we can use the attribute dtype to obtain the
data type of the array's elements. In this case a 64-bit integer.
Figure 566
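The figures are not reproduced here, so the following is a minimal sketch of the steps just described, assuming the same five-element list used in the lab. The variable a64 is an illustrative addition showing how to request an integer width explicitly, since the default integer type can differ between platforms.

```python
import numpy as np

a = np.array([0, 1, 2, 3, 4])   # cast a Python list to a NumPy array

print(a[0], a[4])   # 0 4
print(type(a))      # <class 'numpy.ndarray'>
print(a.dtype)      # int64 on most platforms; can be int32 on some

# Request the element type explicitly instead of relying on the default
a64 = np.array([0, 1, 2, 3, 4], dtype=np.int64)
print(a64.dtype)    # int64
```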
Let's review some basic array attributes using the array a.
Figure 567
The attribute size is the number of elements in the array. As there are five elements the result is
five. The next two attributes will make more sense when we get to higher dimensions, but let's
review them. The attribute ndim represents the number of array dimensions or the rank of the
array, in this case one. The attribute shape is a tuple of integers indicating the size of the array
in each dimension.
Figure 568

We can create a Numpy array with real numbers. When we check the type of the array, we get
numpy.ndarray. If we examine the attribute dtype, we see float64 as the elements are not
integers.
There are many other attributes; check out numpy.org.
Figure 569
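As a quick check of the attributes described above, here is a short sketch. The integer array matches the one in the lab; the float values in b are also taken from the lab exercise.

```python
import numpy as np

a = np.array([0, 1, 2, 3, 4])
print(a.size)    # 5     number of elements
print(a.ndim)    # 1     number of dimensions
print(a.shape)   # (5,)  size along each dimension

b = np.array([3.1, 11.02, 6.2, 213.2, 5.2])
print(type(b))   # <class 'numpy.ndarray'>
print(b.dtype)   # float64
```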
INDEXING AND SLICING

Let's review some indexing and slicing methods.
Figure 570
We can change the first element of the array to 100 as follows.
Figure 571

The array's first value is now 100.
Figure 572
We can change the fifth element of the array as follows:
Figure 573
The fifth element is now zero.
Figure 574

Like lists and tuples we can slice a Numpy array.
Figure 575
The elements of the array correspond to the following index.
Figure 576
We can select the elements from one to three and assign it to a new Numpy array d as follows.
Figure 577

The elements in d correspond to the index.
Figure 578
Like lists, we do not count the element corresponding to the last index.
Figure 579
We can assign the corresponding indices to new values as follows.
Figure 580

The array c now has new values.
Figure 581
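The indexing and slicing steps above can be sketched as follows. The starting values of c are assumed to match the lab; the figures from the video are not reproduced here.

```python
import numpy as np

c = np.array([20, 1, 2, 3, 4])

c[0] = 100          # change the first element
c[4] = 0            # change the fifth element
print(c)            # [100   1   2   3   0]

d = c[1:4]          # indices 1, 2 and 3; index 4 is not included
print(d)            # [1 2 3]

c[3:5] = 300, 400   # assign new values to indices 3 and 4
print(c)            # [100   1   2 300 400]
```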

BASIC OPERATIONS
See the labs or numpy.org for more examples of what you can do with Numpy.
Numpy makes it easier to do many operations that are commonly performed in data science. The
same operations are usually computationally faster and require less memory in Numpy
compared to regular Python. Let's review some of these operations on one-dimensional arrays.
We will look at many of the operations in the context of Euclidean vectors to make things more
interesting.
Figure 582
VECTOR ADDITION AND SUBTRACTION

Vector addition is a widely used operation in data science.
Figure 583
Consider the vector u with two elements, the elements are distinguished by the different colors.
Similarly, consider the vector v with two components. In vector addition, we create a new vector
in this case z. The first component of z is the addition of the first component of vectors u and v.
Similarly, the second component is the sum of the second components of u and v. This new
vector z is now a linear combination of the vector u and v.
Figure 584

Representing vector addition with line segments or arrows is helpful. The first vector is
represented in red. The vector will point in the direction of the two components. The first
component of the vector is one. As a result the arrow is offset one unit from the origin in the
horizontal direction. The second component is zero, we represent this component in the vertical
direction. As this component is zero, the vector does not point in the vertical direction. We
represent the second vector in blue. The first component is zero, therefore the arrow does not
point to the horizontal direction. The second component is one. As a result the vector points in
the vertical direction one unit. When we add the vector u and v, we get the new vector z. We
add the first component, this corresponds to the horizontal direction. We also add the second
component.
Figure 585
It's helpful to use the tip to tail method when adding vectors, placing the tail of the vector v on
the tip of vector u.
Figure 586
The new vector z is constructed by connecting the tail of the first vector u with the tip of the
second vector v.
Figure 587

The following three lines of code will add the two lists and place the result in the list z.
Figure 588
Vector Addition
We can also perform vector addition with one line of Numpy code. It would require multiple lines
to perform vector addition1 on two lists as shown on the right side of the screen. In addition, the
Numpy code will run much faster. This is important if you have lots of data.
Figure 589
Vector Subtraction
We can also perform vector subtraction by changing the addition sign to a subtraction sign. It
would require multiple lines to perform vector subtraction on two lists as shown on the right
side of the screen.
Figure 590
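To make the comparison concrete, here is a minimal sketch of both approaches, assuming u = [1, 0] and v = [0, 1] as in the arrow diagrams described above.

```python
import numpy as np

u_list, v_list = [1, 0], [0, 1]

# Plain Python: combine the lists element by element
z_list = []
for n, m in zip(u_list, v_list):
    z_list.append(n + m)
print(z_list)    # [1, 1]

# NumPy: one line each for addition and subtraction
u = np.array(u_list)
v = np.array(v_list)
z_add = u + v
z_sub = u - v
print(z_add)     # [1 1]
print(z_sub)     # [ 1 -1]
```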
1
At 5:44 in the video the transcript and voice over should be "addition" instead of "subtraction."

ARRAY MULTIPLICATION WITH SCALAR
Vector multiplication with a scalar is another commonly performed operation.
Figure 591
Consider the vector y, each component is specified by a different color. We simply multiply the
vector by a scalar value, in this case two. Each component of the vector is multiplied by two, in
this case each component is doubled.
Figure 592
We can use the line segment or arrows to visualize what's going on. The original vector y is in
purple.
Figure 593

After multiplying it by a scalar value of two, the vector is stretched by a factor of two as shown
in red. The new vector is twice as long in each direction.
Figure 594
Vector multiplication with a scalar only requires one line of code using Numpy. It would require
multiple lines to perform the same task with Python lists, as shown on the right side
of the screen. In addition, the operation would also be much slower.
Figure 595
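A short sketch of scalar multiplication, with the vector y = [1, 2] chosen here as an illustrative assumption:

```python
import numpy as np

y_list = [1, 2]

# Plain Python: multiply each element in a loop
z_list = []
for n in y_list:
    z_list.append(2 * n)
print(z_list)    # [2, 4]

# NumPy: the scalar is applied to every component
y = np.array(y_list)
z = 2 * y
print(z)         # [2 4]
```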
Product of 2 Numpy Arrays
Hadamard Product
Hadamard product is another widely used operation in data science. Consider the following two
vectors, u and v. The Hadamard product of u and v is a new vector z. The first component of z is
the product of the first element of u and v. Similarly, the second component is the product of the
second element of u and v. The resultant vector consists of the entry wise product of u and v.
Figure 596

We can also perform Hadamard product with one line of code in Numpy. It would require
multiple lines to perform Hadamard product on two lists as shown on the right side of the screen.
Figure 597
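The entry-wise product can be sketched as below. The values of u and v are assumptions chosen for illustration.

```python
import numpy as np

u = np.array([1, 2])
v = np.array([3, 2])

# Hadamard product: multiply components in matching positions
z = u * v
print(z)    # [3 4]

# The same result built by hand from two lists
z_list = [n * m for n, m in zip([1, 2], [3, 2])]
print(z_list)    # [3, 4]
```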
Dot Product
The dot product is another widely used operation in data science. Consider the vectors u and v.
The dot product is a single number given by the following term; it indicates how similar the two
vectors are. We multiply the first components of u and v, then multiply the second
components, and add the results together.
Figure 598
We can also perform dot product using the Numpy function dot and assign it with the variable
result as follows.
Figure 599
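As a sketch, the dot product can be computed with np.dot and checked against the sum of the entry-wise products. The vectors below are illustrative assumptions.

```python
import numpy as np

u = np.array([1, 2])
v = np.array([3, 2])

result = np.dot(u, v)       # 1*3 + 2*2
print(result)               # 7

# Equivalent: sum of the element-wise products
by_hand = (u * v).sum()
print(by_hand)              # 7
```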

ADDING CONSTANT TO A NUMPY ARRAY
Consider the array u, the array contains the following elements. If we add a scalar value to the
array, Numpy will add that value to each element. This property is known as broadcasting.
Figure 600
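A minimal sketch of broadcasting a scalar over an array, using the same values as the lab section on adding a constant:

```python
import numpy as np

u = np.array([1, 2, 3, -1])

# The scalar 1 is added to every element of u
z = u + 1
print(z)    # [2 3 4 0]
```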
UNIVERSAL FUNCTIONS
A universal function is a function that operates on ND arrays.
Figure 601
We can apply a universal function to a Numpy array. Consider the arrays a, we can calculate the
mean or average value of all the elements in a using the method mean. This corresponds to the
average of all the elements. In this case the result is zero.
Figure 602

Max()
There are many other functions. For example, consider the Numpy arrays b. We can find the
maximum value using the method max()2. We see the largest value is five, therefore the method
max returns a five.
Figure 603
We can use Numpy to create functions that map Numpy arrays to new Numpy arrays.
Let's implement some code on the left side of the screen and use the right side of the screen to
demonstrate what's going on. We can access the value of pi in Numpy as follows. We can
create the following Numpy array in radians. This array corresponds to the following vector.
We can apply the function sine to the array x and assign the values to the array y. This applies
the sine function to each element in the array, this corresponds to applying the sine function to
each component of the vector. The result is a new array y, where each value corresponds to a
sine function being applied to each element in the array x.
Figure 604
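The calls described in this section can be sketched as follows. The arrays a and b are assumed to match the lab; the rounding step is an addition for readability, since sin(pi) evaluates to a tiny non-zero number in floating point.

```python
import numpy as np

a = np.array([1, -1, 1, -1])
print(a.mean())    # 0.0

b = np.array([-1, 2, 3, 4, 5])
print(b.max())     # 5

# Apply sine to every element of an array given in radians
x = np.array([0, np.pi / 2, np.pi])
y = np.sin(x)
print(np.round(y, 3))    # [0. 1. 0.]
```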
2
At 8:54 in the video the transcript and voice over should be "method max" instead of "method five."

Linspace()
A useful function for plotting mathematical functions is linspace. Linspace returns evenly spaced
numbers over a specified interval.
Figure 605
We specify the starting point of the sequence,
Figure 606
the ending point of the sequence.
Figure 607

The parameter num indicates the number of samples to generate, in this case five.
Figure 608
The space between samples is one.
Figure 609
If we change the parameter num to 9, we get 9 evenly spaced numbers over the interval from -
2 to 2. The result is the difference between subsequent samples is 0.5 as opposed to 1 as before.
Figure 610
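The spacing between samples can be read directly from linspace by passing retstep=True, which returns the step alongside the array. This is a small sketch of the two cases described above.

```python
import numpy as np

samples5, step5 = np.linspace(-2, 2, num=5, retstep=True)
samples9, step9 = np.linspace(-2, 2, num=9, retstep=True)

print(samples5)    # [-2. -1.  0.  1.  2.]
print(step5)       # 1.0
print(step9)       # 0.5

# The spacing follows (stop - start) / (num - 1)
expected_step9 = (2 - (-2)) / (9 - 1)
print(expected_step9)    # 0.5
```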

We can use the function linspace to generate 100 evenly spaced samples from the interval zero
to two pi. We can use the Numpy function sine to map the array x to a new array y.
We can import the library pyplot as plt to help us plot the function. As we are using a Jupyter
notebook, we use the command %matplotlib inline to display the plot. The following command
plots a graph.
Figure 611
The first input corresponds to the values for the horizontal or x-axis.
Figure 612
The second input corresponds to the values for the vertical or y-axis.
Figure 613
There's a lot more you can do with Numpy. Check out the labs and numpy.org for more.
Thanks for watching the video.

HANDS-ON LAB: ONE DIMENSIONAL NUMPY
1D Numpy in Python
Objectives

• Import and use the numpy library
• Perform operations with numpy
Table of Contents
• What is Numpy?
o Type
o Assign Value
o Slicing
o Assign Value with List
o Other Attributes
• Numpy Array Operations
o Array Addition
o Array Multiplication
o Product of Two Numpy Arrays
o Dot Product
o Adding Constant to a Numpy Array
• Mathematical Functions
• Linspace
What is Numpy?
NumPy is a Python library used for working with arrays, linear algebra, Fourier transforms, and
matrices. NumPy stands for Numerical Python and it is an open source project. The array object
in NumPy is called ndarray. It provides a lot of supporting functions that make working with
ndarray very easy.
Arrays are very frequently used in data science, where speed and resources are very important.
NumPy is usually imported under the np alias.
A numpy array is usually fixed in size and each element is of the same type. We can cast a list
to a numpy array by first importing numpy:
# import numpy library
import numpy as np

We then cast the list as follows:
# Create a numpy array
a = np.array([0, 1, 2, 3, 4])
a
array([0, 1, 2, 3, 4])
Each element is of the same type, in this case integers:
Figure 614
As with lists, we can access each element via a square bracket:
# Print each element
print("a[0]:", a[0])
print("a[1]:", a[1])
print("a[2]:", a[2])
print("a[3]:", a[3])
print("a[4]:", a[4])
a[0]: 0
a[1]: 1
a[2]: 2
a[3]: 3
a[4]: 4
Checking NumPy Version

The version string is stored under version attribute.
print(np.__version__)
1.26.1
Type
If we check the type of the array we get numpy.ndarray:
# Check the type of the array
type(a)
numpy.ndarray
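As numpy arrays contain data of the same type, we can use the attribute "dtype" to obtain the
data type of the array's elements. The default integer type depends on the platform: on many
systems it is a 64-bit integer, while the output below shows a 32-bit integer: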
# Check the type of the values stored in numpy array
a.dtype
dtype('int32')

Try it yourself
Check the type of the array and the value type for the given array b
b = np.array([3.1, 11.02, 6.2, 213.2, 5.2])
# Enter your code here

type(b)
b.dtype
dtype('float64')
If we examine the attribute dtype we see float64, as the elements are not integers.
Assign value
We can change the value of the array. Consider the array c:
# Create numpy array
c = np.array([20, 1, 2, 3, 4])
c
array([20, 1, 2, 3, 4])
We can change the first element of the array to 100 as follows:
# Assign the first element to 100
c[0] = 100
c
array([100, 1, 2, 3, 4])
We can change the 5th element of the array to 0 as follows:
# Assign the 5th element to 0
c[4] = 0
c
array([100, 1, 2, 3, 0])
Try it yourself
Assign the value 20 for the second element in the given array.
a = np.array([10, 2, 30, 40,50])

a[1]=20
a
array([10, 20, 30, 40, 50])

Slicing
Like lists, we can slice the numpy array. Slicing in python means taking the elements from the
given index to another given index.
We pass a slice like this: [start:end]. The element at the end index is not included in the output.
We can select the elements from 1 to 3 and assign it to a new numpy array d as follows:
# Slicing the numpy array
d = c[1:4]
d
array([1, 2, 3])
We can assign the corresponding indexes to new values as follows:
# Set the fourth element and fifth element to 300 and 400
c[3:5] = 300, 400

c
array([100, 1, 2, 300, 400])
We can also define the steps in slicing, like this: [start:end:step].
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[1:5:2])
[2 4]
If we don't pass start, it's considered 0:
print(arr[:4])
[1 2 3 4]
If we don't pass end, the slice runs to the end of the array:
print(arr[4:])
[5 6 7]
If we don't pass step, it's considered 1:
print(arr[1:5:])
[2 3 4 5]
Try it yourself
Print the even elements in the given array.
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

print(arr[1:8:2])
[2 4 6 8]
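One point the slicing examples do not show: a basic slice of a numpy array is a view that shares memory with the original, so later changes to the original show up in the slice. The sketch below illustrates this; the names view and snapshot are illustrative.

```python
import numpy as np

c = np.array([100, 1, 2, 3, 4])

view = c[1:4]               # basic slice: shares memory with c
snapshot = c[1:4].copy()    # independent copy of the same elements

c[1] = -1                   # modify the original array

print(view)        # [-1  2  3]   the view reflects the change
print(snapshot)    # [1 2 3]      the copy does not
```

Calling copy() is the usual way to keep a slice that should not change when the original array is modified.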

Assign Value with List
Similarly, we can use a list to select more than one specific index. The list select contains
several values:
# Create the index list

select = [0, 2, 3, 4]
select
[0, 2, 3, 4]
We can use the list as an argument in the brackets. The output is the elements corresponding
to the particular indexes:
# Use List to select elements
d = c[select]
d
array([100, 2, 300, 400])
We can assign the specified elements to a new value. For example, we can assign the value
100000 to each of them as follows:
# Assign the specified elements to new value
c[select] = 100000
c
array([100000, 1, 100000, 100000, 100000])
Other Attributes
Let's review some basic array attributes using the array a:

a = np.array([0, 1, 2, 3, 4])
a
array([0, 1, 2, 3, 4])
The attribute size is the number of elements in the array:

# Get the size of numpy array
a.size
5
The next two attributes will make more sense when we get to higher dimensions but let's review
them. The attribute ndim represents the number of array dimensions, or the rank of the array. In
this case, one:
# Get the number of dimensions of numpy array
a.ndim
1
The attribute shape is a tuple of integers indicating the size of the array in each dimension:
# Get the shape/size of numpy array
a.shape
(5,)

Try it yourself
Find the size, dimension and shape for the given array b
b = np.array([10, 20, 30, 40, 50, 60, 70])

b.size
b.ndim
b.shape
(7,)
Numpy Statistical Functions
a = np.array([1, -1, 1, -1])
# Get the mean of numpy array
mean = a.mean()
mean
0.0
# Get the standard deviation of numpy array
standard_deviation=a.std()
standard_deviation
1.0
b = np.array([-1, 2, 3, 4, 5])
b
array([-1, 2, 3, 4, 5])
# Get the biggest value in the numpy array
max_b = b.max()
max_b
5
# Get the smallest value in the numpy array
min_b = b.min()
min_b
-1

Try it yourself
Find the sum of maximum and minimum value in the given numpy array
c = np.array([-10, 201, 43, 94, 502])

max_c = c.max()
max_c
min_c = c.min()
min_c
Sum = (max_c +min_c)

Sum
492
Numpy Array Operations

You could use arithmetic operators directly between NumPy arrays
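As a sketch of that statement, the operators +, -, * and / give the same results as the functions np.add, np.subtract, np.multiply and np.divide. The array values below are taken from the division example in this lab.

```python
import numpy as np

u = np.array([10, 20, 30])
v = np.array([2, 10, 5])

# Each operator matches the corresponding numpy function
same_add = np.array_equal(u + v, np.add(u, v))
same_sub = np.array_equal(u - v, np.subtract(u, v))
same_mul = np.array_equal(u * v, np.multiply(u, v))
same_div = np.array_equal(u / v, np.divide(u, v))
print(same_add, same_sub, same_mul, same_div)    # True True True True

quotient = u / v    # true division returns floats even for integer inputs
print(quotient)     # [5. 2. 6.]
```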
Array Addition
Consider the numpy array u:
u = np.array([1, 0])
u
array([1, 0])
Consider the numpy array v:
v = np.array([0, 1])
v
array([0, 1])
We can add the two arrays and assign it to z:
# Numpy Array Addition
z = np.add(u, v)
z
array([1, 1])
The operation is equivalent to vector addition:
# Plotting functions
import numpy as np
import matplotlib.pyplot as plt

def Plotvec1(u, z, v):
    ax = plt.axes()  # generate the full window axes
    # Arrow for u: head width 0.05, color red, head length 0.1
    ax.arrow(0, 0, *u, head_width=0.05, color='r', head_length=0.1)
    plt.text(*(u + 0.1), 'u')  # add the text u to the axes
    # Arrow for v: head width 0.05, color blue, head length 0.1
    ax.arrow(0, 0, *v, head_width=0.05, color='b', head_length=0.1)
    plt.text(*(v + 0.1), 'v')  # add the text v to the axes
    # Arrow for z: head width 0.05, default color, head length 0.1
    ax.arrow(0, 0, *z, head_width=0.05, head_length=0.1)
    plt.text(*(z + 0.1), 'z')  # add the text z to the axes
    plt.ylim(-2, 2)  # set the ylim to bottom(-2), top(2)
    plt.xlim(-2, 2)  # set the xlim to left(-2), right(2)
# Plot numpy arrays
Plotvec1(u, z, v)
Figure 615
Try it yourself
Perform addition operation on the given numpy array arr1 and arr2:
arr1 = np.array([10, 11, 12, 13, 14, 15])

arr2 = np.array([20, 21, 22, 23, 24, 25])

arr3 = np.add(arr1, arr2)
arr3
array([30, 32, 34, 36, 38, 40])
Array Subtraction
Consider the numpy array a:
a = np.array([10, 20, 30])

a
array([10, 20, 30])
Consider the numpy array b:
b = np.array([5, 10, 15])

b
array([ 5, 10, 15])

We can subtract the two arrays and assign it to c:
c = np.subtract(a, b)
print(c)
[ 5 10 15]
Try it yourself
Perform subtraction operation on the given numpy array arr1 and arr2:
arr1 = np.array([10, 20, 30, 40, 50, 60])

arr2 = np.array([20, 21, 22, 23, 24, 25])

arr3 = np.subtract(arr1, arr2)
arr3
array([-10, -1, 8, 17, 26, 35])
Array Multiplication
Consider the numpy arrays x and y:
x = np.array([1, 2])
x
array([1, 2])
y = np.array([2, 1])
y
array([2, 1])
We can multiply the two arrays element by element:
# Numpy Array Multiplication
z = np.multiply(x, y)
z
array([2, 2])
This is equivalent to the element-wise (Hadamard) product of two vectors.
Try it yourself
Perform multiply operation on the given numpy array arr1 and arr2:
arr1 = np.array([10, 20, 30, 40, 50, 60])

arr2 = np.array([2, 1, 2, 3, 4, 5])

arr3 = np.multiply(arr1, arr2)
arr3
array([ 20, 20, 60, 120, 200, 300])

Array Division
Consider the vector numpy array a:
a = np.array([10, 20, 30])

a
array([10, 20, 30])
Consider the vector numpy array b:
b = np.array([2, 10, 5])

b
array([ 2, 10, 5])
We can divide the two arrays and assign it to c:
c = np.divide(a, b)
c
array([5., 2., 6.])
Try it yourself
Perform division operation on the given numpy array arr1 and arr2:
arr1 = np.array([10, 20, 30, 40, 50, 60])

arr2 = np.array([3, 5, 10, 8, 2, 33])
arr3 = np.divide(arr1, arr2)
arr3
array([ 3.33333333, 4. , 3. , 5. , 25. , 1.81818182])
Dot Product
The dot product of the two numpy arrays X and Y is given by:
X = np.array([1, 2])
Y = np.array([3, 2])
#Elements of X
print(X[0])
print(X[1])
1
2
# Calculate the dot product
np.dot(X, Y)
7
#Elements of Y
print(Y[0])
print(Y[1])
3
2

We are performing the dot product which is shown as below
Figure 616
Try it yourself
Perform dot operation on the given numpy array arr1 and arr2:
arr1 = np.array([3, 5])

arr2 = np.array([2, 4])

arr3 = np.dot(arr1, arr2)
arr3
26
Adding Constant to a Numpy Array

Consider the following array:
# Create a numpy array
u = np.array([1, 2, 3, -1])
u
array([ 1, 2, 3, -1])
Adding the constant 1 to each element in the array:
# Add the constant to array
u + 1
array([2, 3, 4, 0])
The process is summarised in the following figure:
Figure 617

Try it yourself
Add Constant 5 to the given numpy array arr:
arr = np.array([1, 2, 3, -1])

arr + 5
array([6, 7, 8, 4])
Mathematical Functions
We can access the value of pi in numpy as follows :
# The value of pi
np.pi
3.141592653589793
We can create the following numpy array in Radians:
# Create the numpy array in radians
x = np.array([0, np.pi/2 , np.pi])
We can apply the function sin to the array x and assign the values to the array y; this applies the
sine function to each element in the array:
# Calculate the sin of each elements
y = np.sin(x)
y
array([0.0000000e+00, 1.0000000e+00, 1.2246468e-16])
Linspace
A useful function for plotting mathematical functions is linspace. Linspace returns evenly spaced
numbers over a specified interval.
numpy.linspace(start, stop, num = int value)
start : start of interval range
stop : end of interval range
num : Number of samples to generate.
# Make a numpy array within [-2, 2] and 5 elements

np.linspace(-2, 2, num=5)
array([-2., -1., 0., 1., 2.])
If we change the parameter num to 9, we get 9 evenly spaced numbers over the interval from -
2 to 2:
# Make a numpy array within [-2, 2] and 9 elements

np.linspace(-2, 2, num=9)
array([-2. , -1.5, -1. , -0.5, 0. , 0.5, 1. , 1.5, 2. ])

We can use the function linspace to generate 100 evenly spaced samples from the interval 0 to
2π:
# Make a numpy array within [0, 2π] and 100 elements

x = np.linspace(0, 2*np.pi, num=100)
We can apply the sine function to each element in the array x and assign it to the array y:
# Calculate the sine of x list

y = np.sin(x)
# Plot the result
import matplotlib.pyplot as plt

plt.plot(x, y)
[<matplotlib.lines.Line2D at 0x468c478>]
Figure 618
Try it yourself
Make a numpy array within [5, 4] and 6 elements

np.linspace(5, 4, num=6)
array([5. , 4.8, 4.6, 4.4, 4.2, 4. ])
Iterating 1-D Arrays

Iterating means going through elements one by one.
If we iterate on a 1-D array it will go through each element one by one.
If we print the numpy array, we get the array format:
arr1 = np.array([1, 2, 3])

print(arr1)
[1 2 3]
If you want the elements one at a time, as when looping over a list, you can use a for loop:
for x in arr1:
    print(x)
1
2
3

Quiz on 1D Numpy Array
1. Implement the following vector subtraction in numpy: u-v
u = np.array([1, 0])
v = np.array([0, 1])
u - v
array([ 1, -1])
2. Multiply the numpy array z with -2:
z = np.array([2, 4])
-2 * z
array([-4, -8])
3. Consider the list [1, 2, 3, 4, 5] and [1, 0, 1, 0, 1]. Cast both lists to a numpy array
then multiply them together:
a = np.array([1, 2, 3, 4, 5])
b = np.array([1, 0, 1, 0, 1])
a * b
array([1, 0, 3, 0, 5])
# Import the libraries
import numpy as np
import matplotlib.pyplot as plt

def Plotvec2(a, b):
    ax = plt.axes()  # generate the full window axes
    # Arrow for a: head width 0.05, color red, head length 0.1
    ax.arrow(0, 0, *a, head_width=0.05, color='r', head_length=0.1)
    plt.text(*(a + 0.1), 'a')  # add the text a to the axes
    # Arrow for b: head width 0.05, color blue, head length 0.1
    ax.arrow(0, 0, *b, head_width=0.05, color='b', head_length=0.1)
    plt.text(*(b + 0.1), 'b')  # add the text b to the axes
    plt.ylim(-2, 2)  # set the ylim to bottom(-2), top(2)
    plt.xlim(-2, 2)  # set the xlim to left(-2), right(2)

4. Convert the lists [-1, 1] and [1, 1] to numpy arrays a and b. Then, plot the arrays as vectors using
the function Plotvec2 and find their dot product:

a = np.array([-1, 1])
b = np.array([1, 1])
Plotvec2(a, b)
print("The dot product is", np.dot(a,b))
The dot product is 0
Figure 619
5. Convert the list [1, 0] and [0, 1] to numpy arrays a and b. Then, plot the arrays as vectors
using the function Plotvec2 and find their dot product:

a = np.array([1, 0])
b = np.array([0, 1])
Plotvec2(a, b)
print("The dot product is", np.dot(a, b))
Figure 620

6. Convert the lists [1, 1] and [0, 1] to numpy arrays a and b. Then plot the arrays as vectors
using the function Plotvec2 and find their dot product:

a = np.array([1, 1])
b = np.array([0, 1])
Plotvec2(a, b)
print("The dot product is", np.dot(a, b))
Figure 621
7. Why are the results of the dot product for [-1, 1] and [1, 1] and the dot product for [1,
0] and [0, 1] zero, but not zero for the dot product for [1, 1] and [0, 1]?
Hint: Study the corresponding figures, pay attention to the direction the arrows are pointing to.
Answer: The vectors used for question 4 and 5 are perpendicular. As a result, the dot product is
zero.
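The link between the dot product and the angle can be made explicit with the relation cos(theta) = (a . b) / (|a| |b|). The helper angle_deg below is a hypothetical illustration, not part of the lab; it uses np.linalg.norm for the vector lengths.

```python
import numpy as np

def angle_deg(a, b):
    """Angle between two vectors in degrees (illustrative helper)."""
    cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    # Clip guards against rounding slightly outside [-1, 1]
    return np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

angle_q4 = angle_deg(np.array([-1, 1]), np.array([1, 1]))   # perpendicular
angle_q5 = angle_deg(np.array([1, 0]), np.array([0, 1]))    # perpendicular
angle_q6 = angle_deg(np.array([1, 1]), np.array([0, 1]))    # about 45 degrees

print(angle_q4, angle_q5, angle_q6)
```

A dot product of zero corresponds to an angle of 90 degrees, which is why questions 4 and 5 give zero and question 6 does not.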
8. Convert the list [1, 2, 3] and [8, 9, 10] to numpy arrays arr1 and arr2. Then
perform Addition , Subtraction , Multiplication , Division and Dot Operation on
the arr1 and arr2.

arr1 = np.array([1, 2, 3])
arr2 = np.array([8, 9, 10])
arr3 = np.add(arr1, arr2)

print("add::",arr3)
arr4 = np.subtract(arr1, arr2)

print("subtract:",arr4)
arr5 = np.multiply(arr1, arr2)

print("multiply:",arr5)
arr6 = np.divide(arr1, arr2)

print("divide:",arr6)

arr7 = np.dot(arr1, arr2)
print("dot:",arr7)
add:: [ 9 11 13]
subtract: [-7 -7 -7]
multiply: [ 8 18 30]
divide: [0.125 0.22222222 0.3 ]
dot: 56
9. Convert the list [1, 2, 3, 4, 5] and [6, 7, 8, 9, 10] to numpy arrays arr1 and arr2.
Then find the even and odd numbers from arr1 and arr2.

arr1 = np.array([1, 2, 3, 4, 5])
arr2 = np.array([6, 7, 8, 9, 10])
#Starting index in slice is 1 as first even element(2) in array1 is at index 1

even_arr1 = arr1[1:5:2]
print("even for array1",even_arr1)
#Starting index in slice is 0 as first odd element(1) in array1 is at index 0

odd_arr1=arr1[0:5:2]
print("odd for array1",odd_arr1)
#Starting index in slice is 0 as first even element(6) in array2 is at index 0

even_arr2 = arr2[0:5:2]
print("even for array2",even_arr2)
#Starting index in slice is 1 as first odd element(7) in array2 is at index 1

odd_arr2=arr2[1:5:2]
print("odd for array2",odd_arr2)
even for array1 [2 4]
odd for array1 [1 3 5]
even for array2 [ 6 8 10]
odd for array2 [7 9]
The last exercise!


VIDEO 020: 2-DIMENSIONAL NUMPY ARRAYS (7:13)
We can create numpy arrays with more than one dimension. This section will focus only on 2D
arrays but you can use numpy to build arrays of much higher dimensions.
Figure 622
TABLE OF CONTENTS
In this video, we will cover
• the basics and array creation in 2D,
• indexing and slicing in 2D,
• and basic operations in 2D.
Figure 623
THE BASICS AND ARRAY CREATION IN 2D

Consider the list a, the list contains three nested lists each of equal size. Each list is color-coded
for simplicity.
Figure 624

We can cast the list to a numpy array as follows.
Figure 625
It is helpful to visualize the numpy array as a rectangular array, where each nested list
corresponds to a different row of the matrix.
Figure 626
We can use the attribute ndim to obtain the number of axes or dimensions, referred to as the
rank. Here the term rank does not refer to the number of linearly independent columns, as it
does for a matrix. It's useful to think of ndim as the number of nested lists.
Figure 627

The first list represents the first dimension.
Figure 628
This list contains another set of lists. This represents the second dimension or axis. The number
of lists the list contains does not have to do with the dimension but the shape of the list.
Figure 629
As with a 1D array, the attribute shape returns a tuple.
Figure 630

It's helpful to use the rectangular representation as well.
Figure 631
The first element in the tuple corresponds to the number of nested lists contained in the original
list or the number of rows in the rectangular representation, in this case 3.
Figure 632
The second element corresponds to the size of each of the nested lists or the number of columns
in the rectangular array, in this case also 3.
Figure 633

The convention is to label this axis 0.
Figure 634
and this axis 1 as follows.
Figure 635
We can also use the attribute size to get the size of the array. We see there are three rows and
three columns.
Figure 636

Multiplying the number of columns and rows together, we get the total number of elements, in
this case 9. Check out the labs for arrays of different shapes and other attributes.
Figure 637
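A minimal sketch of the 2D attributes described above. The figures are not reproduced here, so the values in the nested list are an assumption, chosen to agree with the element values quoted later in this section (11 in the first row and first column, 23 in the second row and third column).

```python
import numpy as np

a = [[11, 12, 13], [21, 22, 23], [31, 32, 33]]   # three nested lists of equal size
A = np.array(a)

print(A.ndim)    # 2       two axes: rows (axis 0) and columns (axis 1)
print(A.shape)   # (3, 3)  three rows, three columns
print(A.size)    # 9       rows times columns
```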
We can use square brackets to access the different elements of the array. The following
image demonstrates the indexing conventions for the list-like
representation. The index in the first bracket corresponds to the different nested lists, each a
different color. The second bracket corresponds to the index of a particular element within the
nested list.
Figure 638
Using the rectangular representation,
Figure 639

The first index corresponds to the row index.
Figure 640
The second index corresponds to the column index.
Figure 641
We could also use a single bracket to access the elements as follows.
Figure 642

Consider the following syntax.
Figure 643
This index corresponds to the second row,
Figure 644
and this index the third column,
Figure 645

the value is 23.
Figure 646
Consider this example,
Figure 647
this index corresponds to the first row
Figure 648

and the second index corresponds to the first column,
Figure 649
and a value of 11.
Figure 650
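The two indexing styles can be sketched as follows, assuming an array whose values match those quoted above (23 in the second row and third column, 11 in the first row and first column).

```python
import numpy as np

A = np.array([[11, 12, 13], [21, 22, 23], [31, 32, 33]])

# Two pairs of brackets: row index first, then column index
print(A[1][2])    # 23
print(A[0][0])    # 11

# A single pair of brackets with a comma gives the same elements
print(A[1, 2])    # 23
print(A[0, 0])    # 11
```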
SLICING
We can also use slicing in numpy arrays.
Figure 651

The first index corresponds to the first row.
Figure 652
The second index accesses the first two columns.
Figure 653
Consider this example,
Figure 654

the first index corresponds to the first two rows.
Figure 655
The second index accesses the last column.
Figure 656
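The two slicing examples can be sketched as below. The array values are the same assumed values used for the indexing description, since the figures are not reproduced here.

```python
import numpy as np

A = np.array([[11, 12, 13], [21, 22, 23], [31, 32, 33]])

# First row, first two columns
first_row_part = A[0, 0:2]
print(first_row_part)      # [11 12]

# First two rows, last column
last_col_part = A[0:2, 2]
print(last_col_part)       # [13 23]
```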

BASIC OPERATIONS IN 2D
Add
We can also add arrays, the process is identical to matrix addition. Consider the matrix X, each
element is colored differently. Consider the matrix Y. Similarly, each element is colored
differently. We can add the matrices. This corresponds to adding the elements in the same
position, i.e., adding elements contained in the same color boxes together. The result is a new
matrix that has the same size as matrix Y or X. Each element in this new matrix is the sum of the
corresponding elements in X and Y.
Figure 657
To add two arrays in numpy, we define the array in this case X. Then we define the second array
Y, we add the arrays. The result is identical to matrix addition.
Figure 658
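A short sketch of 2D array addition. The matrices X and Y below are assumptions chosen for illustration.

```python
import numpy as np

X = np.array([[1, 0], [0, 1]])
Y = np.array([[2, 1], [1, 2]])

# Elements in the same position are added together
Z = X + Y
print(Z)
# [[3 1]
#  [1 3]]
```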

Multiply
Multiplying a numpy array by a scalar is identical to multiplying a matrix by a scalar. Consider

the matrix Y. If we multiply the matrix by this scalar two, we simply multiply every element in
the matrix by two. The result is a new matrix of the same size where each element is multiplied
by two.
Figure 659
Consider the array Y. We first define the array,
Figure 660
we multiply the array by a scalar as follows and assign it to the variable Z.
Figure 661

The result is a new array where each element is multiplied by two.
Figure 662
Multiplication of two arrays corresponds to an element-wise product, or Hadamard product.

Consider array X and array Y.
Figure 663
Hadamard Product
Hadamard product corresponds to multiplying each of the elements in the same position, i.e.,
multiplying elements contained in the same color boxes together. The result is a new matrix that
is the same size as matrix Y or X. Each element in this new matrix is the product of the
corresponding elements in X and Y.
Figure 664

Consider the array X and Y.
Figure 665
We can find the product of two arrays X and Y in one line, and assign it to the variable Z as
follows.
Figure 666
The result is identical to Hadamard product.
Figure 667

Matrix Multiplication
We can also perform matrix multiplication with Numpy arrays. Matrix multiplication is a little
more complex but let's provide a basic overview. Consider the matrix A where each row is a
different color. Also, consider the matrix B where each column is a different color.
Figure 668
In linear algebra, before we multiply matrix A by matrix B, we must make sure that the number
of columns in matrix A in this case three is equal to the number of rows in matrix B, in this case
three.
Figure 669
In matrix multiplication, to obtain the element in the ith row and jth column of the new matrix,
we take the dot product of the ith row of A (blue arrow) with the jth column of B (red arrow).
For the first row and first column, we take the dot product of the first row of A with the first
column of B as follows. The result is zero.
Figure 670

For the first row and the second column of the new matrix, we take the dot product of the first
row of the matrix A, but this time we use the second column of matrix B, the result is two.
Figure 671
For the second row and the first column of the new matrix, we take the dot product of the second
row of the matrix A with the first column of matrix B. The result is zero.
Figure 672
Finally, for the second row and the second column of the new matrix, we take the dot product
of the second row of the matrix A with the second column of matrix B, the result is two.
Figure 673

In numpy, we can define the numpy arrays A and B. We can perform matrix multiplication and
assign it to array C. The result is the array C. It corresponds to the matrix multiplication of array
A and B.
Figure 674
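The row-by-column procedure described above can be checked in code. The sketch below assumes A and B hold the values used in the hands-on lab (the figures are not reproduced here) and rebuilds the product one dot product at a time; the names C_manual and shapes_compatible are illustrative.

```python
import numpy as np

# Assumed values, taken from the hands-on lab that follows
A = np.array([[0, 1, 1], [1, 0, 1]])      # shape (2, 3)
B = np.array([[1, 1], [1, 1], [-1, 1]])   # shape (3, 2)

# Columns of A must equal rows of B before we can multiply
shapes_compatible = A.shape[1] == B.shape[0]

# Matrix multiplication with numpy
C = np.dot(A, B)

# Same result built by hand: entry (i, j) is the dot product
# of row i of A with column j of B
C_manual = np.zeros((A.shape[0], B.shape[1]), dtype=int)
for i in range(A.shape[0]):
    for j in range(B.shape[1]):
        C_manual[i, j] = np.dot(A[i, :], B[:, j])

print(C)
print(np.array_equal(C, C_manual))  # True
print(np.array_equal(C, A @ B))     # True
```

The @ operator is an equivalent way to write matrix multiplication for 2D arrays.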
There is a lot more you can do with numpy. Check out numpy.org.
Thanks for watching this video.

HANDS-ON LAB: TWO DIMENSIONAL NUMPY
2D NUMPY IN PYTHON
OBJECTIVES
• Operate comfortably with numpy
• Perform complex operations with numpy
TABLE OF CONTENTS
• Create a 2D Numpy Array
• Accessing different elements of a Numpy Array
• Basic Operations
CREATE A 2D NUMPY ARRAY
# Import the libraries

import numpy as np
Consider the list a, which contains three nested lists each of equal size.
# Create a list
a = [[11, 12, 13], [21, 22, 23], [31, 32, 33]]
a
[[11, 12, 13], [21, 22, 23], [31, 32, 33]]
We can cast the list to a Numpy Array as follows:
# Convert list to Numpy Array

# Every element is the same type
A = np.array(a)
A
array([[11, 12, 13],
[21, 22, 23],
[31, 32, 33]])
We can use the attribute ndim to obtain the number of axes or dimensions, referred to as the
rank.
# Show the numpy array dimensions

A.ndim
2
The attribute shape returns a tuple giving the size of each dimension.
# Show the numpy array shape

A.shape
(3, 3)

The total number of elements in the array is given by the attribute size.
# Show the numpy array size

A.size
9
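The three attributes are related: ndim is the length of shape, and size is the product of the entries of shape. A minimal sketch using the same array A (the variable names are illustrative):

```python
import numpy as np

A = np.array([[11, 12, 13], [21, 22, 23], [31, 32, 33]])

n_axes = A.ndim        # number of dimensions
shape = A.shape        # size of each dimension
n_elements = A.size    # total number of elements

print(n_axes == len(shape))                 # True
print(n_elements == shape[0] * shape[1])    # True
```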
ACCESSING DIFFERENT ELEMENTS OF A NUMPY ARRAY

We can use square brackets to access the different elements of the array. The
correspondence between the square brackets, the nested list, and the rectangular
representation is shown in the following figure for a 3x3 array:
Figure 675
We can access the 2nd-row, 3rd column as shown in the following figure:
Figure 676
We simply use the square brackets and the indices corresponding to the element we would like:
# Access the element on the second row and third column

A[1, 2]
23
We can also use the following notation to obtain the elements:
# Access the element on the second row and third column

A[1][2]
23
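The two notations return the same element here, but they work differently: A[1][2] first builds the row A[1] and then indexes into it, while A[1, 2] indexes both axes in one step. The sketch below shows where that difference matters once slices are involved; the variable names are illustrative.

```python
import numpy as np

A = np.array([[11, 12, 13], [21, 22, 23], [31, 32, 33]])

row = A[1]              # intermediate 1D array: [21 22 23]
via_chained = row[2]    # same as A[1][2]
via_tuple = A[1, 2]     # indexes row and column together
print(via_chained == via_tuple)  # True

# With slices the two forms are no longer interchangeable
cols_tuple = A[0:2, 2]  # rows 0 and 1, column 2
try:
    A[0:2][2]           # A[0:2] has only two rows, so index 2 is out of range
    chained_error = None
except IndexError as error:
    chained_error = error

print(cols_tuple)       # [13 23]
print(type(chained_error).__name__)  # IndexError
```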
Figure 677

We can access the element as follows:
# Access the element on the first row and first column

A[0][0]
11
We can also use slicing in numpy arrays. Consider the following figure. We would like to obtain
the first two columns in the first row
Figure 678
This can be done with the following syntax:
# Access the element on the first row and first and second columns
A[0][0:2]
array([11, 12])
Similarly, we can obtain the first two rows of the 3rd column as follows:
# Access the element on the first and second rows and third column
A[0:2, 2]
array([13, 23])
Corresponding to the following figure:
Figure 679
BASIC OPERATIONS
We can also add arrays. The process is identical to matrix addition. Matrix addition of X and Y is
shown in the following figure:
Figure 680

The numpy arrays are given by X and Y:
# Create a numpy array X
X = np.array([[1, 0], [0, 1]])

X
array([[1, 0],
[0, 1]])
# Create a numpy array Y
Y = np.array([[2, 1], [1, 2]])

Y
array([[2, 1],
[1, 2]])
We can add the numpy arrays as follows.
# Add X and Y
Z = X + Y
Z
array([[3, 1],
[1, 3]])
Multiplying a numpy array by a scalar is identical to multiplying a matrix by a scalar. If we
multiply the matrix Y by the scalar 2, we simply multiply every element in the matrix by 2, as
shown in the figure.
Figure 681
We can perform the same operation in numpy as follows
Y = np.array([[2, 1], [1, 2]])

Y
array([[2, 1],
[1, 2]])
# Multiply Y with 2
Z = 2 * Y
Z
array([[4, 2],
[2, 4]])

Multiplication of two arrays corresponds to an element-wise product or Hadamard product.
Consider matrix X and Y. The Hadamard product corresponds to multiplying each of the
elements in the same position, i.e. multiplying elements contained in the same color boxes
together. The result is a new matrix that is the same size as matrix Y or X, as shown in the
following figure.
Figure 682
We can perform element-wise product of the array X and Y as follows:
Y = np.array([[2, 1], [1, 2]])

Y
array([[2, 1],
[1, 2]])
# Create a numpy array X
X = np.array([[1, 0], [0, 1]])

X
array([[1, 0],
[0, 1]])
# Multiply X with Y
Z = X * Y
Z
array([[2, 0],
[0, 2]])
We can also perform matrix multiplication with the numpy arrays A and B as follows:
First, we define matrix A and B:
# Create a matrix A
A = np.array([[0, 1, 1], [1, 0, 1]])

A
array([[0, 1, 1],
[1, 0, 1]])
# Create a matrix B
B = np.array([[1, 1], [1, 1], [-1, 1]])

B
array([[ 1, 1],
[ 1, 1],
[-1, 1]])

We use the numpy function dot to multiply the arrays together.
# Calculate the dot product

Z = np.dot(A,B)
Z
array([[0, 2],
[0, 2]])
# Calculate the sine of Z

np.sin(Z)
array([[0. , 0.90929743],
[0. , 0.90929743]])
We use the numpy attribute T to obtain the transposed matrix.
# Create a matrix C
C = np.array([[1,1],[2,2],[3,3]])
C
array([[1, 1],
[2, 2],
[3, 3]])
# Get the transpose of C

C.T
array([[1, 2, 3],
[1, 2, 3]])
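Transposing swaps the two axes, so the shape is reversed and the element at row i, column j of C becomes the element at row j, column i of C.T. A small sketch with the same C (the names C_T and swapped are illustrative):

```python
import numpy as np

C = np.array([[1, 1], [2, 2], [3, 3]])
C_T = C.T

print(C.shape, C_T.shape)  # (3, 2) (2, 3)

# Every element keeps its value but trades its row and column index
swapped = all(C_T[j, i] == C[i, j] for i in range(3) for j in range(2))
print(swapped)  # True
```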
QUIZ ON 2D NUMPY ARRAY

Consider the following list a, convert it to Numpy Array.

a = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
A = np.array(a)
A
array([[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12]])
Calculate the numpy array size.

A.size
12
Access the element on the first row and first and second columns.

A[0][0:2]
array([1, 2])
Perform matrix multiplication with the numpy arrays A and B.

B = np.array([[0, 1], [1, 0], [1, 1], [-1, 0]])
X = np.dot(A,B)
X
array([[ 1, 4],
[ 5, 12],
[ 9, 20]])

The last exercise!
READING: SOME CONTEXT ON APIS

SOME CONTEXT ON APIS
Estimated Effort: 5 mins
WHAT ARE APIS?

APIs, or Application Programming Interfaces, are a crucial part of software development. They
allow developers to create new applications by leveraging existing functionality from other
systems. APIs define how software components should interact and facilitate communication
between various products and services without requiring direct implementation.
IMPORTANCE OF APIS
APIs are essential for any engineer because they provide a way to access data and functionality
from other systems, which can save time and resources. For instance, APIs can be used to
integrate applications into the existing architecture of a server or application, allowing
developers to communicate between various products and services without requiring direct
implementation.
APIs are also important because they enable developers to create new applications by
leveraging existing functionality from other systems. This can help developers throughout the
engineering and development process of apps.
APIs are used in a wide range of applications, from social media platforms to e-commerce
websites. They are also used in mobile applications, web applications, and desktop applications.
APPLICATIONS OF APIS
APIs have a wide range of applications, some of which are:
1. Social media platforms: Social media platforms like Facebook, Twitter, and Instagram use
APIs to allow developers to access their data and functionality. This allows developers to
create applications that can interact with these platforms and provide additional
functionality to users.
2. E-commerce websites: E-commerce websites like Amazon and eBay use APIs to allow
developers to access their product catalogs and other data. This allows developers to create
applications that can interact with these platforms and provide additional functionality to
users.
3. Weather applications: Weather applications like AccuWeather and The Weather Channel
use APIs to access weather data from various sources. This allows developers to create
applications that can provide users with up-to-date weather information.
4. Maps and navigation applications: Maps and navigation applications like Google Maps
and Waze use APIs to access location data and other information. This allows developers
to create applications that can provide users with directions, traffic updates, and other
location-based information.
5. Payment gateways: Payment gateways like PayPal and Stripe use APIs to allow
developers to access their payment processing functionality. This allows developers to
create applications that can process payments securely and efficiently.
6. Messaging applications: Messaging applications like WhatsApp and Facebook Messenger
use APIs to allow developers to access their messaging functionality. This allows
developers to create applications that can interact with these platforms and provide
additional functionality to users.
CONCLUSION
In summary, APIs are an essential part of software development, and they provide a way to
access data and functionality from other systems. They are used in a wide range of applications
and can help developers save time and resources while creating new applications.

PRACTICE QUIZ: NUMPY IN PYTHON
Question 1
What Python library serves as a foundation for Pandas and is used for scientific computing?
o datetime
o Requests
o OS
o Numpy
Correct! Numpy serves as a foundation for Pandas and is used for scientific computing.
Question 2
What attribute retrieves the number of elements in a Numpy array?
o a.dtype
o a.ndim
o a.shape
o a.size
Correct: size will return the total number of elements in an array of any dimension.
Question 3
How would you change the first element to 10 in this array c = np.array([100,1,2,3,0])?
o c[4]=10
o c[2]=10
o c[1]=10
o c[0]=10
Correct: Index 0 references the first element.

MODULE 4 SUMMARY: WORKING WITH DATA IN PYTHON
• Python uses the open() function to read and write files. You pass it the file path and a
file mode (for example, r for reading, w for writing, a for appending).
o To read a file, Python uses the open function along with r.
o The with statement opens a file, processes it, and closes it automatically when
the block ends.
o In Python, you use the open function to edit or overwrite a file.
o To write a file, Python uses the open function along with w.
o In Python, "a" opens the file for appending, so new content is added to the end.
o In Python, "\n" is the newline character; the text after it starts on a new line.
o Python offers several methods, such as read, readline, and readlines, to read
lines from a file object.
• Pandas is a powerful Python library for data manipulation and analysis, providing data
structures and functions to work with structured data like data frames and series.
o You import the library (pandas) by using the import statement followed by the
library name.
o In Python, you use the as keyword to provide a shorter name for the library.
o In Pandas, you read a file into a data frame (df), for example with read_csv.
o DataFrames consist of rows and columns.
o You can create new DataFrames by using the column or columns of a specific
DataFrame.
o We can work with data in a DataFrame and save the results in different formats.
o In Pandas, you use the unique method to determine the unique elements in a
column of a DataFrame.
o You use a comparison (inequality) operator on a DataFrame column to get a
Boolean value for each row of that column.
o You save a new DataFrame as a different DataFrame, which may contain values
from an earlier DataFrame.
• NumPy is a Python library for numerical and matrix operations, offering
multidimensional array objects and a variety of mathematical functions to work with
data efficiently.
o NumPy is a basis for Pandas.
o A NumPy array or ND array is similar to a list, usually of a fixed size with the
same kind of element.
• A one-dimensional NumPy array is a linear sequence of elements with a single axis,
like a traditional list, but optimized for numerical computations and array operations.
o You can access elements in a NumPy array using an index.
o You use the attribute dtype to get the data type of the array elements.
o You use size and ndim to get the size and dimension of the array, respectively.
o You can use indexing and slicing methods in NumPy.
o Vector additions are widely used operations in Python.
o Representing vector addition with line segments or arrows is useful.
o NumPy codes work much faster, which is helpful with lots of data.
o You perform vector subtraction by replacing the addition sign with a minus
sign.
o Multiplying an array by a scalar in Python entails multiplying each element of
the array by the scalar value, leading to a new array in which each element scales
by the scalar.
o Hadamard product refers to the element-wise multiplication of two arrays of
the same shape, resulting in a new array where each element is the product of
the corresponding elements in the input arrays.
o The dot product in Python is the sum of the element-wise products of two arrays,
often used for vector and matrix operations to find the scalar result of multiplying
corresponding elements and summing them.
o When working with NumPy, it is common to utilize libraries like Matplotlib to
create graphs and visualizations from numerical data stored in NumPy arrays.
• A two-dimensional NumPy array is a grid-like structure with rows and columns
suitable for representing data as a matrix or a table for numerical computations.
o In NumPy, "shape" refers to an array's dimensions (number of rows and
columns), indicating its size and structure.
o You use the attribute "size" to obtain the size of an array.
o You use square brackets to access the various elements in an array.
o You use a scalar to multiply elements in NumPy.
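The vector operations summarized above can be compared side by side. This is a minimal sketch with made-up vectors u and v; it is meant as a quick reference, not as the course's own example.

```python
import numpy as np

# Illustrative vectors, not taken from the course
u = np.array([1, 2])
v = np.array([3, 1])

added = u + v          # vector addition
subtracted = u - v     # vector subtraction
scaled = 2 * u         # multiplication by a scalar
hadamard = u * v       # element-wise (Hadamard) product
dot = np.dot(u, v)     # sum of the element-wise products

print(added)       # [4 3]
print(subtracted)  # [-2  1]
print(scaled)      # [2 4]
print(hadamard)    # [3 2]
print(dot)         # 5
```

Note that the Hadamard product returns an array of the same shape, while the dot product of two vectors returns a single number.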

WORKING WITH DATA IN PYTHON CHEAT SHEET
READING AND WRITING FILES
File opening modes
Description: Different modes to open files for specific operations.
Syntax: r (reading), w (writing), a (appending), + (updating: read/write), b (binary, otherwise text)
Examples:
with open("data.txt", "r") as file:
    content = file.read()
    print(content)
with open("output.txt", "w") as file:
    file.write("Hello, world!")
with open("log.txt", "a") as file:
    file.write("Log entry: Something happened.")
with open("data.txt", "r+") as file:
    file.write("Updated content: " + content)
File reading methods
Description: Different methods to read file content in various ways.
Syntax:
file.readlines() # reads all lines as a list
file.readline() # reads the next line as a string
file.read() # reads the entire file content as a string
Example:
lines = file.readlines()
next_line = file.readline()
File writing methods
Description: Different write methods to write content to a file.
Syntax:
file.write(content) # writes a string to the file
file.writelines(lines) # writes a list of strings to the file
Example:
lines = ["Hello\n", "World\n"]
with open("output.txt", "w") as file:
    file.writelines(lines)
Iterating over lines
Description: Iterates through each line in the file using a `loop`.
Syntax:
for line in file: # Code to process each line
Example:
for line in file:
    print(line)
Open() and close()
Description: Opens a file, performs operations, and explicitly closes the file using the
close() method.
Syntax:
file = open(filename, mode) # Code that uses the file
file.close()
Example:
file = open("data.txt", "r")
file.close()

with open()
Description: Opens a file using a with block, ensuring automatic file closure after usage.
Syntax:
with open(filename, mode) as file: # Code that uses the file
Example:
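The file operations listed in this part of the cheat sheet can be combined into one short, runnable sketch. The file name example.txt and the temporary directory are assumptions made here so that no real file is overwritten.

```python
import os
import tempfile

# Work inside a temporary directory so nothing real is overwritten
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.txt")  # hypothetical file name

    # "w" creates (or overwrites) the file
    with open(path, "w") as file:
        file.writelines(["Hello\n", "World\n"])

    # "a" adds to the end of the existing content
    with open(path, "a") as file:
        file.write("Log entry\n")

    # readline() returns the next line; readlines() returns the rest as a list
    with open(path, "r") as file:
        first_line = file.readline()
        remaining_lines = file.readlines()

    # Iterating over the file object yields one line at a time
    with open(path, "r") as file:
        stripped = [line.rstrip("\n") for line in file]

    # The with block has already closed the file at this point
    file_closed = file.closed

print(repr(first_line))   # 'Hello\n'
print(remaining_lines)    # ['World\n', 'Log entry\n']
print(stripped)           # ['Hello', 'World', 'Log entry']
print(file_closed)        # True
```

Because readline() and readlines() share one position in the file, readlines() only returns the lines that have not been read yet.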
Pandas
.read_csv() Reads data from a `.CSV` Syntax: dataframe_name = pd.read_csv("filename.csv")
file and creates a DataFrame. Example: df = pd.read_csv("data.csv")
.read_excel() Reads data from an Excel file Syntax:
and creates a DataFrame.
dataframe_name = pd.read_excel("filename.xlsx")
Example:
df = pd.read_excel("data.xlsx")
.to_csv() Writes DataFrame to a CSV Syntax:

file.
dataframe_name.to_csv("output.csv", index=False)
Example:
df.to_csv("output.csv", index=False)
Access Columns Accesses a specific column Syntax:

using [] in the DataFrame.
dataframe_name["column_name"] # Accesses
single column
dataframe_name[["column1", "column2"]] #
Accesses multiple columns
Example:
df["age"]
df[["name", "age"]]
describe() Generates statistics Syntax:

summary of numeric
columns in the DataFrame.
dataframe_name.describe()
Example:
df.describe()
drop()
Description: Removes specified rows or columns from the DataFrame. axis=1 indicates
columns. axis=0 indicates rows.
Syntax:
dataframe_name.drop(["column1", "column2"], axis=1, inplace=True)
dataframe_name.drop(index=[row1, row2], axis=0, inplace=True)
Example:
df.drop(["age", "salary"], axis=1, inplace=True) # Will drop columns
df.drop(index=[5, 10], axis=0, inplace=True) # Will drop rows
dropna() Removes rows with missing Syntax:
NaN values from the

DataFrame. axis=0 indicates
rows.
dataframe_name.dropna(axis=0, inplace=True)
Example:
df.dropna(axis=0, inplace=True)
duplicated()
Description: Flags duplicate or repeated rows within a data set.
Syntax:
dataframe_name.duplicated()
Example:
duplicate_rows = df[df.duplicated()]
Filter Rows
Description: Creates a new DataFrame with rows that meet specified conditions.
Syntax:
filtered_df = dataframe_name[(Conditional_statements)]
Example:
filtered_df = df[(df["age"] > 30) & (df["salary"] < 50000)]
groupby() Splits a DataFrame into Syntax:
groups based on specified
criteria, enabling subsequent
aggregation, transformation,
or analysis within each
group.
grouped = dataframe_name.groupby(by, axis=0,
level=None, as_index=True,
sort=True, group_keys=True, squeeze=False,
observed=False, dropna=True)
Example:
grouped = df.groupby(["category",
"region"]).agg({"sales": "sum"})
head() Displays the first n rows of Syntax:
the DataFrame.
dataframe_name.head(n)
Example:
df.head(5)
Import pandas Imports the Pandas library Syntax:
with the alias pd.
import pandas as pd
Example:
import pandas as pd
info() Provides information about Syntax:
the DataFrame, including
data types and memory
usage.
dataframe_name.info()
Example:
df.info()
merge() Merges two DataFrames Syntax:
based on multiple common
columns.

merged_df = pd.merge(df1, df2, on=["column1",
"column2"])
Example:
merged_df = pd.merge(sales, products,
on=["product_id", "category_id"])
print DataFrame Displays the content of the Syntax:
DataFrame.
print(df) # or just type df
Example:
print(df)
df
replace() Replaces specific values in a Syntax:
column with new values.
dataframe_name["column_name"].replace(old_value,
new_value, inplace=True)
Example:
df["status"].replace("In Progress", "Active",
inplace=True)
tail() Displays the last n rows of Syntax:
the DataFrame.
dataframe_name.tail(n)
Example:
df.tail(5)
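Several of the Pandas entries above can be tried together on a small table. The sketch below uses made-up sales and products DataFrames; the column names and values are hypothetical and only serve to show dropna, row filtering, groupby, merge, and drop in sequence.

```python
import pandas as pd

# Hypothetical sample data, for illustration only
sales = pd.DataFrame({
    "product_id": [1, 2, 2, 3],
    "region": ["east", "east", "west", "west"],
    "amount": [100.0, 250.0, None, 75.0],
})
products = pd.DataFrame({
    "product_id": [1, 2, 3],
    "name": ["pen", "book", "lamp"],
})

# dropna: remove the row whose amount is missing
clean = sales.dropna(axis=0)

# Filter rows: keep only amounts above 80
large = clean[clean["amount"] > 80]

# groupby + agg: total amount per region
totals = clean.groupby("region").agg({"amount": "sum"})

# merge: attach the product name through the common column
merged = pd.merge(clean, products, on="product_id")

# drop: remove a column (axis=1)
without_region = merged.drop(["region"], axis=1)

print(totals)
print(without_region)
```

This sketch returns a new DataFrame at each step instead of passing inplace=True, which keeps every intermediate result available for inspection.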
Numpy
Importing NumPy Imports the NumPy library. Syntax:
import numpy as np
Example:
import numpy as np
np.array()
Description: Creates a one or multi-dimensional array.
Syntax:
array_1d = np.array([list1 values]) # 1D Array
array_2d = np.array([[list1 values], [list2 values]]) # 2D Array
Example:
array_1d = np.array([1, 2, 3]) # 1D Array
array_2d = np.array([[1, 2], [3, 4]]) # 2D Array
Numpy Array Attributes
Example:
np.mean(array) # Calculates the mean of array elements
np.sum(array) # Calculates the sum of array elements
np.min(array) # Finds the minimum value in the array
np.max(array) # Finds the maximum value in the array
np.dot(array_1, array_2) # Computes dot product of two arrays
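A quick check of these functions on a small array, with values chosen here for illustration:

```python
import numpy as np

array = np.array([1, 2, 3, 4])  # illustrative values

mean_value = np.mean(array)
total = np.sum(array)
smallest = np.min(array)
largest = np.max(array)
dot_with_itself = np.dot(array, array)  # 1*1 + 2*2 + 3*3 + 4*4

print(mean_value)       # 2.5
print(total)            # 10
print(smallest)         # 1
print(largest)          # 4
print(dot_with_itself)  # 30
```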

READING: GLOSSARY: WORKING WITH DATA IN PYTHON
Term Definition
.csv file A .csv (Comma-Separated Values) file is a plain text file format for storing tabular
data, where each line represents a row and uses commas to separate values in
different columns.
.txt file A .txt (Text) file is a common file format that contains plain text without specific
formatting, making it suitable for storing and editing textual data.
Append To "append" means to add or attach something to the end of an existing object,
typically used in the context of adding data to a file or elements to a data structure
like a list in Python.
Attribute An "attribute" in Python refers to a property or characteristic associated with an

object, which can be accessed using dot notation.
Broadcasting in NumPy Broadcasting in NumPy allows arrays with different shapes to be combined in
element-wise operations by automatically extending smaller arrays to match the
shape of larger ones, making operations more flexible.
Component In NumPy, a "component" typically refers to a specific element or value within a

multi-dimensional array, which can be accessed using indexing.
Computation Computation in NumPy involves performing numerical operations on arrays and

matrices, making it a powerful library for mathematical and scientific computing
in Python.
Data analysis Data analysis is the process of inspecting, cleaning, transforming, and interpreting
data to discover useful information, draw conclusions, and support decision-
making.
DataFrames A DataFrame in Pandas is a two-dimensional, tabular data structure for storing
and analyzing data, consisting of rows and columns.
Dependencies Dependencies in Pandas are external libraries or modules, such as NumPy, that
Pandas relies on for fundamental data manipulation and analysis functionality.
File attribute File attributes generally refer to properties or metadata associated with files,
which are managed at the operating system level.
File object A "file object" in Python represents an open file, allowing reading from or writing
to the file.
Grid In Python, a "grid" typically refers to a two-dimensional structure composed of

rows and columns, often used to represent data in a tabular format or for
organizing objects in a coordinate system.
Hadamard Product The Hadamard product is a mathematical operation that involves element-wise
multiplication of two matrices or arrays of the same shape, producing a new
matrix with each element being the product of the corresponding elements in the
input matrices.
Importing pandas To import Pandas in Python, you use the statement: import pandas as pd, which
allows you to access Pandas functions and data structures using the abbreviation
"pd."
Index In Python, an "index" typically refers to a position or identifier used to access
elements within a sequence or data structure, such as a list or string.

Libraries Libraries in Python are collections of pre-written code modules that provide
reusable functions and classes to simplify and enhance software development.
Linspace In Python, "linspace" refers to a NumPy function (np.linspace) that generates an
array of evenly spaced values within a specified range.
NumPy NumPy in Python is a fundamental library for numerical computing that provides
support for large, multi-dimensional arrays and matrices, as well as a variety of
high-level mathematical functions to operate on these arrays.
One dimensional NumPy A one-dimensional NumPy array is a linear data structure that stores elements in
a single sequence, often used for numerical computations and data manipulation.
Open function In Python, the "open" function is used to access and manipulate files, allowing you
to read from or write to a specified file.
Pandas Pandas is a popular Python library for data manipulation and analysis, offering
data structures and tools for working with structured data like tables and time
series.
Pandas library Pandas library in Python refers to the various modules and functions within the
Pandas library, which provides powerful data structures and data analysis tools
for working with structured data.
Plotting Mathematical Functions Plotting mathematical functions in Python involves using libraries like Matplotlib
to create graphical representations of mathematical equations, aiding
visualization, and analysis.
Shape In NumPy, "shape" refers to an array's dimensions (number of rows and columns),
describing its size and structure.
Slicing Slicing in NumPy entails extracting specific portions of an array by specifying a

range of indices, enabling you to work with subsets of the data.
Two dimensional NumPy A two-dimensional NumPy array is a structured data representation with rows
and columns, resembling a matrix or table, ideal for various data manipulation and
analysis tasks.
Universal Functions Universal functions (ufuncs) in NumPy are functions that operate element-wise
on arrays, providing efficient and vectorized operations for a wide range of
mathematical and logical operations.
Vector addition Vector addition in Python involves adding corresponding elements of two or more
vectors, producing a new vector with the sum of their components.
Visualizations Visualizations in Python refer to the creation of graphical representations, such as

charts, plots, and graphs, to illustrate and communicate data and trends visually.
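Three of the glossary terms above (broadcasting, linspace, and universal functions) are easier to see in a short example. This is a minimal sketch with made-up arrays; the variable names are illustrative.

```python
import numpy as np

# Broadcasting: the 1D row is stretched across both rows of the matrix
matrix = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)
row = np.array([10, 20, 30])               # shape (3,)
broadcast_sum = matrix + row

# linspace: five evenly spaced values from 0 to 1, both ends included
points = np.linspace(0, 1, 5)

# Universal function: np.sqrt is applied to every element
roots = np.sqrt(np.array([1, 4, 9]))

print(broadcast_sum)  # [[11 22 33] [14 25 36]], printed on two lines
print(points)         # [0.   0.25 0.5  0.75 1.  ]
print(roots)          # [1. 2. 3.]
```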

MODULE 5 - APIS AND DATA COLLECTION
MODULE INTRODUCTION AND LEARNING OBJECTIVES

This module delves into the unique ways to collect data by the use of APIs and web scraping. It
further explores data collection by explaining how to read and collect data when dealing with
different file formats.
LEARNING OBJECTIVES
• Explain the use of the HTTP protocol using the Requests Library method.
• Describe how the URL Request Response HTTP protocol works.
• Invoke simple, open-source APIs.
• Perform basic web scraping using Python.
• Work with different file formats using Python.
• Explain the difference between APIs and REST APIs.
• Summarize how APIs receive and send information.

VIDEO 021: APPLICATION PROGRAM INTERFACE (5:12)
In this video we will discuss Application Program Interfaces (APIs for short).
Figure 683
Specifically, we will discuss what an API is, API libraries, and REST APIs, including Request and
Response and an example with PyCoinGecko.
Figure 684
An API lets two pieces of software talk to each other. For example, you have your program, you
have some data, you have other software components. You use the API to communicate with
other software via inputs and outputs. Just like a function, you don’t have to know how the API
works, just its inputs and outputs.
Figure 685

Pandas is actually a set of software components, much of which are not even written in Python.
You have some data. You have a set of software components. We use the pandas API to process
the data by communicating with the other software components.
Figure 686
Let’s clean up the diagram. When you create a dictionary, and then create a pandas object with
the DataFrame constructor, in API lingo, this is an “instance.” The data in the dictionary is passed
along to the pandas API. You then use the dataframe to communicate with the API. When you
call the method head, the dataframe communicates with the API displaying the first few rows
of the dataframe. When you call the method mean the API will calculate the mean and return
the values.
Figure 687
REST APIs are another popular type of API; they allow you to communicate through the internet
allowing you to take advantage of resources like storage, access more data, artificial intelligence
algorithms, and much more. The RE stands for Representational, the S for State, and T for
Transfer.
Figure 688

In REST APIs your program is called the client. The API communicates with a web service you
call through the internet. There is a set of rules regarding communication, input or request, and
output or response.
Figure 689
Here are some common terms. You or your code can be thought of as a client. The web service
is referred to as a resource. The client finds the service via an endpoint. We will review this more
in the next section. The client sends requests to the resource, and the resource sends the
response to the client.
Figure 690
HTTP methods are a way of transmitting data over the internet. We tell the REST APIs what to
do by sending a request. The request is usually communicated via an HTTP message. The HTTP
message usually contains a JSON file. This contains instructions for what operation we would
like the service to perform. This operation is transmitted to the web service via the internet. The
service performs the operation.
Figure 691

Figure 692
In a similar manner, the web service returns a response via an HTTP message, where the
information is usually returned via a JSON file. This information is transmitted back to the client.
Figure 693
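The request and response exchange can be sketched without touching the network. The example below only builds a request object and parses a made-up response string; the endpoint URL, the payload fields, and the response body are all hypothetical and are not part of any real service.

```python
import json
from urllib.request import Request

# Hypothetical endpoint and instructions, for illustration only
endpoint = "https://api.example.com/v1/prices"
payload = {"coin": "bitcoin", "currency": "usd", "days": 30}

# The request carries the HTTP method and a JSON body (it is not sent here)
request = Request(
    endpoint,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
print(request.get_method())  # POST
print(request.full_url)      # https://api.example.com/v1/prices

# A made-up JSON response body, parsed into a Python dictionary
response_body = '{"prices": [[1700000000000, 37000.5]]}'
response = json.loads(response_body)
first_price = response["prices"][0][1]
print(first_price)           # 37000.5
```

In practice a library such as Requests or a wrapper like PyCoinGecko sends the message and parses the response for you.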
Cryptocurrency data is a good fit for an API because it is constantly updated and is vital to
cryptocurrency trading. We will use PyCoinGecko, a Python client/wrapper for the CoinGecko
API, whose data CoinGecko updates every minute. We use the wrapper/client because it is easy
to use, so you can focus on the task of collecting data. We will also introduce pandas time
series functions for dealing with time series data.
Figure 694
Using PyCoinGecko to collect data is simple. All we need is to install and import the library, then
create a client object, and finally use a function to request our data. In this function we are getting
data on bitcoin, in US dollars, for the past 30 days.

Figure 695
In this case our response is a JSON expressed as a Python dictionary of nested lists including
price, market cap, and total volumes, which contain the UNIX timestamp and the price at that
time.
Figure 696
We are only interested in price so that is what we will select using the key price.
Figure 697

To make things simple, we can convert our nested list to a DataFrame,
Figure 698
with the columns timestamp and price. The timestamp column is difficult to read as it is.
Figure 699
We will convert it to a more readable format using the pandas function to_datetime.
Figure 700

Using the to_datetime function, we create readable time data. The input is the timestamp
column, and the unit of time is set to milliseconds.
Figure 701
We assign the output to a new column, date.
Figure 702
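The conversion steps above can be sketched with a few made-up rows. The nested list below only imitates the [timestamp in milliseconds, price] layout described in the video; the values are hypothetical and do not come from CoinGecko.

```python
import pandas as pd

# Hypothetical sample in the layout [UNIX timestamp in ms, price]
prices = [
    [1700000000000, 37000.0],
    [1700003600000, 37100.0],
    [1700086400000, 36500.0],
]

# Nested list -> DataFrame with the columns timestamp and price
data = pd.DataFrame(prices, columns=["timestamp", "price"])

# Convert the millisecond timestamps into readable datetimes
data["date"] = pd.to_datetime(data["timestamp"], unit="ms")

print(data)
print(data["date"].iloc[0])  # 2023-11-14 22:13:20
```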
Now we want to create a candlestick plot.
Figure 703

To get the data for the daily candlesticks we will group by the date to find the minimum,
maximum, first, and last price of each day.
Figure 704
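The grouping step can be sketched as follows. The prices are made up, and the output column names low, high, open, and close are chosen here for readability; the lab may structure its result differently.

```python
import pandas as pd

# Hypothetical hourly prices spanning two calendar days (timestamps in ms)
prices = [
    [1700000000000, 37000.0],  # 2023-11-14 22:13:20
    [1700003600000, 37100.0],  # 2023-11-14 23:13:20
    [1700007200000, 36900.0],  # 2023-11-15 00:13:20
    [1700010800000, 37200.0],  # 2023-11-15 01:13:20
]
data = pd.DataFrame(prices, columns=["timestamp", "price"])
data["date"] = pd.to_datetime(data["timestamp"], unit="ms")

# Group by calendar day, then take the min, max, first, and last price
candlestick_data = data.groupby(data["date"].dt.date).agg(
    low=("price", "min"),
    high=("price", "max"),
    open=("price", "first"),
    close=("price", "last"),
)

print(candlestick_data)
```

Each row of candlestick_data then holds the four values needed to draw one candlestick.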
Finally we will use plotly to create the candlestick chart and plot it.
Figure 705
Now we can view the candlestick chart by opening the HTML file and clicking Trust HTML in the
top left of the tab.
Figure 706

It should look something like this:
Figure 707

HANDS-ON LAB: INTRODUCTION TO API
APPLICATION PROGRAMMING INTERFACE
OBJECTIVES
• Create and Use APIs in Python
INTRODUCTION
An API lets two pieces of software talk to each other. Just like a function, you don’t have to know
how the API works, only its inputs and outputs. An essential type of API is a REST API, which
allows you to access resources via the internet. In this lab, we will review the Pandas Library in
the context of an API. We will also review a basic REST API.
TABLE OF CONTENTS
• Pandas is an API
• REST APIs Basics
• Quiz on Tuples
!pip install pycoingecko

!pip install plotly
!pip install mplfinance
!pip install --upgrade nbformat
Pandas is an API
Pandas is actually a set of software components, much of which is not even written in Python.
import pandas as pd
import numpy as np
import plotly.graph_objects as go
from plotly.offline import plot
import datetime
from pycoingecko import CoinGeckoAPI
from mplfinance.original_flavor import candlestick2_ohlc
You create a dictionary; this is just data.
dict_={'a':[11,21,31],'b':[12,22,32]}
When you create a Pandas object with the DataFrame constructor, in API lingo this is an
"instance". The data in the dictionary is passed along to the pandas API. You then use the
dataframe to communicate with the API.
df=pd.DataFrame(dict_)
type(df)
pandas.core.frame.DataFrame

Figure 708
When you call the method head, the dataframe communicates with the API, displaying the first
few rows of the dataframe.
df.head()
When you call the method mean, the API will calculate the mean and return the value.
df.mean()
a 21.0
b 22.0
dtype: float64
REST APIs
REST APIs function by sending a request, which is communicated via an HTTP message. The
HTTP message usually contains a JSON file with instructions for the operation we
would like the service or resource to perform. In a similar manner, the API returns a response via an
HTTP message; this response is usually contained within a JSON.
In cryptocurrency, a popular method to display the movements of the price of a currency is the
candlestick chart.
Figure 709

Here is a description of the candlesticks.
Figure 710
In this lab, we will be using the CoinGecko API to create one of these candlestick graphs for
Bitcoin. We will use the API to get the price data for 30 days with 24 observations per day, 1 per
hour. We will find the max, min, open, and close price per day, meaning we will have 30
candlesticks, and use that to generate the candlestick graph. Although we are using the
CoinGecko API, we will use a Python client/wrapper for the API called PyCoinGecko.
PyCoinGecko will make performing the requests easy and it will deal with the endpoint targeting.
Let's start off by getting the data we need, using the function get_coin_market_chart_by_id(id,
vs_currency, days). id is the name of the coin you want, vs_currency is the currency you want
the price in, and days is how many days back from today you want.
cg = CoinGeckoAPI()
bitcoin_data = cg.get_coin_market_chart_by_id(id='bitcoin', vs_currency='usd', days=30)
type(bitcoin_data)
dict
The response we get is in the form of a JSON which includes the price, market caps, and total
volumes along with timestamps for each observation. We are focused on the prices so we will
select that data.
bitcoin_price_data = bitcoin_data['prices']
bitcoin_price_data[0:5]
[[1714201454889, 62978.37519374921],
[1714205039272, 62927.54516807768],
[1714208491993, 62974.580289400634],
[1714212175314, 62931.11352813524],
[1714215937798, 62751.432064903514]]

Finally let’s turn this data into a Pandas DataFrame.
data = pd.DataFrame(bitcoin_price_data, columns=['TimeStamp', 'Price'])
Now that we have the DataFrame, we will convert the timestamp to a date and save it as a
column called date. We will apply a lambda function to each timestamp that divides it by 1000
(milliseconds to seconds) and converts it to a readable date.
data['date'] = data['TimeStamp'].apply(lambda d: datetime.date.fromtimestamp(d/1000.0))
Using this modified dataset we can now group by the Date and find the min, max, open, and
close for the candlesticks.
candlestick_data = data.groupby(data.date, as_index=False).agg({"Price": ['min', 'max', 'first', 'last']})
Finally we are now ready to use plotly to create our Candlestick Chart.
fig = go.Figure(data=[go.Candlestick(x=candlestick_data['date'],
open=candlestick_data['Price']['first'],
high=candlestick_data['Price']['max'],
low=candlestick_data['Price']['min'],
close=candlestick_data['Price']['last'])
])
fig.update_layout(xaxis_rangeslider_visible=False)
fig.show()
Figure 711

PRACTICE QUIZ: SIMPLE APIS
Question 1
What does API stand for?
o Application Process Interface
o Automatic Program Interaction
o Application Programming Interface
o Application Programming Interaction
Correct: API stands for Application Programming Interface.
Question 2
Which data format is commonly found in the HTTP message for API requests?
o YAML
o HTML
o JSON
o XML
Correct: JSON is the most common data format found in HTTP messages for API requests.
Question 3
What is the primary purpose of an API?
o To provide security to web applications
o To handle server-side database operations
o To connect and enable communication between software applications
o To design user interfaces for mobile applications
Correct: The primary role of an API is to establish a connection and enable communication between
different parts of a software application.

VIDEO 022: REST APIS & HTTP REQUESTS - PART 1 (4:11)
In this video, we will discuss the HTTP protocol.
Figure 712
OUTLINE
Specifically, we will discuss:
• Uniform Resource Locator: URL
• Request
• Response
Figure 713
We touched on REST APIs in the last section.
Figure 714

The HTTP protocol can be thought of as a general protocol for transferring information through
the web. This includes many types of REST APIs. Recall that REST APIs function by sending a
request, and the request is communicated via an HTTP message. The HTTP message usually
contains a JSON file.
Figure 715
When you, the client, use a web page, your browser sends an HTTP request to the server where
the page is hosted. The server tries to find the desired resource, by default "index.html".
Figure 716
If your request is successful, the server will send the object to the client in an HTTP response;
this includes information like the type of the resource, the length of the resource, and other
information. The table under the Web server represents a list of resources stored in the web
server. In this case, an HTML file, png image, and txt file.
Figure 717

When the request is made for the information, the web server sends the requested
information, i.e., one of the files.
Figure 718
UNIFORM RESOURCE LOCATOR: URL
Figure 719
Uniform resource locator (URL) is the most popular way to find resources on the web. We can
break the URL into three parts:
1. the scheme: this is the protocol, for this lab it will always be http://
2. Internet address or Base URL: this will be used to find the location;
some examples include www.ibm.com and www.gitlab.com
3. route: this is the location on the web server;
for example: /images/IDSNlogo.png
Figure 720
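The three parts listed above can be pulled out of a URL with the standard library's urlsplit function. The URL below is only an illustration assembled from the example parts in the list; it is not guaranteed to resolve.

```python
from urllib.parse import urlsplit

# Illustrative URL assembled from the parts named above.
url = "http://www.ibm.com/images/IDSNlogo.png"

parts = urlsplit(url)
print("scheme:", parts.scheme)    # the protocol
print("base URL:", parts.netloc)  # the Internet address
print("route:", parts.path)       # the location on the web server
```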

REQUEST AND RESPONSE
Let’s review the request and response process.
Figure 721
Request Message
The following is an example of the request message for the get request method. There are other
HTTP methods we can use.
Figure 722
In the start line we have the GET method. This is an HTTP method. In this case, it’s requesting
the file index.html
Figure 723

The Request header passes additional information with an HTTP request.
Figure 724
In the GET method the request body is empty. Some requests have a body; we will have an
example of a request body later.
Figure 725
Response Message
The following table represents the response.
Figure 726

The response start line contains the version number, in this case HTTP/1.0, followed by a status
code (200) meaning success and the descriptive phrase OK. We have more on status codes later.
Figure 727
The response header contains useful information, such as the type and length of the resource.
Figure 728
Finally, we have the response body containing the requested file, in this case
an HTML document.
Figure 729
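The three parts of the response message can be separated with a few string operations. This is a minimal sketch on a hand-written response; the header values and the HTML body are made up for illustration, and a real client library would do this parsing for you.

```python
# Hand-written response text modelled on the example above; the header values
# and the HTML body are invented for illustration.
raw_response = (
    "HTTP/1.0 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "Connection: close\r\n"
    "\r\n"
    "<!DOCTYPE html><html><body>Hi</body></html>"
)

# A blank line separates the head (start line + headers) from the body.
head, body = raw_response.split("\r\n\r\n", 1)
start_line, *header_lines = head.split("\r\n")

# Start line: version, status code, descriptive phrase.
version, status_code, phrase = start_line.split(" ", 2)

# Each header line has the form "Name: value".
headers = dict(line.split(": ", 1) for line in header_lines)

print(version, int(status_code), phrase)
print(headers)
print(body)
```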

Status Code
Let’s look at other status codes. Some status code examples are shown in the table below. The
prefix indicates the class; for example, the 100s are informational responses; 100 indicates that
everything is OK so far. The 200s are successful responses; for example, 200 means the request has
succeeded. Anything in the 400s is bad news; 401 means the request is unauthorized. The 500s
stand for server errors, like 501 for Not Implemented.
Figure 730
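Python's standard library ships these codes in the http.HTTPStatus enumeration, so the class and descriptive phrase of the codes mentioned above can be looked up directly. A short sketch:

```python
from http import HTTPStatus

phrases = {}
for code in (100, 200, 401, 501):
    status = HTTPStatus(code)
    phrases[code] = status.phrase
    # Integer division by 100 gives the leading digit, i.e. the class.
    print(f"{code} belongs to the {code // 100}00s: {status.phrase}")
```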
When an HTTP request is made, an HTTP method is sent. This tells the server what action to
perform. A list of several HTTP methods is shown here.
Figure 731
In the next video, we will use Python to apply the GET method, which retrieves data from the
server, and the POST method, which sends data to the server.

VIDEO 023: REST APIS & HTTP REQUESTS - PART 2 (4:56)
In this video, we will discuss the HTTP protocol using the Requests library, a popular method
for dealing with the HTTP protocol in Python.
Figure 732
OUTLINE
In this video, we will review the Python library requests for working with the HTTP protocol. We
will provide an overview of GET requests and POST requests.
Figure 733
REQUESTS MODULE IN PYTHON

Let’s review the Request Module in Python.
This is one of several libraries, including httplib and urllib, that can work with the HTTP
protocol.
Figure 734

Requests is a python Library that allows you to send HTTP/1.1 requests easily. We can
import the library as follows:
i. You can make a GET request via the method get to www.ibm.com.
ii. We have the response object ’r’, this has information about the request, like the status
of the request.
iii. We can view the status code using the attribute status_code, which is 200 for OK.
iv. You can view the request headers:
Figure 735
You can view the request body in the following line. As there is no body for a GET request,
we get a None.
You can view the HTTP response header using the attribute headers. This returns
a python dictionary of HTTP response headers. We can look at the dictionary values.
Figure 736

We can obtain the date the request was sent by using the key Date. The key Content-Type
indicates the type of data. Using the response object ‘r’, we can also check the encoding.
As the Content-Type is text/html, we can use the attribute text to display the HTML in the body.
We can review the first 100 characters. You can also download other content; see the lab for
more.
Figure 737
GET REQUEST WITH URL PARAMETERS

You can use the GET method to modify the results of your query, for example when retrieving data
from an API. In the lab we will use httpbin.org, a simple HTTP Request & Response Service.
Figure 738
GET REQUEST
We send a GET request to the server. Like before, we have the Base URL; in the Route, we
append /get. This indicates we would like to perform a GET request. This is demonstrated
in the following table:
Figure 739

QUERY STRING
After GET is requested we have the query string. This is a part of a uniform resource locator
(URL) and this sends other information to the web server.
Figure 740
The start of the query is a ?, followed by a series of parameter and value pairs, as shown in the
table below. The first parameter name is "name" and the value is "Joseph". The second
parameter name is "ID" and the value is "123". Within each pair, the parameter and value are
separated by an equals sign, "=". The pairs are separated from each other by the ampersand, "&".
Figure 741
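The rules above can be checked with the standard library's urllib.parse module, which builds and parses query strings without sending anything over the network. A minimal sketch, using the httpbin.org /get route and the name and ID pairs from the text:

```python
from urllib.parse import parse_qs, urlencode, urlsplit

base_url = "http://httpbin.org/get"
payload = {"name": "Joseph", "ID": "123"}

query_string = urlencode(payload)         # joins the pairs with "=" and "&"
full_url = f"{base_url}?{query_string}"   # "?" marks the start of the query
print(full_url)

# Going the other way: recover the pairs from the URL.
recovered = parse_qs(urlsplit(full_url).query)
print(recovered)
```

Note that parse_qs returns each value as a list, because a parameter name may appear more than once in a query string.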

CREATE QUERY STRING
Let’s complete an example in python. We have the Base URL with GET appended to the end.
To create a Query string, we use the dictionary payload. The keys are the parameter names, and
the values are the value of the Query string. Then, we pass the dictionary payload to
the params parameter of the get() function. We can print out the URL and see the name and
values. We can see the request body. As the info is sent in the URL, the body has a value of
None. We can print out the status code.
Figure 742
CONTENT-TYPE
We can view the response as text. We can look at the key 'Content-Type' to see the content
type.
Figure 743
As the Content-Type is JSON, we can parse the response using the method json(). It returns a
Python dict. The key 'args' has the names and values for the query string.
Figure 744

POST REQUESTS
Like a GET request, a POST request is used to send data to a server, but the POST request
sends the data in a request body, not the URL.
Figure 745
POST
To send the POST request, we change the route in the URL to /post. This endpoint will
expect data, and it is a convenient way to configure an HTTP request to send data to a server.
We have the payload dictionary. To make a POST request, we use the post() function. The
variable payload is passed to the parameter data:
Figure 746

COMPARE POST AND GET
Comparing the URL using the attribute url from the response object of
the GET and POST request, we see the POST request has no name or value pairs in its URL. We
can compare the POST and GET request body. We see only the POST request has a body:
Figure 747
We can view the key form to get the payload.
Figure 748
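The difference between the two methods can also be seen without sending anything, by building the two request objects with the standard library's urllib.request module. This is a stand-in for the Requests library used in the video, chosen here only because it needs no installation and no network access; the URLs are the httpbin.org routes named above.

```python
from urllib.parse import urlencode
from urllib.request import Request

payload = {"name": "Joseph", "ID": "123"}

# GET: the payload travels in the URL, so the request has no body.
get_request = Request("http://httpbin.org/get?" + urlencode(payload), method="GET")

# POST: the URL stays clean and the encoded payload becomes the body (bytes).
post_request = Request(
    "http://httpbin.org/post",
    data=urlencode(payload).encode("utf-8"),
    method="POST",
)

# Nothing is sent here; we only inspect the objects.
for request in (get_request, post_request):
    print(request.get_method(), request.full_url, request.data)
```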

HANDS-ON LAB: ACCESS REST APIS & REQUEST HTTP
HTTP AND REQUESTS
OBJECTIVES
• Understand HTTP
• Handle HTTP Requests
TABLE OF CONTENTS
• Overview of HTTP
o Uniform Resource Locator:URL
o Request
o Response
• Requests in Python
o Get Request with URL Parameters
o Post Requests
OVERVIEW OF HTTP
When you, the client, use a web page, your browser sends an HTTP request to the server where
the page is hosted. The server tries to find the desired resource, by default "index.html". If
your request is successful, the server will send the object to the client in an HTTP response. This
includes information like the type of the resource, the length of the resource, and other
information.
The figure below represents the process. The circle on the left represents the client, the circle
on the right represents the Web server. The table under the Web server represents a list of
resources stored in the web server. In this case, an HTML file, png image, and txt file.
The HTTP protocol allows you to send and receive information through the web including
webpages, images, and other web resources. In this lab, we will provide an overview of the
Requests library for interacting with the HTTP protocol.
Figure 749

UNIFORM RESOURCE LOCATOR: URL
Uniform resource locator (URL) is the most popular way to find resources on the web. We can
break the URL into three parts.
• scheme: this is the protocol; for this lab it will always be http://
• Internet address or Base URL: this will be used to find the location; some examples are
www.ibm.com and www.gitlab.com
• route: the location on the web server, for example /images/IDSNlogo.png
You may also hear the term Uniform Resource Identifier (URI); URLs are actually a subset of URIs.
Another popular term is endpoint; this is the URL of an operation provided by a Web server.
REQUEST
The process can be broken into the request and response process. The request using the GET
method is partially illustrated below. In the start line we have the GET method, which is an HTTP
method, followed by the location of the resource, /index.html, and the HTTP version. The Request
header passes additional information with an HTTP request:
Figure 750
When an HTTP request is made, an HTTP method is sent; this tells the server what action to
perform. A list of several HTTP methods is shown below. We will go over more examples later.
RESPONSE
The figure below represents the response; the response start line contains the version number
HTTP/1.0, a status code (200) meaning success, followed by a descriptive phrase (OK). The
response header contains useful information. Finally, we have the response body containing the
requested file, an HTML document. It should be noted that some requests have headers.
Figure 751

Some status code examples are shown in the table below, the prefix indicates the class. These
are shown in yellow, with actual status codes shown in white. Check out the following link for
more descriptions.
Figure 752
REQUESTS IN PYTHON
Requests is a Python Library that allows you to send HTTP/1.1 requests easily. We can import
the library as follows:
import requests
We will also use the following libraries:
import os
from PIL import Image
from IPython.display import IFrame
You can make a GET request via the method get to www.ibm.com:
url='https://www.ibm.com/'
r=requests.get(url)
We have the response object r, this has information about the request, like the status of the
request. We can view the status code using the attribute status_code.
r.status_code
200

You can view the request headers:
print(r.request.headers)
{'User-Agent': 'python-requests/2.29.0', 'Accept-Encoding': 'gzip, deflate, br', 'Accept':
'*/*', 'Connection': 'keep-alive', 'Cookie': '_abck=3F3297824B31FE9752AE4F59D907CDE2~-
1~YAAQZWQwF4UMapmPAQAA1LUuwQtGjdW5OHRIgzySUIVU5o1cNb9XPey1sWycyY6JIpR5zkdzXVjYBH+eCWWleLANtzmULIl6+F
+UTPM4LN2wTZM57jdxUV/nzq4N7LIfCKwRgV6+6BYprYYxKndA5k+eRW+Qiv188fnPhAhci+Lpd8KvrKKXijFpc2aUJx3FpNEkoi
F9XpSLLtvrbLcO3F6ftZ3ur/qtoZWmxFsBgbAQfcqinlDQFmRsHoBpiYiBngkvEa2PwdWb7cJ+JphIAxSRdzoHy30GbArmEO7NMp
CkJDG5MlUgOlHz1gX2m/rhFXnAYCdqHypIKN/lg3BDlLW8CnY0M5mcSlS4OwUgMU/Uc+I2Fho=~-1~-1~-1;
bm_sz=EC27E766150F74485F4FC747DE6F2A29~YAAQZWQwF4YMapmPAQAA1LUuwRf/SvMqO/SXdzTNa9tePvsgiZhlQb5pyOkAu
8t+xDwesI98TKB6lV/UOwE4rIcFJpVT4AWrhMWbHthClECG1OmeQQMz9zAatmu/+z/6JPl1lKivCpOQe0SFCH1KKBoSa+PhGlWKp
MRdk58tT8lDBO4UnSUJ36QpE4r8X2Y47IIwEHPOPzEi5EcLqJmgVsEpx3F4Qxx4Jz6k+6ygqPXEaBedmirI+LSkKnJgsfbOWmel0
LadI2dEtY1Alsm3Gk+vQAXhD6WPVbEs9BdJ038nP25eMPi9E1cof6Hy7WXoP4l3XQPlwlQsy0kdZPFFzvlgRvd0q9SkB6pD~3684
406~3555641'}
You can view the request body in the following line. As there is no body for a GET request, we get
None:
print("request body:", r.request.body)

request body: None
You can view the HTTP response header using the attribute headers. This returns a python
dictionary of HTTP response headers.
header=r.headers
print(r.headers)
{'Content-Security-Policy': 'upgrade-insecure-requests', 'x-frame-options': 'SAMEORIGIN', 'Last-
Modified': 'Tue, 28 May 2024 21:30:04 GMT', 'ETag': '"1aec0-6198a564b16b3-gzip"', 'Accept-Ranges':
'bytes', 'Content-Type': 'text/html;charset=utf-8', 'X-Content-Type-Options': 'nosniff', 'Cache-
Control': 'max-age=20', 'Expires': 'Tue, 28 May 2024 21:50:35 GMT', 'X-Akamai-Transformed': '9 13656
0 pmb=mTOE,2', 'Content-Encoding': 'gzip', 'Date': 'Tue, 28 May 2024 21:50:15 GMT', 'Content-
Length': '13851', 'Connection': 'keep-alive', 'Vary': 'Accept-Encoding', 'Strict-Transport-
Security': 'max-age=31536000'}
We can obtain the date the request was sent using the key Date (header keys are case-insensitive,
so 'date' works as well):
header['date']
'Tue, 28 May 2024 21:50:15 GMT'
Content-Type indicates the type of data:
header['Content-Type']
'text/html;charset=utf-8'
You can also check the encoding:
r.encoding
'utf-8'
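Header values arrive as plain strings. If you need them as structured values, the standard library can parse them; the sketch below uses the Date and Content-Type values copied from the response above. Borrowing the Message class from the email package for header parsing is a convenience chosen for this example, not something the lab itself does.

```python
from email.message import Message
from email.utils import parsedate_to_datetime

# Header values copied from the response shown above.
date_header = "Tue, 28 May 2024 21:50:15 GMT"
content_type_header = "text/html;charset=utf-8"

sent_at = parsedate_to_datetime(date_header)   # string -> datetime object

# Message is borrowed from the email package only for its header parsing.
message = Message()
message["Content-Type"] = content_type_header
media_type = message.get_content_type()
charset = message.get_content_charset()

print(sent_at.isoformat(), media_type, charset)
```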
As the Content-Type is text/html we can use the attribute text to display the HTML in the
body. We can review the first 100 characters:
r.text[0:100]
'\n<!DOCTYPE HTML>\n<html lang="en-us">\n<head>\r\n \r\n \r\n \r\n \r\n \r\n
\r\n <meta charset="'

You can load other types of data for non-text requests, like images. Consider the URL of the
following image:
# Use single quotation marks for defining string
url='https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-PY0101EN-SkillsNetwork/IDSNlogo.png'
We can make a get request:
r=requests.get(url)
We can look at the response header:
print(r.headers)
{'Date': 'Wed, 29 May 2024 03:07:54 GMT', 'X-Clv-Request-Id': 'c4ac306f-4bf8-4eb1-8a52-
d0c85bd4a19a', 'Server': 'Cleversafe', 'X-Clv-S3-Version': '2.5', 'Accept-Ranges': 'bytes', 'x-amz-
request-id': 'c4ac306f-4bf8-4eb1-8a52-d0c85bd4a19a', 'ETag': '"8bb44578fff8fdcc3d2972be9ece0164"',
'Content-Type': 'image/png', 'Last-Modified': 'Wed, 16 Nov 2022 03:32:41 GMT', 'Content-Length':
'78776'}
We can see the 'Content-Type'
r.headers['Content-Type']
'image/png'
For an image, the response object contains the image as a bytes-like object. As a result, we must
save it using a file object. First, we specify the file path and name:
path=os.path.join(os.getcwd(),'image.png')
path
'/resources/labs/Module 5/image.png'
We save the file. To access the body of the response we use the attribute content, then
save it using the open function and write method:
with open(path,'wb') as f:
    f.write(r.content)
We can view the image:
Image.open(path)
Figure 753

Question 1: write wget
In the previous section, we used the wget function to retrieve content from the web server as
shown below. Write the python code to perform the same task. The code should be the same
as the one used to download the image, but the file name should be 'Example1.txt'.
!wget -O /resources/data/Example1.txt https://cf-courses-data.s3.us.cloud-object-
SkillsNetwork/labs/Module%205/data/Example1.txt
url='https://cf-courses-data.s3.us.cloud-object-
SkillsNetwork/labs/Module%205/data/Example1.txt'
path=os.path.join(os.getcwd(),'Example1.txt')
r=requests.get(url)
with open(path,'wb') as f:
    f.write(r.content)
Get Request with URL Parameters
You can use the GET method to modify the results of your query, for example retrieving data
from an API. We send a GET request to the server. Like before we have the Base URL, in the
Route we append /get, this indicates we would like to perform a GET request. This is
demonstrated in the following table:
Figure 754
The Base URL is http://httpbin.org/, a simple HTTP Request & Response Service.
The URL in Python is given by:
url_get='http://httpbin.org/get'
A query string is a part of a uniform resource locator (URL); it sends other information to the
web server. The start of the query is a ?, followed by a series of parameter and value pairs, as
shown in the table below. The first parameter name is name and the value is Joseph. The second
parameter name is ID and the value is 123. Within each pair, the parameter and value are separated
by an equals sign, =. The pairs are separated from each other by the ampersand, &.
Figure 755

To create a query string, create a dictionary. The keys are the parameter names and the values are
the values of the query string.
payload={"name":"Joseph","ID":"123"}
Then pass the dictionary payload to the params parameter of the get() function:
r=requests.get(url_get,params=payload)
We can print out the URL and see the name and values
r.url
r.json()['args']
Post Requests
Like a GET request, a POST request is used to send data to a server, but the POST request sends the
data in a request body. To send the POST request in Python, we change the route in the URL
to /post:
url_post='http://httpbin.org/post'
This endpoint will expect data as a file or as a form. A form is a convenient way to configure an
HTTP request to send data to a server.
To make a POST request we use the post() function, the variable payload is passed to the
parameter data :
r_post=requests.post(url_post,data=payload)
Comparing the URL from the response object of the GET and POST request we see
the POST request has no name or value pairs.
print("POST request URL:",r_post.url )

print("GET request URL:",r.url)
POST request URL: http://httpbin.org/post
GET request URL: http://httpbin.org/get?name=Joseph&ID=123
We can compare the POST and GET request body, we see only the POST request has a body:
print("POST request body:",r_post.request.body)

print("GET request body:",r.request.body)
POST request body: name=Joseph&ID=123
GET request body: None
We can view the form as well:
r_post.json()['form']
{'ID': '123', 'name': 'Joseph'}
There is a lot more you can do. Check out Requests for more.

VIDEO 024 (OPTIONAL): HTML FOR WEB SCRAPING (5:00)
In this video we will review Hypertext Markup Language or HTML for Webscraping.
Figure 756
Lots of useful data is available on web pages, such as real estate prices and solutions to coding
questions. The website Wikipedia is a repository of the world's information.
Figure 757
OUTLINE
If you have an understanding of HTML, you can use Python to extract this information. In this
video, you will:
• review the HTML of a basic web page;
• understand the Composition of an HTML Tag;
• understand HTML Trees;
• and understand HTML Tables.
Figure 758

HTML Tags
Let’s say you were asked to find the name and salary of players in a National Basketball League
from the following web page.
Figure 759
The web page is comprised of HTML. It consists of text surrounded by a series of blue text
elements enclosed in angle brackets called tags.
Figure 760
HTML Composition
The tags tell the browser how to display the content. The data we require is in this text.
Figure 761

The first portion contains "DOCTYPE html", which declares this document is an HTML
document. The <html> element is the root element of an HTML page, and the <head> element contains
meta information about the HTML page.
Figure 762
Next, we have the body, this is what's displayed on the web page. This is usually the data we
are interested in.
Figure 763
HTML Elements
We see the elements with an "h3"; this means a level 3 heading, which makes the text larger and
bold. These tags have the names of the players; notice the data is enclosed in the elements. Each
one starts with an h3 in angle brackets and ends with a slash h3 in angle brackets.
Figure 764

HTML Paragraphs Tags
There is also a different tag, "p", which means paragraph. Each p tag contains a player's salary.
Figure 765
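To see how the tags enclose the data, here is a minimal sketch that pulls the text out of the h3 and p elements using the standard library's html.parser module. The course itself uses BeautifulSoup for this in the next video; the page below is invented, with placeholder names and salaries in the same shape as the page described.

```python
from html.parser import HTMLParser

# Invented page in the same shape as the one described: each <h3> holds a
# player name and the <p> that follows holds that player's salary.
page = (
    "<!DOCTYPE html><html><head><title>Page Title</title></head><body>"
    "<h3>Player One</h3><p>Salary: 100</p>"
    "<h3>Player Two</h3><p>Salary: 200</p>"
    "</body></html>"
)


class PlayerParser(HTMLParser):
    """Collect the text found inside <h3> and <p> elements."""

    def __init__(self):
        super().__init__()
        self.current_tag = None
        self.names = []
        self.salaries = []

    def handle_starttag(self, tag, attrs):
        self.current_tag = tag

    def handle_endtag(self, tag):
        self.current_tag = None

    def handle_data(self, data):
        if self.current_tag == "h3":
            self.names.append(data.strip())
        elif self.current_tag == "p":
            self.salaries.append(data.strip())


parser = PlayerParser()
parser.feed(page)
print(list(zip(parser.names, parser.salaries)))
```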
Composition of an HTML tag

Let’s take a closer look at the composition of an HTML tag.
Figure 766
HTML Anchor tag

Here is an example of an HTML Anchor tag. It will display IBM and when you click it, it will send
you to IBM.com.
Figure 767

Hyperlink Tag
We have the tag name, in this case “a”. This tag defines a hyperlink, which is used to link from
one page to another. It’s helpful to think of each tag name as a class in Python and each
individual tag as an instance.
Figure 768
Opening and End Tags

We have the opening or start tag and we have the end tag. This has the tag name preceded by
a slash.
Figure 769
Hyperlink Content
These tags contain the content, in this case what’s displayed on the web page.
Figure 770

Attributes
We have the attribute; this is composed of the Attribute Name and Attribute Value. In this case,
it is the URL of the destination web page.
Figure 771
Figure 772
Figure 773

Inspect HTML
Real web pages are more complex. Depending on your browser, you can select the HTML
element, then click Inspect.
Figure 774
The result will give you the ability to inspect the HTML. There are also other types of content, such
as CSS and JavaScript, that we will not go over in this course.
Figure 775
The actual element is shown here.
Figure 776

HTML Trees
Each HTML document can actually be referred to as a document tree.
Figure 777
Document Tree
Let's go over a simple example. Tags may contain strings and other tags. These elements are
the tag’s children. We can represent this as a family tree. Each nested tag is a level in the tree.
Figure 778
The HTML tag contains the head and body tags. The head and body tags are the descendants
of the HTML tag. In particular, they are the children of the HTML tag. The HTML tag is their parent.
Figure 779

The head and body tag are siblings as they are on the same level.
Figure 780
Title tag is the child of the head tag and its parent is the head tag.
Figure 781
The title tag is a descendant of the HTML tag but not its child.
Figure 782

The heading and paragraph tags are the children of the body tag; and as they are all children of
the body tag they are siblings of each other.
Figure 783
The bold tag is a child of the heading tag, the content of the tag is also part of the tree but this
can get unwieldy to draw.
Figure 784
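The parent, child, and sibling relationships described above can be made visible by tracking which tags are open while the document is read. The sketch below does this with the standard library's html.parser module on a small invented document of the same shape; the class name and text content are placeholders.

```python
from html.parser import HTMLParser

# Small invented document with the same shape as the tree described above.
document = (
    "<html><head><title>Page Title</title></head>"
    "<body><h3><b>Name</b></h3><p>Salary</p></body></html>"
)


class TreeRecorder(HTMLParser):
    """Record each tag together with its depth and its parent tag."""

    def __init__(self):
        super().__init__()
        self.stack = []   # tags that are currently open
        self.nodes = []   # (depth, tag, parent) in document order

    def handle_starttag(self, tag, attrs):
        parent = self.stack[-1] if self.stack else None
        self.nodes.append((len(self.stack), tag, parent))
        self.stack.append(tag)

    def handle_endtag(self, tag):
        self.stack.pop()


tree = TreeRecorder()
tree.feed(document)
for depth, tag, parent in tree.nodes:
    print("  " * depth + tag, "(parent: %s)" % parent)
```

Tags printed at the same indentation under the same parent are siblings, for example head and body, or h3 and p.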
HTML Tables
Next, let’s review HTML tables.
Figure 785

To define an HTML table we have the table tag.
Figure 786
Each table row is defined with a <tr> tag; you can also use a table header tag for the first row.
Figure 787
Each table row contains a set of <td> tags, each of which defines a table cell.
Figure 788

For the first row first cell we have:
Figure 789
for the first row second cell we have:
Figure 790
and so on.
Figure 791

For the second row we have:
Figure 792
and for the second row first cell we have:
Figure 793
for the second row second cell we have:
Figure 794

and so on.
Figure 795
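The row and cell structure walked through above maps naturally onto a list of lists. Here is a minimal sketch with the standard library's html.parser module, assuming a simple table with no nested tables; the table content is placeholder text.

```python
from html.parser import HTMLParser

# Invented two-by-two table; the cell text is placeholder content.
table_html = (
    "<table>"
    "<tr><td>row 1, cell 1</td><td>row 1, cell 2</td></tr>"
    "<tr><td>row 2, cell 1</td><td>row 2, cell 2</td></tr>"
    "</table>"
)


class TableParser(HTMLParser):
    """Turn <tr>/<td> markup into a list of rows, each a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self.in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])       # a new row starts
        elif tag in ("td", "th"):
            self.rows[-1].append("")   # a new, still empty, cell
            self.in_cell = True

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self.in_cell = False

    def handle_data(self, data):
        if self.in_cell:
            self.rows[-1][-1] += data


table = TableParser()
table.feed(table_html)
print(table.rows)
```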
We now have some basic knowledge of HTML. Now let's try and extract some data from a web
page.

VIDEO 025: WEB SCRAPING (4:59)
In this video we will cover Web Scraping.
Figure 796
WHAT YOU WILL LEARN

After watching this video you will be able to:
• define web scraping;
• understand the role of BeautifulSoup Objects;
• apply the find_all method;
• and web scrape a website.
Figure 797

INTRODUCTION
What would you do if you wanted to analyze hundreds of points of data to find the best players
of a sports team?
Figure 798
Would you start manually copying and pasting information from different websites into a
spreadsheet?
Figure 799
Spending hours trying to find the right data, and eventually giving up because the task was too
overwhelming? That’s where web scraping can help.
Figure 800

WHAT IS WEB SCRAPING?
Web scraping is a process that can be used to automatically extract information from a website
and can easily be accomplished within a matter of minutes and not hours.
Figure 801
To get started, we just need a little Python code and the help of two modules named Requests
and BeautifulSoup.
Figure 802
WEB SCRAPING EXAMPLE

Let’s say you were asked to find the name and salary of players in a National Basketball
League, from the following webpage.
Figure 803

BEAUTIFULSOUP
First, we import BeautifulSoup.
Figure 804
We can store the webpage HTML as a string in the variable HTML.
Figure 805
To parse a document, pass it into the Beautiful Soup constructor.
Figure 806

We get the BeautifulSoup object, soup, which represents the document as a nested data
structure. Beautiful Soup represents HTML as a set of Tree like objects with methods used to
parse the HTML. We will review the Beautiful Soup object using the Beautiful Soup object, soup,
we created.
Figure 807
TAG OBJECT
The tag object corresponds to an HTML tag in the original document. For example, the tag
“title.”
Figure 808
Consider the tag h3. If there is more than one tag with the same name, the first element with
that tag is selected. In this case with Lebron James, we see the name is enclosed in the bold
tag "b". To extract it, use the tree representation.
Figure 809

HTML TREE
Let’s use the Tree representation. The variable tag-object is located here.
Figure 810
We can access the child of the tag or navigate down the branch as follows:
Figure 811
PARENT ATTRIBUTE
You can navigate up the tree by using the parent attribute. The variable tag child is located here.
We can access the parent.
Figure 812

This is the original tag object. We can find the sibling of “tag object.”
Figure 813
NEXT-SIBLING ATTRIBUTE
We simply use the next_sibling attribute. We can also find the sibling of sibling one, again using
the next_sibling attribute.
Figure 814
NAVIGABLE STRING
Consider the tag child object.
Figure 815

You can access the attribute name and value as a key value pair in a dictionary as follows.
Figure 816
You can return the content as a Navigable string, this is like a Python string that supports
Beautiful Soup functionality.
Figure 817
FIND_ALL()
Let's review the method find_all. This is a filter, you can use filters to filter based on a tag’s name,
its attributes, the text of a string, or on some combination of these.
Figure 818

Consider the list of pizza places.
Figure 819
Like before, create a BeautifulSoup object. But this time, name it table.
Figure 820
PYTHON ITERABLE
The find_all() method looks through a tag’s descendants and retrieves all descendants that
match your filters. Apply it to the table with the tag tr.
Figure 821

The result is a Python iterable just like a list,
Figure 822
each element is a tag object for tr.
Figure 823
This corresponds to each row in the list, including the table header.
Figure 824

Figure 825
Figure 826
TAG OBJECT
Each element is a tag object. Consider the first row.
Figure 827

For example, we can extract the first table cell.
Figure 828
VARIABLE ROW
We can also iterate through each table cell.
Figure 829
First, we iterate through the list “table rows,” via the variable row.
Figure 830

ELEMENTS
Each element corresponds to a row in the table.
Figure 831
We can apply the method find_all to find all the table cells,
Figure 832
then we can iterate through the variable cells for each row.
Figure 833

For each iteration, the variable cell corresponds to an element in the table for that particular row.
Figure 834
We continue to iterate through each element and repeat the process for each row.
Figure 835
Figure 836

Figure 837
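The row-by-row, cell-by-cell iteration walked through above might look like the following sketch. The table is a shortened, hand-typed version of the pizza table, and "html.parser" is used instead of html5lib; both are assumptions for illustration.

```python
from bs4 import BeautifulSoup

# Shortened pizza table (illustrative data)
table_html = ("<table><tr><td>Pizza Place</td><td>Orders</td><td>Slices</td></tr>"
              "<tr><td>Domino's Pizza</td><td>10</td><td>100</td></tr>"
              "<tr><td>Little Caesars</td><td>12</td><td>144</td></tr></table>")
table = BeautifulSoup(table_html, "html.parser")

table_rows = table.find_all("tr")   # one Tag object per row, header included
parsed = []
for row in table_rows:
    cells = row.find_all("td")      # all table cells of the current row
    parsed.append([cell.string for cell in cells])

for values in parsed:
    print(values)
```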
A WEB PAGE EXAMPLE

Let’s see how to apply Beautiful Soup to a webpage.
Figure 838
To scrape a webpage, we also need the Requests library.
Figure 839

The first step is to import the modules that are needed.
Figure 840
Use the get method from the requests library to download the webpage. The input is the URL.
Use the text attribute to get the text and assign it to the variable page.
Figure 841
Then, create a BeautifulSoup object, soup, from the variable page. It will allow you to parse
the HTML page. You can now scrape the page.
Figure 842
Check out the labs for more.
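The steps in this video can be sketched as two small functions. The function names fetch_page and extract_links are hypothetical helpers introduced here, and the sample HTML stands in for a downloaded page so the example runs without network access.

```python
import requests
from bs4 import BeautifulSoup

def fetch_page(url):
    """Download a page and return its HTML text (hypothetical helper)."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # stop early on HTTP errors such as 404
    return response.text

def extract_links(page):
    """Return the href value of every <a> tag that has one (hypothetical helper)."""
    soup = BeautifulSoup(page, "html.parser")
    return [a.get("href") for a in soup.find_all("a", href=True)]

# Stand-in for a downloaded page
sample_page = ("<html><body><a href='/one'>One</a><a>no link</a>"
               "<a href='/two'>Two</a></body></html>")
links = extract_links(sample_page)
print(links)

# With network access, the same function works on a live page:
# links = extract_links(fetch_page(url))  # url is the page you want to scrape
```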

HANDS-ON LAB: WEB SCRAPING
WEB SCRAPING LAB
OBJECTIVES
After completing this lab you will be:
• Familiar with the basics of the BeautifulSoup Python library
• Able to scrape webpages for data and filter the data
TABLE OF CONTENTS
• Beautiful Soup Object
o Tag
o Children, Parents, and Siblings
o HTML Attributes
o Navigable String
• Filter
o find_all
o find
o HTML Attributes
o Navigable String
• Downloading And Scraping The Contents Of A Web Page
For this lab, we are going to be using Python and several Python libraries. Some of these libraries
might be installed in your lab environment or in SN Labs. Others may need to be installed by
you. The cells below will install these libraries when executed.
!pip install bs4

#!pip install requests
Collecting bs4
Downloading bs4-0.0.2-py2.py3-none-any.whl (1.2 kB)
Requirement already satisfied: beautifulsoup4 in
/home/jupyterlab/conda/envs/python/lib/python3.7/site-packages (from bs4) (4.11.1)
Requirement already satisfied: soupsieve>1.2 in
/home/jupyterlab/conda/envs/python/lib/python3.7/site-packages (from beautifulsoup4->bs4)
(2.3.2.post1)
Installing collected packages: bs4
Successfully installed bs4-0.0.2
Import the required modules and functions
from bs4 import BeautifulSoup # this module helps in web scraping.

import requests # this module helps us to download a web page

BEAUTIFUL SOUP OBJECTS
Beautiful Soup is a Python library for pulling data out of HTML and XML files; we will focus on
HTML files. This is accomplished by representing the HTML as a set of objects with methods
used to parse the HTML. We can navigate the HTML as a tree and/or filter out what we are
looking for.
Consider the following HTML:
%%html
<!DOCTYPE html>
<html>
<head>
<title>Page Title</title>
</head>
<body>
<h3>
<b id='boldest'>Lebron James</b>
</h3>
<p> Salary: $ 92,000,000 </p>
<h3> Stephen Curry</h3>
<p> Salary: $85,000, 000 </p>
<h3> Kevin Durant </h3>
<p> Salary: $73,200, 000</p>
</body>
</html>
Lebron James
Salary: $ 92,000,000
Stephen Curry
Salary: $85,000, 000
Kevin Durant
Salary: $73,200, 000
We can store it as a string in the variable html:
html = ("<!DOCTYPE html><html><head><title>Page Title</title></head><body>"
        "<h3><b id='boldest'>Lebron James</b></h3><p> Salary: $ 92,000,000 </p>"
        "<h3> Stephen Curry</h3><p> Salary: $85,000, 000 </p>"
        "<h3> Kevin Durant </h3><p> Salary: $73,200, 000</p></body></html>")
To parse a document, pass it into the BeautifulSoup constructor. The BeautifulSoup object
represents the document as a nested data structure:
soup = BeautifulSoup(html, 'html5lib')
First, the document is converted to Unicode (a character set that extends ASCII), and HTML
entities are converted to Unicode characters. Beautiful Soup transforms a complex HTML document
into a complex tree of Python objects. The BeautifulSoup object can create other types of
objects. In this lab, we will cover BeautifulSoup and Tag objects, which for the purposes of this
lab can be treated as identical. Finally, we will look at NavigableString objects.
We can use the method prettify() to display the HTML in the nested structure:
print(soup.prettify())

TAGS
Let's say we want the title of the page and the name of the top paid player. We can use the Tag.
The Tag object corresponds to an HTML tag in the original document, for example, the tag title.
tag_object=soup.title
print("tag object:",tag_object)
We can see the tag type bs4.element.Tag
print("tag object type:",type(tag_object))
If there is more than one Tag with the same name, the first element with that Tag name is
returned. This corresponds to the highest-paid player:
tag_object=soup.h3
tag_object
The name is enclosed in the bold tag b, so it helps to use the tree representation. We can
navigate down the tree using the child attribute to get the name.
Children, Parents, and Siblings
As stated above, the Tag object is a tree of objects. We can access the child of the tag or navigate
down the branch as follows:
tag_child =tag_object.b
tag_child
You can access the parent with the parent attribute:
parent_tag=tag_child.parent
parent_tag
this is identical to:
tag_object
tag_object parent is the body element.
tag_object.parent
tag_object sibling is the paragraph element
sibling_1=tag_object.next_sibling
sibling_1
sibling_2 is the header element, which is also a sibling of both sibling_1 and tag_object
sibling_2=sibling_1.next_sibling
sibling_2

Exercise: next_sibling
Use the object sibling_2 and the attribute next_sibling to find the salary of Stephen Curry:
sibling_2.next_sibling
HTML Attributes
A tag may have attributes. For example, the tag <b id="boldest"> has an attribute id whose
value is boldest. You can access a tag’s attributes by treating the tag like a dictionary:
tag_child['id']
You can access that dictionary directly as attrs:
tag_child.attrs
You can also work with Multi-valued attributes. Check out BeautifulSoup Documentation for
more.
We can also obtain the content of the attribute of the tag using the Python get() method.
tag_child.get('id')
Navigable String
A string corresponds to a bit of text or content within a tag. Beautiful Soup uses the
NavigableString class to contain this text. In our HTML we can obtain the name of the first player
by extracting the string of the Tag object tag_child as follows:
tag_string=tag_child.string
tag_string
we can verify the type is Navigable String
type(tag_string)
A NavigableString is similar to a Python string or Unicode string. To be more precise, the main
difference is that it also supports some BeautifulSoup features. We can convert it to string object
in Python:
unicode_string = str(tag_string)
unicode_string
FILTER
Filters allow you to find complex patterns; the simplest filter is a string. In this section we will
pass a string to different filter methods, and Beautiful Soup will perform a match against that
exact string. Consider the following HTML of rocket launches:
%%html
<table>
<tr>
<td id='flight' >Flight No</td>
<td>Launch site</td>
<td>Payload mass</td>
</tr>

<tr>
<td>1</td>
<td><a href='https://en.wikipedia.org/wiki/Florida'>Florida</a></td>
<td>300 kg</td>
</tr>
<tr>
<td>2</td>
<td><a href='https://en.wikipedia.org/wiki/Texas'>Texas</a></td>
<td>94 kg</td>
</tr>
<tr>
<td>3</td>
<td><a href='https://en.wikipedia.org/wiki/Florida'>Florida<a> </td>
<td>80 kg</td>
</tr>
</table>
Flight No Launch Site Payload Mass
1 Florida 300 kg
2 Texas 94 kg
3 Florida 80 kg
We can store it as a string in the variable table:
table = ("<table><tr><td id='flight'>Flight No</td><td>Launch site</td> <td>Payload mass</td></tr>"
         "<tr> <td>1</td><td><a href='https://en.wikipedia.org/wiki/Florida'>Florida<a></td><td>300 kg</td></tr>"
         "<tr><td>2</td><td><a href='https://en.wikipedia.org/wiki/Texas'>Texas</a></td><td>94 kg</td></tr>"
         "<tr><td>3</td><td><a href='https://en.wikipedia.org/wiki/Florida'>Florida<a> </td><td>80 kg</td></tr></table>")
table_bs = BeautifulSoup(table, 'html5lib')
FIND ALL
The find_all() method looks through a tag’s descendants and retrieves all descendants that
match your filters.
The method signature is find_all(name, attrs, recursive, string, limit, **kwargs)
Name
When we set the name parameter to a tag name, the method will extract all the tags with that
name, together with their children.
table_rows=table_bs.find_all('tr')
table_rows

The result is a Python Iterable just like a list, each element is a tag object:
first_row =table_rows[0]
first_row
The type is tag
print(type(first_row))
we can obtain the child
first_row.td
If we iterate through the list, each element corresponds to a row in the table:
for i,row in enumerate(table_rows):
    print("row",i,"is",row)
As row is a Tag object, we can apply the method find_all to it and extract the table cells into
the object cells using the tag td; this gives all the children with the name td. The result is a
list in which each element corresponds to a cell and is a Tag object, so we can iterate through
this list as well. We can extract the content using the string attribute.
for i,row in enumerate(table_rows):
    print("row",i)
    cells=row.find_all('td')
    for j,cell in enumerate(cells):
        print('column',j,"cell",cell)
If we use a list we can match against any item in that list.
list_input=table_bs.find_all(name=["tr", "td"])

list_input
ATTRIBUTES
If the argument is not recognized, it will be turned into a filter on the tag’s attributes. For
example, with the id argument, Beautiful Soup will filter against each tag’s id attribute. The
first td element has an id value of flight, so we can filter based on that id value.
table_bs.find_all(id="flight")
We can find all the elements that have links to the Florida Wikipedia page:
list_input=table_bs.find_all(href="https://en.wikipedia.org/wiki/Florida")
list_input
If we set the href attribute to True, regardless of what the value is, the code finds all tags
with href value:
table_bs.find_all(href=True)
There are other methods for dealing with attributes and other related methods. Check out the
following link

Exercise: find_all
Using the logic above, find all the elements without href value
table_bs.find_all(href=False)
Using the soup object soup, find the element with the id attribute content set to "boldest".
soup.find_all(id="boldest")
string
With string you can search for strings instead of tags. Here we find all the strings that are
exactly Florida:
table_bs.find_all(string="Florida")
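The find_all signature also lists the limit and recursive parameters, which are not exercised in this lab. The sketch below shows one way they might be used. It assumes a shortened launch table and "html.parser" (which, unlike html5lib, does not insert a tbody element, so the rows are direct children of the table).

```python
from bs4 import BeautifulSoup

# Shortened launch table (illustrative data)
table = ("<table><tr><td id='flight'>Flight No</td><td>Launch site</td></tr>"
         "<tr><td>1</td><td>Florida</td></tr>"
         "<tr><td>2</td><td>Texas</td></tr></table>")
table_bs = BeautifulSoup(table, "html.parser")

# limit stops the search after the given number of matches
first_two_cells = table_bs.find_all("td", limit=2)

# recursive=False only looks at direct children of the tag
direct_rows = table_bs.table.find_all("tr", recursive=False)
direct_cells = table_bs.table.find_all("td", recursive=False)  # cells are grandchildren

print(first_two_cells)
print(len(direct_rows), len(direct_cells))
```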
FIND
The find_all() method scans the entire document looking for results. If you are looking for only
one element, you can use the find() method to find the first matching element in the document.
Consider the following two tables:
%%html
<h3>Rocket Launch </h3>
<p>
<table class='rocket'>
<tr>
<td>Flight No</td>
<td>Launch site</td>
<td>Payload mass</td>
</tr>
<tr>
<td>1</td>
<td>Florida</td>
<td>300 kg</td>
</tr>
<tr>
<td>2</td>
<td>Texas</td>
<td>94 kg</td>
</tr>
<tr>
<td>3</td>
<td>Florida </td>
<td>80 kg</td>
</tr>
</table>
</p>
<p>
<h3>Pizza Party </h3>
<table class='pizza'>
<tr>
<td>Pizza Place</td>
<td>Orders</td>
<td>Slices </td>
</tr>
<tr>
<td>Domino's Pizza</td>

<td>10</td>
<td>100</td>
</tr>
<tr>
<td>Little Caesars</td>
<td>12</td>
<td >144 </td>
</tr>
<tr>
<td>Papa John's </td>
<td>15 </td>
<td>165</td>
</tr>
Figure 843
We store the HTML as a Python string and assign it to two_tables:
two_tables = ("<h3>Rocket Launch </h3><p><table class='rocket'>"
              "<tr><td>Flight No</td><td>Launch site</td> <td>Payload mass</td></tr>"
              "<tr><td>1</td><td>Florida</td><td>300 kg</td></tr>"
              "<tr><td>2</td><td>Texas</td><td>94 kg</td></tr>"
              "<tr><td>3</td><td>Florida </td><td>80 kg</td></tr></table></p>"
              "<p><h3>Pizza Party </h3><table class='pizza'>"
              "<tr><td>Pizza Place</td><td>Orders</td> <td>Slices </td></tr>"
              "<tr><td>Domino's Pizza</td><td>10</td><td>100</td></tr>"
              "<tr><td>Little Caesars</td><td>12</td><td >144 </td></tr>"
              "<tr><td>Papa John's </td><td>15 </td><td>165</td></tr>")
We create a BeautifulSoup object two_tables_bs
two_tables_bs= BeautifulSoup(two_tables, 'html.parser')
We can find the first table using the tag name table
two_tables_bs.find("table")
We can filter on the class attribute to find the second table, but because class is a keyword in
Python, we add an underscore to differentiate them.
two_tables_bs.find("table",class_='pizza')
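One practical difference between find() and find_all() is what they return when nothing matches. The sketch below illustrates this with a two-table fragment; the class name 'burger' is a made-up value that does not appear in the HTML, and "html.parser" is used to keep the example dependency-free.

```python
from bs4 import BeautifulSoup

two_tables = ("<table class='rocket'><tr><td>Florida</td></tr></table>"
              "<table class='pizza'><tr><td>Little Caesars</td></tr></table>")
two_tables_bs = BeautifulSoup(two_tables, "html.parser")

first_table = two_tables_bs.find("table")                         # first match only
pizza_table = two_tables_bs.find("table", class_="pizza")         # filter on class
missing = two_tables_bs.find("table", class_="burger")            # no match: None
missing_all = two_tables_bs.find_all("table", class_="burger")    # no match: empty list

print(first_table["class"], pizza_table.td.string)
print(missing, len(missing_all))
```

Because find() returns None when there is no match, it is worth checking the result before accessing attributes on it.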

DOWNLOADING AND SCRAPING THE CONTENTS OF A WEB PAGE
First, we set the URL of the web page whose contents we want to download:
url = "http://www.ibm.com"
We use get to download the contents of the webpage in text format and store in a variable
called data:
data = requests.get(url).text
We create a BeautifulSoup object using the BeautifulSoup constructor
soup = BeautifulSoup(data,"html5lib") # create a soup object using the variable 'data'
Scrape all links
for link in soup.find_all('a',href=True): # in html anchor/link is represented by the tag <a>
    print(link.get('href'))
SCRAPE ALL IMAGES TAGS
for link in soup.find_all('img'): # in html image is represented by the tag <img>
    print(link)
    print(link.get('src'))
Scrape data from HTML tables
#The below url contains an html table with data about colors and color codes.
url = "https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBM-DA0321EN-SkillsNetwork/labs/datasets/HTMLColorCodes.html"

Before proceeding to scrape a web site, you need to examine the contents and the way data is
organized on the website. Open the above url in your browser and check how many rows and
columns there are in the color table.
# get the contents of the webpage in text format and store in a variable called data
data = requests.get(url).text
soup = BeautifulSoup(data,"html5lib")
# find an html table in the web page
table = soup.find('table') # in html table is represented by the tag <table>
# Get all rows from the table
for row in table.find_all('tr'): # in html table row is represented by the tag <tr>
    # Get all columns in each row.
    cols = row.find_all('td') # in html a column is represented by the tag <td>
    color_name = cols[2].string # store the value in column 3 as color_name
    color_code = cols[3].string # store the value in column 4 as color_code
    print("{}--->{}".format(color_name,color_code))
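The loop above assumes that every row has at least four td cells. On other pages, header rows built from th cells or rows with merged cells would raise an IndexError. The following is a more defensive sketch of the same idea; the sample table is made up for illustration and is not the content of the course URL.

```python
from bs4 import BeautifulSoup

# Made-up sample table with a <th> header row and a merged footer row
page = ("<table><tr><th>No</th><th>Sample</th><th>Name</th><th>Hex</th></tr>"
        "<tr><td>1</td><td></td><td>lightsalmon</td><td>#FFA07A</td></tr>"
        "<tr><td>2</td><td></td><td>salmon</td><td>#FA8072</td></tr>"
        "<tr><td colspan='4'>end of table</td></tr></table>")
soup = BeautifulSoup(page, "html.parser")
table = soup.find("table")

colors = {}
for row in table.find_all("tr"):
    cols = row.find_all("td")
    if len(cols) < 4:
        continue  # skip header rows and rows without the expected columns
    # get_text(strip=True) also works when a cell contains nested tags
    colors[cols[2].get_text(strip=True)] = cols[3].get_text(strip=True)

print(colors)
```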

VIDEO 026: WORKING WITH DIFFERENT FILE FORMATS (4:11)
Hello. Welcome to Working With Different File Formats.
Figure 844
WHAT YOU WILL LEARN

After watching this video, you will be able to:
• Define different file formats such as csv, xml, and json.
• Write simple programs to read and output data.
• List what Python libraries are needed to extract data.
Figure 845
INTRODUCTION
When collecting data you will find there are many different file formats that need to be collected
or read in order to complete a data driven story or analysis. When gathering the data Python can
make the process simpler with its predefined libraries, but before we explore Python let’s first
check out some of the various file formats.
Figure 846

UNDERSTANDING FILE FORMATS
When looking at a file name you will notice an extension at the end of the title. These extensions
let you know what type of file it is and what is needed to open it. For instance, if you see a
title like “FileExample.csv” you will know this is a “csv” file. This is only one example of
different file types, as there are many more such as “json” or “xml”.
Figure 847
PYTHON PANDAS LIBRARY

When coming across these different file formats and trying to access their data we need to utilize
Python libraries to make this process easier. The first Python library to become familiar with is
called Pandas. By importing this library at the beginning of the code, we are then able to easily
read the different file types.
Figure 848
READING CSV FILES

Since we have now imported the Pandas library, let’s use it to read the first “csv” file. In this
instance we have come across the “FileExample.csv” file. The first step is to assign the file name
to a variable. Then create another variable to hold the data read with the help of the Pandas
library. We can then call the read_csv function and output the data to the screen. In this example
the file had no headers, so Pandas used the first line of data as the header. Since we don’t want
the first line of data as the header, let’s find out how to correct this issue.

Figure 849
USING DATAFRAMES
Now that we have learned how to read and output the data from a “csv” file let’s make it look a
little more organized. From the last example we were able to print out the data but because the
file had no headers it printed the first line of data as a header.
Figure 850
We can easily solve this by setting a dataframe attribute. We use the variable “df” that holds
the data and then assign the column names to its “columns” attribute.
Figure 851

By adding this one line to our program we can then neatly organize the data output into the
specified headers for each column.
Figure 852
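A minimal sketch of the header problem and one way to handle it is shown below. The CSV text and column names are made up for illustration, and io.StringIO stands in for a file on disk; passing header=None (as the lab later does) keeps the first line as data.

```python
import io
import pandas as pd

# Made-up CSV content with no header line
csv_text = "John,Doe,Riverside\nJack,McGinnis,Philadelphia\n"

# Default behaviour: the first line of data is consumed as the header
df_wrong = pd.read_csv(io.StringIO(csv_text))

# header=None keeps every line as data; the names are then assigned explicitly
df = pd.read_csv(io.StringIO(csv_text), header=None)
df.columns = ["First Name", "Last Name", "City"]

print(df_wrong)
print(df)
```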
READING JSON FILES

The next file format we will explore is the “json” file format. In this type of file the text is
written in a language-independent data format that is similar to a Python dictionary. The first
step in reading this type of file is to import json. After importing “json”, we can add a line to
open the file, call the “load” function of “json” to read the file, and lastly print the contents.
Figure 853
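The steps just described could look like the sketch below. The file name and its contents are made up, and the file is first written to a temporary folder so that the example is self-contained.

```python
import json
import os
import tempfile

# Create a small JSON file to read (made-up content)
path = os.path.join(tempfile.mkdtemp(), "filesample.json")
with open(path, "w") as f:
    f.write('{"name": "Mark", "age": 27, "languages": ["Python", "SQL"]}')

# Open the file, load it with json.load, and print the result
with open(path, "r") as myfile:
    data = json.load(myfile)

print(data)
print(type(data))
```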
READING XML FILES

The next file format is “xml”. This type of file is also known as Extensible Markup Language.
Rather than relying on a Pandas reader for this type of file, let’s explore how to parse it
ourselves. The first step is to import the xml library. We can then use the “etree” module to
parse the “xml” file. We then add the column headers and assign them to the dataframe.
Figure 854

Then create a loop to go through the document to collect the necessary data and append the
data to a dataframe.
Figure 855
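The parse-then-loop approach described above can be sketched as follows. The XML text and the column names are made up for illustration; the rows are collected in a list and turned into a dataframe in one step.

```python
import xml.etree.ElementTree as etree
import pandas as pd

# Made-up XML document with two records
xml_text = """<employees>
  <details><firstname>Shiv</firstname><lastname>Mishra</lastname><age>23</age></details>
  <details><firstname>Ana</firstname><lastname>Silva</lastname><age>31</age></details>
</employees>"""

root = etree.fromstring(xml_text)          # parse the XML text into a tree
columns = ["firstname", "lastname", "age"]  # column headers for the dataframe

rows = []
for node in root:                           # one <details> element per record
    rows.append([node.find(col).text for col in columns])

df = pd.DataFrame(rows, columns=columns)
print(df)
```

Note that the element text is read as strings, so a numeric column such as age would still need to be converted before doing arithmetic on it.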
RECAP
In this video, you learned:
• How to recognize different file types
• How to use Python libraries to extract data
• How to use dataframes when collecting data
Figure 856

HANDS-ON LAB: WORKING WITH DIFFERENT FILE FORMATS
TABLE OF CONTENTS
1. Data Engineering
2. Data Engineering Process
3. Working with different file formats
4. Data Analysis
DATA ENGINEERING
Data engineering is one of the most critical and foundational skills in any data scientist’s toolkit.
DATA ENGINEERING PROCESS

There are several steps in the data engineering process.
Extract - Data extraction is getting data from multiple sources, for example, extracting data
from a website using web scraping or gathering information from data stored in different
formats (JSON, CSV, XLSX, etc.).
Transform - Transforming the data means removing the data that we don't need for further
analysis and converting the data from the multiple sources into the same format.
Load - Loading the data into a data warehouse. A data warehouse essentially contains large
volumes of data that are accessed to gather insights.
WORKING WITH DIFFERENT FILE FORMATS

In the real world, people rarely get neat tabular data. Thus, it is essential for any data scientist
(or data engineer) to be aware of different file formats, common challenges in handling them,
and the best, most efficient ways to handle this data in real life. We have reviewed some of this
content in other modules.
File Format
A file format is a standard way in which information is encoded for storage in a file. First, the file
format specifies whether the file is a binary or ASCII file. Second, it shows how the information
is organized. For example, the comma-separated values (CSV) file format stores tabular data in
plain text.
To identify a file format, you can usually look at the file extension to get an idea. For example, a
file saved with name "Data" in "CSV" format will appear as Data.csv. By noticing the .csv
extension, we can clearly identify that it is a CSV file and the data is stored in a tabular format.
There are various formats for a dataset, .csv, .json, .xlsx etc. The dataset can be stored in different
places, on your local machine or sometimes online.
In this section, you will learn how to load a dataset into our Jupyter Notebook.
Now, we will look at some file formats and how to read them in Python:

COMMA-SEPARATED VALUES (CSV) FILE FORMAT
The Comma-separated values file format falls under a spreadsheet file format.
In a spreadsheet file format, data is stored in cells. Each cell is organized in rows and columns.
A column in the spreadsheet file can have different types. For example, a column can be of string
type, a date type, or an integer type.
Each line in CSV file represents an observation, or commonly called a record. Each record may
contain one or more fields which are separated by a comma.
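To make the record and field terminology concrete, here is a small sketch using Python's built-in csv module. The CSV text is made up; it includes one field that itself contains a comma and quotation marks, which CSV handles by wrapping the field in quotes and doubling the inner quotes.

```python
import csv
import io

# Two records; the first field of the second record contains a comma and quotes
csv_text = 'John,Doe,Riverside\n"Joan ""the bone"", Anne",Jet,Desert City\n'

records = list(csv.reader(io.StringIO(csv_text)))

for record in records:
    print(len(record), record)  # each record is a list of fields
```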
Reading data from CSV in Python
The Pandas Library is a useful tool that enables us to read various datasets into a Pandas data
frame.
Let us look at how to read a CSV file in Pandas Library.
We use the pandas.read_csv() function to read the csv file. In the parentheses, we put the file
path in quotation marks as an argument, so that pandas will read the file into a data frame
from that address. The file path can be either a URL or your local file address.
import piplite
await piplite.install(['seaborn', 'lxml', 'openpyxl'])
import pandas as pd
from pyodide.http import pyfetch
SkillsNetwork/labs/Module%205/data/addresses.csv"

await download(filename, "addresses.csv")
df = pd.read_csv("addresses.csv", header=None)
df
Figure 857

Adding column name to the DataFrame
We can assign column names to an existing DataFrame using its columns attribute.
df.columns =['First Name', 'Last Name', 'Location ', 'City','State','Area Code']
df
Figure 858
Selecting a single column

To select the first column 'First Name', you can pass the column name as a string to the indexing
operator.
df['First Name']
0 John
1 Jack
2 John "Da Man"
3 Stephen
4 NaN
5 Joan "the bone", Anne
Name: First Name, dtype: object
Selecting multiple columns

To select multiple columns, you can pass a list of column names to the indexing operator.
df = df[['First Name', 'Last Name', 'Location ', 'City','State','Area Code']]

df
Figure 859

Selecting rows using .iloc and .loc
Now, let's see how to use .loc for selecting rows from our DataFrame.
.loc : .loc is a label-based selection indexer, which means that we have to pass the
name (label) of the row or column which we want to select.
# To select the first row

df.loc[0]
0 John
1 Doe
2 120 jefferson st.
3 Riverside
4 NJ
5 8075
Name: 0, dtype: object
# To select the 0th,1st and 2nd row of "First Name" column only
df.loc[[0,1,2], "First Name" ]
0 John
1 Jack
2 John "Da Man"
Name: First Name, dtype: object
Now, let's see how to use .iloc for selecting rows from our DataFrame.
.iloc : .iloc is an integer-position-based selection indexer, which means that we have to
pass integer positions to select a specific row/column.
# To select the 0th,1st and 2nd row of "First Name" column only
df.iloc[[0,1,2], 0]
0 John
1 Jack
2 John "Da Man"
Name: 0, dtype: object
For more information please read the documentation.
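The difference between the two indexers is easier to see when the row labels are not 0, 1, 2. The sketch below uses a made-up DataFrame whose index labels are 10, 20 and 30.

```python
import pandas as pd

# Made-up data with row labels that differ from the row positions
df = pd.DataFrame(
    {"First Name": ["John", "Jack", "Stephen"],
     "City": ["Riverside", "Philadelphia", "SomeTown"]},
    index=[10, 20, 30],
)

by_label = df.loc[10, "First Name"]  # row label 10, column label "First Name"
by_position = df.iloc[0, 0]          # first row, first column

label_slice = df.loc[10:20]          # label slices include the end label
position_slice = df.iloc[0:1]        # position slices exclude the end position

print(by_label, by_position)
print(len(label_slice), len(position_slice))
```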

Let's perform some basic transformation in pandas.
Transform Function in Pandas

The transform() function in pandas returns a dataframe with the same length as the original,
containing the transformed values after applying the function specified in its parameter.
Let's see how Transform function works.
#import library
import pandas as pd
import numpy as np
#creating a dataframe
df=pd.DataFrame(np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]), columns=['a', 'b', 'c'])
df
a b c
0 1 2 3
1 4 5 6
2 7 8 9

Let’s say we want to add 10 to each element in a dataframe:
#applying the transform function

df = df.transform(func = lambda x : x + 10)
df
a b c
0 11 12 13
1 14 15 16
2 17 18 19
Now we will use DataFrame.transform() function to find the square root to each element of the
dataframe.
result = df.transform(func = ['sqrt'])

result
a b c
sqrt sqrt sqrt
0 3.316625 3.464102 3.605551
1 3.741657 3.872983 4.000000
2 4.123106 4.242641 4.358899
For more information about the transform() function please read the documentation.
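The extra "sqrt" row in the output above appears because the function was passed inside a list. The sketch below contrasts passing a single function with passing a list of functions, using a small made-up DataFrame and numpy's sqrt as the callable.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([[1, 4], [9, 16]]), columns=["a", "b"])

flat = df.transform(np.sqrt)      # single function: columns stay 'a', 'b'
nested = df.transform([np.sqrt])  # list of functions: two-level column labels

print(flat)
print(nested)
print(list(nested.columns))
```

Passing a list is mainly useful when applying several functions at once, since each result then gets its own labeled sub-column.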
JSON file Format

JSON (JavaScript Object Notation) is a lightweight data-interchange format. It is easy for
humans to read and write.
JSON is built on two structures:
1. A collection of name/value pairs. In various languages, this is realized as an object,
record, struct, dictionary, hash table, keyed list, or associative array.
2. An ordered list of values. In most languages, this is realized as an array, vector, list, or
sequence.
JSON is a language-independent data format. It was derived from JavaScript, but many modern
programming languages include code to generate and parse JSON-format data. It is a very
common data format with a diverse range of applications.
JSON text is made up of quoted strings and values arranged in key-value mappings within { }.
It is similar to the dictionary in Python.
Python supports JSON through a built-in package called json. To use this feature, we import the
json package in a Python script.
import json

Writing JSON to a File
This is usually called serialization. It is the process of converting an object into a special format
which is suitable for transmitting over the network or storing in file or database.
To handle the data flow in a file, the JSON library in Python uses the dump() or dumps() function
to convert the Python objects into their respective JSON object. This makes it easy to write data
to files.
import json
person = {
'first_name' : 'Mark',
'last_name' : 'abc',
'age' : 27,
'address': {
"streetAddress": "21 2nd Street",
"city": "New York",
"state": "NY",
"postalCode": "10021-3100"
}
}
serialization using dump() function

json.dump() method can be used for writing to JSON file.
Syntax: json.dump(dict, file_pointer)
Parameters:
1. dictionary – name of the dictionary which should be converted to JSON object.
2. file pointer – pointer of the file opened in write or append mode.
with open('person.json', 'w') as f: # writing JSON object
    json.dump(person, f)
serialization using dumps() function

json.dumps() converts a dictionary to a JSON-formatted string.
It takes two parameters:
1. dictionary – name of the dictionary which should be converted to JSON object.
2. indent – defines the number of units for indentation
# Serializing json
json_object = json.dumps(person, indent = 4)
# Writing to sample.json
with open("sample.json", "w") as outfile:
    outfile.write(json_object)
print(json_object)
{
"first_name": "Mark",
"last_name": "abc",
"age": 27,
"address": {
"streetAddress": "21 2nd Street",
"city": "New York",
"state": "NY",
"postalCode": "10021-3100"
}
}
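Since dumps() returns a string, it can also be used without any file, for example when sending data over a network. As a small sketch, the string can be parsed back with json.loads(), the string counterpart of json.load(); the dictionary below is a shortened, made-up version of person.

```python
import json

person = {"first_name": "Mark", "age": 27, "address": {"city": "New York"}}

json_text = json.dumps(person)   # Python dict -> JSON-formatted string
restored = json.loads(json_text)  # JSON-formatted string -> Python dict

print(type(json_text).__name__, json_text)
print(restored == person)
```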

Our Python objects are now serialized to the file. To deserialize them back to a Python object,
we use the load() function.
Reading JSON from a File

This process is usually called Deserialization - it is the reverse of serialization. It converts the
special format returned by the serialization back into a usable object.
Using json.load()
The JSON package has json.load() function that loads the json content from a json file into a
dictionary.
It takes one parameter:
File pointer: A file pointer that points to a JSON file.
import json
# Opening JSON file
with open('sample.json', 'r') as openfile:
    # Reading from json file
    json_object = json.load(openfile)
print(json_object)
print(type(json_object))
{'first_name': 'Mark', 'last_name': 'abc', 'age': 27, 'address': {'streetAddress': '21 2nd
Street', 'city': 'New York', 'state': 'NY', 'postalCode': '10021-3100'}}
<class 'dict'>
XLSX file format

XLSX is a Microsoft Excel Open XML file format. It is another type of spreadsheet file format.
In XLSX, data is organized in rows and columns of cells within a sheet.
Reading the data from XLSX file

Let's load the data from XLSX file and define the sheet name. For loading the data you can use
the Pandas library in python.
import pandas as pd
# Not needed unless you're running locally

# urllib.request.urlretrieve("https://cf-courses-data.s3.us.cloud-object-
SkillsNetwork/labs/Module%205/data/file_example_XLSX_10.xlsx", "sample.xlsx")
SkillsNetwork/labs/Module%205/data/file_example_XLSX_10.xlsx"

await download(filename, "file_example_XLSX_10.xlsx")

df = pd.read_excel("file_example_XLSX_10.xlsx")
df
Figure 860
XML file format

XML is also known as Extensible Markup Language. As the name suggests, it is a markup
language. It has certain rules for encoding data. XML file format is a human-readable and
machine-readable file format.
Older versions of Pandas did not include any methods to read and write XML files
(pandas.read_xml() was added in version 1.3). Here, we will first take a look at how we can use
other modules to read data from an XML file and load it into a Pandas DataFrame.
Writing with xml.etree.ElementTree

The xml.etree.ElementTree module comes built-in with Python. It provides functionality for
parsing and creating XML documents. ElementTree represents the XML document as a tree. We
can move across the document using nodes which are elements and sub-elements of the XML
file.
For more information please read the xml.etree.ElementTree documentation.
import xml.etree.ElementTree as ET
# create the file structure

employee = ET.Element('employee')
details = ET.SubElement(employee, 'details')
first = ET.SubElement(details, 'firstname')
second = ET.SubElement(details, 'lastname')
third = ET.SubElement(details, 'age')
first.text = 'Shiv'
second.text = 'Mishra'
third.text = '23'
# create a new XML file with the results

mydata1 = ET.ElementTree(employee)
# myfile = open("items2.xml", "wb")
# myfile.write(mydata)
with open("new_sample.xml", "wb") as files:
    mydata1.write(files)

Reading with xml.etree.ElementTree
Let's have a look at one way to read XML data and put it in a Pandas DataFrame. You can view
the XML file in a text editor such as Notepad on your local machine.
# Not needed unless running locally

# !wget https://cf-courses-data.s3.us.cloud-object-
SkillsNetwork/labs/Module%205/data/Sample-employee-XML-file.xml
import xml.etree.ElementTree as etree
SkillsNetwork/labs/Module%205/data/Sample-employee-XML-file.xml"

await download(filename, "Sample-employee-XML-file.xml")
You first need to parse the XML file and create a list of columns for the data frame, then
extract the useful information from the XML file and add it to a pandas data frame.
Here is some sample code that you can use:
tree = etree.parse("Sample-employee-XML-file.xml")
root = tree.getroot()
columns = ["firstname", "lastname", "title", "division", "building","room"]
rows = []  # collect one list of values per record
for node in root:
    firstname = node.find("firstname").text
    lastname = node.find("lastname").text
    title = node.find("title").text
    division = node.find("division").text
    building = node.find("building").text
    room = node.find("room").text
    rows.append([firstname, lastname, title, division, building, room])
# build the DataFrame once (DataFrame.append was removed in pandas 2.0)
datatframe = pd.DataFrame(rows, columns = columns)
datatframe
Figure 861

Reading xml file using pandas.read_xml function
We can also read the downloaded xml file using the read_xml function present in the pandas library which
returns a Dataframe object.
For more information read the pandas.read_xml documentation.
# Here, in xpath, we specify the set of XML nodes to load into the
# dataframe, which in this case is the details node under employees.
df=pd.read_xml("Sample-employee-XML-file.xml", xpath="/employees/details")
Save Data
Similarly, Pandas lets us save a dataset to CSV with the dataframe.to_csv() method. Pass the
file path and name, in quotation marks, inside the parentheses.
For example, to save the dataframe df as employee.csv on your local machine, you
may use the syntax below:
df.to_csv("employee.csv", index=False)
We can also read and save other file formats using functions similar to pd.read_csv() and
df.to_csv(). The functions are listed in the following table:
Read/Save Other Data Formats
Figure 862
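To see the save-and-read cycle without touching the disk, here is a minimal sketch that writes a small, hypothetical data frame to an in-memory buffer and reads it back. The names df_demo, buffer, and restored are illustrative only.

```python
import io
import pandas as pd

# Hypothetical two-row frame standing in for the employee data
df_demo = pd.DataFrame({"firstname": ["Shiv", "Ana"], "room": [101, 202]})

# Write CSV text into an in-memory buffer instead of a file on disk
buffer = io.StringIO()
df_demo.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)

# Read the same text back into a new DataFrame
restored = pd.read_csv(io.StringIO(csv_text))
print(restored)
```

Replacing the buffer with a file name such as "employee.csv" gives the on-disk version described above.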
Let's move ahead and perform some Data Analysis.
Binary File Format

"Binary" files are any files where the format isn't made up of readable characters. They contain
formatting information that only certain applications or processors can understand. While
humans can read text files, binary files must be run on the appropriate software or processor
before humans can read them.
Binary files can range from image files like JPEGs or GIFs, audio files like MP3s or binary
document formats like Word or PDF.
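As a minimal sketch of what "not made up of readable characters" means in practice, the code below writes the eight-byte signature that begins every PNG file to a temporary file and reads it back in binary ("rb") mode. The file name demo.bin is an assumption made for illustration.

```python
import os
import tempfile

# The first eight bytes of every PNG file (its "signature")
png_signature = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])

# Write the bytes to a temporary file, then read them back in binary mode
path = os.path.join(tempfile.mkdtemp(), "demo.bin")
with open(path, "wb") as f:
    f.write(png_signature)

with open(path, "rb") as f:
    raw = f.read()

print(type(raw))   # a bytes object, not a str
print(raw[1:4])    # the only human-readable part of the signature
```

Opening the same file in text mode would try to decode the bytes as characters, which is why binary formats are read with "rb".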
Let's see how to read an Image file.

Reading the Image file
Python supports very powerful tools when it comes to image processing. Let's see how to
process the images using the PIL library.
PIL is the Python Imaging Library which provides the python interpreter with image editing
capabilities.
# importing PIL
from PIL import Image
# Uncomment if running locally
# import urllib.request
# urllib.request.urlretrieve("https://hips.hearstapps.com/hmg-prod.s3.amazonaws.com/images/dog-puppy-on-garden-royalty-free-image-1586966191.jpg", "dog.jpg")
filename = "https://hips.hearstapps.com/hmg-prod.s3.amazonaws.com/images/dog-puppy-on-garden-royalty-free-image-1586966191.jpg"

await download(filename, "dog.jpg")
# Read image
img = Image.open('dog.jpg')
# Output Images
display(img)
Data Analysis
In this section, you will learn how to approach data acquisition in various ways and obtain
necessary insights from a dataset. By the end of this lab, you will successfully load the data into
Jupyter Notebook and gain some fundamental insights via the Pandas Library.
In our case, the Diabetes Dataset is an online source and it is in CSV (comma separated value)
format. Let's use this dataset as an example to practice data reading.
About this Dataset

Context: This dataset is originally from the National Institute of Diabetes and Digestive and
Kidney Diseases. The objective of the dataset is to diagnostically predict whether or not a patient
has diabetes, based on certain diagnostic measurements included in the dataset. Several
constraints were placed on the selection of these instances from a larger database. In particular,
all patients here are females at least 21 years of age of Pima Indian heritage.
Content: The dataset consists of several medical predictor variables and one target variable,
Outcome. Predictor variables include the number of pregnancies the patient has had, their BMI,
insulin level, age, and so on.

We have 768 rows and 9 columns. The first 8 columns represent the features and the last
column represents the target/label.
# Import pandas library

import pandas as pd
SkillsNetwork/labs/Module%205/data/diabetes.csv"

await download(filename, "diabetes.csv")

df = pd.read_csv("diabetes.csv")
After reading the dataset, we can use the dataframe.head(n) method to check the top n rows of
the dataframe, where n is an integer. Conversely, dataframe.tail(n) will show
you the bottom n rows of the dataframe.
# show the first 5 rows using dataframe.head() method

print("The first 5 rows of the dataframe")
df.head(5)
To view the dimensions of the dataframe, we use the .shape attribute.
df.shape
(768, 9)
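The behaviour of head(n), tail(n), and .shape is easy to check on a tiny frame. A minimal sketch, using hypothetical data (ten rows, two columns) rather than the diabetes dataset:

```python
import pandas as pd

# Hypothetical data: ten rows, two columns
demo = pd.DataFrame({"x": range(10), "y": [v * v for v in range(10)]})

first_three = demo.head(3)   # rows with index 0, 1, 2
last_two = demo.tail(2)      # rows with index 8, 9

print(first_three)
print(last_two)
print(demo.shape)            # (number of rows, number of columns)
```

Note that .shape is written without parentheses because it is an attribute, not a method.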
Statistical Overview of dataset
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 768 entries, 0 to 767
Data columns (total 9 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Pregnancies 768 non-null int64
1 Glucose 768 non-null int64
2 BloodPressure 768 non-null int64
3 SkinThickness 768 non-null int64
4 Insulin 768 non-null int64
5 BMI 768 non-null float64
6 DiabetesPedigreeFunction 768 non-null float64
7 Age 768 non-null int64
8 Outcome 768 non-null int64
dtypes: float64(2), int64(7)
memory usage: 54.1 KB

This method prints information about a DataFrame including the index dtype and columns, non-
null values and memory usage.
df.describe()
Figure 863
Pandas describe() shows basic statistical details such as percentiles, mean, and standard
deviation for a data frame or a series of numeric values. When this method is applied to a
series of strings, it returns a different summary (count, number of unique values, most
frequent value, and its frequency).
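The difference between the numeric and the string output of describe() can be sketched with two short, hypothetical series:

```python
import pandas as pd

numbers = pd.Series([1, 2, 3, 4])
words = pd.Series(["a", "b", "a"])

numeric_summary = numbers.describe()  # count, mean, std, min, quartiles, max
text_summary = words.describe()       # count, unique, top, freq

print(numeric_summary)
print(text_summary)
```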
Identify and handle missing values

We use Pandas' built-in methods to identify these missing values. There are two methods to
detect missing data:
.isnull()
.notnull()
The output is a dataframe of boolean values indicating whether each value in the data is
in fact missing.
missing_data = df.isnull()
missing_data.head(5)
Figure 864
"True" stands for missing value, while "False" stands for not missing value.
Count missing values in each column

Using a for loop in Python, we can quickly figure out the number of missing values in each
column. As mentioned above, "True" represents a missing value, "False" means the value is
present in the dataset.

In the body of the for loop, the method ".value_counts()" counts how many times each value ("True" or "False") occurs.
for column in missing_data.columns.values.tolist():
    print(column)
    print(missing_data[column].value_counts())
    print("")
Pregnancies
False 768
Name: Pregnancies, dtype: int64
Glucose
False 768
Name: Glucose, dtype: int64
BloodPressure
False 768
Name: BloodPressure, dtype: int64
SkinThickness
False 768
Name: SkinThickness, dtype: int64
Insulin
False 768
Name: Insulin, dtype: int64
BMI
False 768
Name: BMI, dtype: int64
DiabetesPedigreeFunction
False 768
Name: DiabetesPedigreeFunction, dtype: int64
Age
False 768
Name: Age, dtype: int64
Outcome
False 768
Name: Outcome, dtype: int64
As you can see above, there are no missing values in the dataset.
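Because the diabetes data has nothing missing, the loop only ever prints "False". A minimal sketch with a hypothetical frame that does contain gaps shows what a non-empty result looks like. It uses isnull().sum(), a shorter alternative to the loop, which counts the True values per column.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with two missing entries in "BMI" and one in "Age"
demo = pd.DataFrame({
    "BMI": [33.6, np.nan, 23.3, np.nan],
    "Age": [50, 31, np.nan, 21],
})

# isnull() gives True/False per cell; summing counts the True values per column
missing_per_column = demo.isnull().sum()
print(missing_per_column)
```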
Correct data format

Check that all data is in the correct format (int, float, text or other).
In Pandas, we use
.dtypes to check the data types
.astype() to change the data type
Numerical variables should have type 'float' or 'int'.
df.dtypes
Pregnancies int64
Glucose int64
BloodPressure int64
SkinThickness int64
Insulin int64
BMI float64
DiabetesPedigreeFunction float64
Age int64
Outcome int64
dtype: object
As we can see above, all columns have the correct data type.
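When a column does have the wrong type, astype() converts it. A minimal sketch, assuming a hypothetical Glucose column that was read in as text:

```python
import pandas as pd

# Hypothetical column that was read in as text instead of numbers
demo = pd.DataFrame({"Glucose": ["148", "85", "183"]})
print(demo.dtypes)

# astype() returns a converted copy, so assign it back to the column
demo["Glucose"] = demo["Glucose"].astype("int64")
print(demo.dtypes)
```

After the conversion, numeric operations such as demo["Glucose"].sum() work as expected.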

Visualization
Visualization is one of the best ways to get insights from the dataset. Seaborn and Matplotlib
are two of Python's most powerful visualization libraries.
# import libraries
import matplotlib.pyplot as plt
import seaborn as sns
# value_counts() sorts by frequency, so Outcome 0 (Not Diabetic) comes first
labels = 'Not Diabetic', 'Diabetic'

plt.pie(df['Outcome'].value_counts(), labels=labels, autopct='%0.02f%%')
plt.legend()
plt.show()
Figure 865
As you can see above, 65.10% of the women are Not Diabetic and 34.90% are Diabetic.
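The slice order in a pie chart built from value_counts() follows the order of the counts, which are sorted by frequency rather than by label, so the labels must be listed in that same order. A minimal sketch with a hypothetical outcome series (not the diabetes data) shows how to inspect the order and compute the percentages directly:

```python
import pandas as pd

# Hypothetical outcome column: 0 = Not Diabetic, 1 = Diabetic
outcome = pd.Series([0, 0, 0, 1, 0, 1, 0, 1])

# Counts are sorted by frequency, largest first
counts = outcome.value_counts()

# normalize=True returns fractions instead of counts
percentages = (outcome.value_counts(normalize=True) * 100).round(2)

print(counts)
print(percentages)
```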
Thank you for completing this Notebook!

PRACTICE QUIZ: REST APIS, WEB SCRAPING, AND WORKING WITH FILES
Question 1
What is the function of "GET" in HTTP requests?
o Deletes a specific resource
o Returns the response from the client to the requestor
o Sends data to create or update a resource
o Carries the request to the client from the requestor
Correct: GET carries the request to the client
Question 2
What does URL stand for?
o Uniform Resource Learning
o Uniform Resource Locator
o Unilateral Resistance Locator
o Uniform Request Location
Correct: URL acts as a resource locator. That’s why they are also called Web addresses.
Question 3
What does the file extension “csv” stand for?
o Comma Serrated Values
o Comma Separation Valuations
o Common Separated Variables
o Comma Separated Values
Correct: CSV is a data format in which each value is separated by a comma ‘,’.
Question 4
What is web scraping?
o The process to display all data within a URL.
o The process to request and retrieve information from a client.
o The process to describe communication options.
o The process to extract data from a particular website.
Correct: Web scraping implies extraction of data from a web page.

PRACTICE PROJECT: GDP DATA EXTRACTION AND PROCESSING
In this practice project, you will put the skills acquired through the course to use. You will extract
data from a website using web scraping and request APIs, and process it using the Pandas and Numpy
libraries.
To complete this lab, you will utilize JupyterLab running on the Cloud in the Skills Network Labs
environment.
Skills Network Labs (SN Labs) is a virtual lab environment used in this course. Upon clicking the
"Start Lab" button below, your Username and Email will be passed to SN Labs and will be used
in strict accordance with IBM Skills Network Privacy policy, such as for communicating important
information to enhance your learning experience.
In case you need to download the lab instructions click HERE to open a new tab.
Project Scenario:
An international firm that is looking to expand its business in different countries across the world
has recruited you. You have been hired as a junior Data Engineer and are tasked with creating a
script that can extract the list of the top 10 largest economies of the world in descending order
of their GDPs in Billion USD (rounded to 2 decimal places), as logged by the International
Monetary Fund (IMF).
The required data seems to be available on the URL mentioned below:
URL: click in here
Objectives
• Use Webscraping to extract required information from a website.
• Use Pandas to load and process the tabular data as a dataframe.
• Use Numpy to manipulate the information contained in the dataframe.
• Load the updated dataframe to CSV file.

Disclaimer
If you are using a downloaded version of this notebook on your local machine, you may
encounter a warning message as shown in the screenshot below.
Figure 866
This does not affect the execution of your code in any way and can be simply ignored.
Setup
For this lab, we will be using the following libraries:

• pandas for managing the data.
• numpy for mathematical operations.
Importing Required Libraries

We recommend you import all required libraries in one place (here):
import numpy as np
import pandas as pd
# You can also use this section to suppress warnings generated by your code:
def warn(*args, **kwargs):
    pass
import warnings
warnings.warn = warn
warnings.filterwarnings('ignore')

Exercises
Exercise 1
Extract the required GDP data from the given URL using Web Scraping.
URL="https://web.archive.org/web/20230902185326/https://en.wikipedia.org/wiki/List_of_countries_by_GDP_%28nominal%29"
You can use Pandas library to extract the required table directly as a DataFrame. Note that the
required table is the third one on the website, as shown in the image below.
Figure 867

Exercise 2
Modify the GDP column of the DataFrame, converting the value available in Million USD to
Billion USD. Use the round() method of Numpy library to round the value to 2 decimal places.
Modify the header of the DataFrame to GDP (Billion USD).
# Change the data type of the 'GDP (Million USD)' column to integer. Use astype() method.
df['GDP (Million USD)'] = df['GDP (Million USD)'].astype(int)
# Convert the GDP value in Million USD to Billion USD
df[['GDP (Million USD)']] = df[['GDP (Million USD)']]/1000
# Use numpy.round() method to round the value to 2 decimal places.
df[['GDP (Million USD)']] = np.round(df[['GDP (Million USD)']], 2)
# Rename the column header from 'GDP (Million USD)' to 'GDP (Billion USD)'.
# rename() returns a new DataFrame, so assign the result back to df.
df = df.rename(columns = {'GDP (Million USD)' : 'GDP (Billion USD)'})
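The conversion steps can be checked on made-up numbers before running them on the scraped table. This is a minimal sketch under stated assumptions: the frame gdp and its two rows are hypothetical figures, not IMF values, and the GDP column starts out as text, as scraped columns often do.

```python
import numpy as np
import pandas as pd

# Hypothetical figures in Million USD (not the IMF values)
gdp = pd.DataFrame({
    "Country": ["A", "B"],
    "GDP (Million USD)": ["1234567", "890123"],
})

# Text -> integer, then Million USD -> Billion USD rounded to 2 decimals
gdp["GDP (Million USD)"] = gdp["GDP (Million USD)"].astype(int)
gdp["GDP (Million USD)"] = np.round(gdp["GDP (Million USD)"] / 1000, 2)

# rename() returns a new DataFrame, so keep the result
gdp = gdp.rename(columns={"GDP (Million USD)": "GDP (Billion USD)"})
print(gdp)
```

Calling rename() without assigning its result leaves the original header in place, which is an easy mistake to miss in a notebook because the renamed copy is still displayed.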
Exercise 3
Load the DataFrame to the CSV file named "Largest_economies.csv"
# Load the DataFrame to the CSV file named "Largest_economies.csv"

df.to_csv('./Largest_economies.csv')
Congratulations! You have completed the lab.

MODULE 5 SUMMARY: APIS, AND DATA COLLECTION
• Simple APIs in Python are application programming interfaces that provide
straightforward and easy-to-use methods for interacting with services, libraries, or
data, often with minimal configuration or complexity.
o An API lets two pieces of software talk to each other.
o Using an API library in Python entails importing the library, calling its functions
or methods to make HTTP requests, and parsing the responses to access data
or services provided by the API.
o Pandas API processes the data by communicating with the other software
components.
o An Instance forms when you create a dictionary and then use the DataFrames
constructor to create a Pandas object.
o Method “head” displays the specified number of rows from the top of a DataFrame
(default 5), while method “mean” calculates the mean and returns the
values.
• REST APIs allow you to communicate over the internet and take advantage of
resources such as storage, additional data, AI algorithms, and so on.
o HTTP methods transmit data over the internet.
o An HTTP message typically includes a JSON file with instructions for operations.
o HTTP messages containing JSON files are returned to the client as a response
from web services.
o Dealing with time series data involves using the Pandas time series function.
o You can get data for daily candlesticks and plot the chart using Plotly with the
candlestick plot.
• The HTTP (HyperText Transfer Protocol) transfers data, including web pages and
resources, between a client (a web browser) and a server on the World Wide Web.
o An HTTP protocol may include many types of REST APIs
o An HTTP response includes information like the type of resource, length of
resource, and so on
o Uniform resource locator (URL) is the most popular way to find resources on the
web.
o URL is divided into three parts: scheme, internet address or base URL, and route
o The GET method is one of the popular methods of requesting information. Some
other methods may also include the body.
o Response method contains the version and body of the response.
o POST submits data to the server, PUT updates data already on the server,
DELETE deletes data from the server
• Requests is a Python library that allows you to send HTTP/1.1 requests easily
o You can modify the results of your query with the GET method.
o You can pass multiple parameters in a URL, like name, ID, and so on, with a
query string.
• Web scraping in Python involves extracting and parsing data from websites to gather
information for various applications, using libraries like Beautiful Soup and requests.
o HTML comprises text surrounded by elements enclosed in angle brackets,
called tags.
o Each tag name in Python is a class, and each tag is an instance.
o You can select an HTML element on a web page to inspect the webpage.
o Web pages may also contain CSS and JavaScript along with HTML elements.
o Each HTML document is like an HTML Tree, which may contain strings and other
tags.
o Each HTML Table has table tags and is defined with rows, header, body, and so
on
• Tabular data can also be extracted from web pages using the read_html method in Pandas.
• Beautiful Soup in Python is a library for parsing and navigating HTML and XML
documents, making extracting and manipulating data from web pages more accessible.
• To parse a document, pass it through the Beautiful Soup constructor to get a beautiful
soup object representing the document as a nested data structure.
• Beautiful soup represents HTML as a set of tree-like objects with methods to parse the
HTML.
• Navigable string is like a Python string that supports beautiful soup functionality.
• find_all is a method used to extract content based on the tag’s name, its attributes, the
text of a string, or some combination of these.
• The find_all method looks through a tag’s descendants and retrieves all descendants
that match your filters.
• The result is a Python iterable like a list.
• File formats refer to the specific structure and encoding rules used to store and
represent data in files, such as .txt for plain text or .csv for comma-separated values.
• Python works with different file formats such as CSV, XML, JSON, xlsx, and so on
• The extension of a file name will let you know what type of file it is and which
application is needed to open it.
• To access data from CSV files, we can use Python libraries such as Pandas.
• Similarly, different methods help parse JSON, XML, and other files.
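The URL structure and the query string described in the summary can be illustrated with the standard-library urllib.parse module. This is a minimal sketch, assuming a hypothetical endpoint (api.example.com) and made-up parameter values. No request is actually sent.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical endpoint and parameters used only for illustration
base_url = "https://api.example.com/data"
params = {"name": "Joseph", "ID": "123"}

# urlencode turns the dictionary into the text that follows the "?"
full_url = base_url + "?" + urlencode(params)
print(full_url)

# urlparse splits the URL back into its parts
parts = urlparse(full_url)
print(parts.scheme)            # the scheme
print(parts.netloc)            # the internet address (base URL)
print(parts.path)              # the route
print(parse_qs(parts.query))   # the query string as a dictionary of lists
```

The Requests library builds the same kind of query string when a dictionary is passed through its params argument.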

CHEAT SHEET: APIS AND DATA COLLECTION

Accessing element attribute
Description: Access the value of a specific attribute of an HTML element.
Syntax: attribute = element['attribute']
Example: href = link_element['href']

BeautifulSoup()
Description: Parse the HTML content of a web page using BeautifulSoup. The parser type can vary based on the project.
Syntax: soup = BeautifulSoup(html, 'html.parser')
Example:
html = 'https://api.example.com/data'
soup = BeautifulSoup(html, 'html.parser')

delete()
Description: Send a DELETE request to remove data or a resource from the server. DELETE requests delete a specified resource on the server.
Syntax: response = requests.delete(url)
Example: response = requests.delete('https://api.example.com/delete')

find()
Description: Find the first HTML element that matches the specified tag and attributes.
Syntax: element = soup.find(tag, attrs)
Example: first_link = soup.find('a', {'class': 'link'})

find_all()
Description: Find all HTML elements that match the specified tag and attributes.
Syntax: elements = soup.find_all(tag, attrs)
Example: all_links = soup.find_all('a', {'class': 'link'})

findChildren()
Description: Find all child elements of an HTML element.
Syntax: children = element.findChildren()
Example: child_elements = parent_div.findChildren()

get()
Description: Perform a GET request to retrieve data from a specified URL. GET requests are typically used for reading data from an API. The response variable will contain the server's response, which you can process further.
Syntax: response = requests.get(url)
Example: response = requests.get('https://api.example.com/data')

Headers
Description: Include custom headers in the request. Headers can provide additional information to the server, such as authentication tokens or content types.
Syntax: headers = {'HeaderName': 'Value'}
Example:
base_url = 'https://api.example.com/data'
headers = {'Authorization': 'Bearer YOUR_TOKEN'}
response = requests.get(base_url, headers=headers)

Import Libraries
Description: Import the necessary Python libraries for web scraping.
Syntax: from bs4 import BeautifulSoup

json()
Description: Parse JSON data from the response. This extracts and works with the data returned by the API. The response.json() method converts the JSON response into a Python data structure (usually a dictionary or list).
Syntax: data = response.json()
Example:
response = requests.get('https://api.example.com/data')
data = response.json()

next_sibling()
Description: Find the next sibling element in the DOM.
Syntax: sibling = element.find_next_sibling()
Example: next_sibling = current_element.find_next_sibling()

parent
Description: Access the parent element in the Document Object Model (DOM).
Syntax: parent = element.parent
Example: parent_div = paragraph.parent

post()
Description: Send a POST request to a specified URL with data. POST requests create or update resources on the server. The data parameter contains the data to send to the server, often in JSON format.
Syntax: response = requests.post(url, data)
Example: response = requests.post('https://api.example.com/submit', data={'key': 'value'})

put()
Description: Send a PUT request to update data on the server. PUT requests are used to update an existing resource on the server with the data provided in the data parameter, typically in JSON format.
Syntax: response = requests.put(url, data)
Example: response = requests.put('https://api.example.com/update', data={'key': 'value'})

Query parameters
Description: Pass query parameters in the URL to filter or customize the request. Query parameters specify conditions or limits for the requested data.
Syntax: params = {'param_name': 'value'}
Example:
base_url = "https://api.example.com/data"
params = {"page": 1, "per_page": 10}
response = requests.get(base_url, params=params)

select()
Description: Select HTML elements from the parsed HTML using a CSS selector.
Syntax: element = soup.select(selector)
Example: titles = soup.select('h1')

status_code
Description: Check the HTTP status code of the response. The HTTP status code indicates the result of the request (success, error, redirection). It can be used for error handling and decision-making in your code.
Syntax: response.status_code
Example:
url = "https://api.example.com/data"
response = requests.get(url)
status_code = response.status_code

tags for find() and find_all()
Description: Specify any valid HTML tag as the tag parameter to search for elements of that type. Here are some common HTML tags that you can use with the tag parameter.
Tag Example:
- 'a': Find anchor (<a>) tags.
- 'p': Find paragraph (<p>) tags.
- 'h1', 'h2', 'h3', 'h4', 'h5', 'h6': Find heading tags from level 1 to 6 (<h1>, <h2>, and so on).
- 'table': Find table (<table>) tags.
- 'tr': Find table row (<tr>) tags.
- 'td': Find table cell (<td>) tags.
- 'th': Find table header cell (<th>) tags.
- 'img': Find image (<img>) tags.
- 'form': Find form (<form>) tags.
- 'button': Find button (<button>) tags.

text
Description: Retrieve the text content of an HTML element.
Syntax: text = element.text
Example: title_text = title_element.text
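The cheat sheet notes that response.json() yields a Python dictionary or list. A minimal sketch of that conversion follows, using the standard-library json module as a stand-in so that it runs without network access. The response body shown is hypothetical.

```python
import json

# Hypothetical response body, as an API might return it
body = '{"page": 1, "items": [{"id": 7, "name": "alpha"}, {"id": 8, "name": "beta"}]}'

# A JSON object becomes a dict and a JSON array becomes a list
data = json.loads(body)
names = [item["name"] for item in data["items"]]

print(type(data).__name__)
print(names)
```

Once the data is a dictionary or list, ordinary indexing and loops are enough to pull out the fields of interest.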

READING: GLOSSARY: APIS AND DATA COLLECTION
Term Definition
API Key An API key in Python is a secure access token or code used to authenticate and
authorize access to an API or web service, enabling the user to make
authenticated requests.
APIs APIs (Application Programming Interfaces) are a set of rules and protocols that
enable different software applications to communicate and interact, facilitating
the exchange of data and functionality.
Audio file An audio file is a digital recording or representation of sound, often stored in
formats like MP3, WAV, or FLAC, allowing playback and storage of audio
content.
Authorize In Python, "authorize" often means granting permission or access to a user or
system to perform specific actions or access particular resources, often related
to authentication and authorization mechanisms.
Beautiful Soup Objects Beautiful Soup objects in Python are representations of parsed HTML or XML
documents, allowing easy navigation, searching, and manipulation of the
document’s elements and data.
Bitcoin currency Bitcoin is a decentralized digital currency that operates without a central
authority, allowing peer-to-peer transactions on a blockchain network.
Browser A browser is a software application that enables users to access and interact
with web content, displaying websites and web applications.
Candlestick plot A candlestick plot in Python visually represents stock price movements over
time, using rectangles to illustrate the open, close, high, and low prices for a
given period.
Client/Wrapper A client or wrapper in Python is a software component that simplifies
interaction with external services or APIs, encapsulating communication and
providing higher-level functionality for developers.
CoinGecko API The CoinGecko API is a web service that provides cryptocurrency market data
and information, allowing developers to access real-time and historical data for
various cryptocurrencies.
DELETE Method The DELETE method in Python is an HTTP request method used to request the
removal or deletion of a resource on a web server.
Endpoint In Python, an "endpoint" refers to a specific URL or URI that a web service or
API exposes to perform a particular function or access a resource.
File extension A file extension is a suffix added to a filename to indicate the file's format or
type, often used by operating systems and applications to determine how to
handle the file.

find_all In Python, find_all is a Beautiful Soup method used to search and extract all
occurrences of a specified HTML or XML element, returning a list of matching
elements.
GET method The GET method in Python is an HTTP request method used to retrieve data
from a web server by appending parameters to the URL.
HTML HTML (Hypertext Markup Language) is the standard language for creating and
structuring content on web pages, using tags to define the structure and
presentation of documents.
HTML Anchor tags HTML anchor tags in Python are used to create hyperlinks within web pages,
linking to other web pages or resources using the <a> element with the href
attribute.
HTML Tables HTML tables in Python are used to organize and display data in a structured
grid format on a web page, constructed with <table>, <tr>, <th>, and <td>
elements.
HTML Tag An HTML tag in Python is a specific code enclosed in angle brackets used to
define elements within an HTML document, specifying how content should be
presented or structured.
HTML Trees HTML trees in Python refer to the hierarchical structure created when parsing
an HTML document, representing its elements and their relationships, typically
used for manipulation or extraction of data.
HTTP HTTP (HyperText Transfer Protocol) is the foundation of data communication

on the World Wide Web, used for transmitting and retrieving web content
between clients and servers.
http lib The http library in Python (the http package, including http.client) is a standard-library
module that provides low-level classes for sending HTTP requests and reading responses.
Identify In Python, "identify" usually means determining if two variables or objects refer
to the same memory location, which can be checked using the is operator.
Instance In Python, an "instance" typically refers to a specific occurrence of an object or

class, created from a class blueprint, with its own unique set of data and
attributes.
JSON file A JSON (JavaScript Object Notation) file is a lightweight data interchange
format that stores structured data in a human-readable text format, commonly
used for configuration, data exchange, and web APIs.
Mean value The mean value in Python is the average of a set of numerical values,
calculated by adding all values and dividing by the total number of values.
Navigable string In Python, a Navigable String is a Beautiful Soup object representing a string
within an HTML or XML document, allowing for navigation and manipulation
of the text content.
Plotly Plotly is a Python library for creating interactive and visually appealing web-
based data visualizations and dashboards.
PNG file A PNG (Portable Network Graphics) file is a lossless image format in Python
that is commonly used for high-quality graphics with support for transparency
and compression.

POST method The POST method in Python is an HTTP request method used to send data to a
web server, often used for submitting form data and creating or updating
resources.
Post request A POST request in Python is an HTTP method used to send data to a web
server for the purpose of creating or updating a resource, typically used in web
applications and APIs.
PUT method The PUT method in Python is an HTTP request method used to update an
existing resource on a web server by replacing or modifying it.
Py-Coin-Gecko Py-Coin-Gecko is a Python library that provides a convenient interface for

accessing cryptocurrency data and information from the CoinGecko API.
Python iterable A Python iterable is an object that can be looped over, typically used in for
loops, and includes data structures like lists, tuples, and dictionaries.
Query string A query string in Python is a part of a URL that contains data or parameters to
be sent to a web server, typically used in HTTP GET requests to retrieve specific
information.
rb mode In Python, "rb" mode is used when opening a file to read it in binary mode,
allowing you to read and manipulate non-text files like images or binary data.
Resource In Python, a "resource" typically refers to an external entity such as a file,

database connection, or network object that can be managed and manipulated
within a program.
Rest API A REST API in Python is a web-based interface that follows the principles of
Representational State Transfer (REST), allowing communication and data
exchange over HTTP using standard HTTP methods and data formats.
Service instance In Python, a "service instance" typically refers to an instantiated object or entity
representing a service, enabling interaction with that service in a program or
application.
Timestamp A timestamp is a representation of a specific moment in time, often expressed

as a combination of date and time, used for record-keeping and data tracking.
Transcribe "Transcribe" typically means converting spoken language or audio into written
text, often using automatic speech recognition (ASR) technology.
Unix timestamp A UNIX timestamp is a numerical value representing the number of seconds
that have elapsed since January 1, 1970, 00:00:00 UTC, used for time-keeping
in Unix-based systems and programming.
url (Uniform Resource Locator) In Python, a URL (Uniform Resource Locator) is a web address that
specifies the location of a resource on the internet, typically consisting of a protocol,
domain, and path.
urllib The "urllib" library in Python is used for working with URLs and making HTTP
requests, including functions for fetching web content, handling cookies, and
more.
Web service Web services in Python are software components that allow applications to
communicate over the internet by sending and receiving data in a
standardized format such as XML or JSON, typically over a protocol like HTTP.

Web scraping Web scraping in Python is the process of extracting data from websites by
parsing and analyzing their HTML structure, often done with libraries like
BeautifulSoup or Scrapy.
xlsx An XLSX file is a file format used for storing spreadsheet data in Excel,
containing worksheets, cells, and formulas in a structured manner.
xml XML (Extensible Markup Language) is a text-based format for storing and
structuring data using tags, often used for data interchange and configuration
files.
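The glossary entries for Timestamp and Unix timestamp can be made concrete with the standard-library datetime module. This is a minimal sketch. The value 1_700_000_000 is an arbitrary number of seconds chosen for illustration.

```python
from datetime import datetime, timezone

# Second zero of Unix time is the start of 1 January 1970 in UTC
epoch = datetime.fromtimestamp(0, tz=timezone.utc)
print(epoch.isoformat())

# Arbitrary example value: seconds elapsed since the epoch
unix_seconds = 1_700_000_000
moment = datetime.fromtimestamp(unix_seconds, tz=timezone.utc)
print(moment.isoformat())

# Going the other way recovers the original number of seconds
recovered = int(moment.timestamp())
print(recovered)
```

Passing tz=timezone.utc keeps the result independent of the time zone of the machine running the code.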

FINAL EXAM
INSTRUCTIONS
Exam Instructions
1. This exam is worth 50% of your entire grade for the course.
2. There is no pass/fail for the exam itself, but the grade you get will affect your overall
passing grade for the course.
3. Time allowed: 1 hour
4. Attempts per question:
o One attempt - For True/False questions
o Two attempts - For any question other than True/False
5. Clicking the "Final Check" button when it appears, means your submission is FINAL. You will
NOT be able to resubmit your answer for that question ever again.
6. Check your grades in the course at any time by clicking on the "Progress" tab.
IMPORTANT: Do not let the time run out and expect the system to grade you
automatically. You must explicitly submit your answers; otherwise they will be
marked as incomplete.

COURSE WRAP UP
CONGRATULATIONS AND NEXT STEPS

Congratulations on completing this course. We hope you enjoyed it.
As a next step, you can take the appropriate follow-on Python Project from the list below to
apply your newfound skills in a real-world scenario.
• Python for Data Science Project
• Python for Data Engineering Project
• Python for AI and Development Project
Note: Successful completion of this course is a prerequisite for these Python Project
courses.
If you are looking to start a career in Data Science, Data Engineering or AI & Application
Development, note that this course is part of the following Professional Certificates which are
designed to empower you with the skills to become job-ready in these fields.
• IBM Data Analyst Professional Certificate
• IBM Data Science Professional Certificate
• IBM Data Engineering Professional Certificate
• IBM Full Stack Developer Professional Certificate
• DevOps and Software Engineering Professional Certificate
• Applied Data Science with R Professional Certificate
We encourage you to leave your feedback and rate this course.
Good luck!

CONGRATULATIONS!
You have completed your course. Share your success on social media or email.

CREDITS AND ACKNOWLEDGMENTS
Primary Instructor
• Joseph Santarcangelo (IBM)
Other Contributors & Staff
Project Lead
• Rav Ahuja (IBM)
Instructional Designer
• Heather Vaughan (Skill-Up Technologies)
Lab Authors
• Azim Hirjani (IBM)
Production Team
Project Coordinator
• Simranjit Singh (Skill-Up Technologies)
• Charlie Money (Skill-Up Technologies)
Publishing
• Grace Barker (IBM)
• Rachael Jones (Skill-Up Technologies)
• Eboney Hinds (Skill-Up Technologies)
• Sunny Anderson (Skill-Up Technologies)
QA
• Pradnya B (Skill-Up Technologies)
Narration
• Bella West (Skill-Up Technologies)
Video Production
• Simer Preet (Skill-Up Technologies)
• Lauren Hall (Skill-Up Technologies)
• Hunter Bay (Skill-Up Technologies)
• Tanya Singh (Skill-Up Technologies)
• Om Singh (Skill-Up Technologies)
Teaching Assistants and Forum Moderators
• Lavanya T S (Skill-Up Technologies)
• Malika Singla (Skill-Up Technologies)
• Lakshmi Holla (Skill-Up Technologies)
• Anita Verma (Skill-Up Technologies)
Compilation
• Osvaldo Alencar (May 31, 2024 – 7:40 pm)

Note: Videos available at edx.org

Module  Lab  Video  Subject                                               Time (min)  Total Time (h:m)
        ✓    3      Writing Your First Python Code                        10
        ✓    4      Working with Types in Python                          10
             10     Reading: Conditions and Branching                     10
             11     Introduction to Loops in Python                       10
             12     Reading: Exploring Python Functions                   15
3       ✓    12     Functions in Python                                   40          3:55
             13     Reading: Exception Handling                           10
             14     Reading: Objects and Classes                          10
        ✓    14     Hands-on Lab: Objects and Classes                     40
        ✓    14     Practice Lab: Text Analysis                           45
        ✓    15     Hands-On Lab: Reading Files with Open                 40
        ✓    16     Hands-On Lab: Writing Files with Open                 25
        ✓    18     Practice Lab: Selecting Data in a DataFrame           15
        ✓    19     Hands-On Lab: One Dimensional Numpy                   40
        ✓    20     Hands-On Lab: Two Dimensional Numpy                   30
        ✓    21     Hands-On Lab: Introduction to API                     15
        ✓    23     Hands-on Lab: Access REST APIs & Request HTTP         15
5       ✓    25     Hands-on Lab: Web Scraping                            30          2:00
        ✓    26     Hands-on Lab: Working with Different File Formats     30
        ✓    26     Practice Project: GDP Data Extraction and Processing  30
Total Time (h:m)                                                                      10:40

Estimated time needed: 10 minutes

After completing this lab you will be able to:

• Say 'Hello' to the world in Python

Say 'Hello' to the world in Python

# Try your first Python output
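
Only the comment line of this lab cell survives in this copy, so the statement below is a minimal sketch of what a first Python output usually looks like, not the lab's original code. The variable name `message` and the greeting text are assumptions.

```python
# Try your first Python output
message = "Hello, Python!"  # assumed greeting text
print(message)              # print() writes the string to standard output
```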

What version of Python are we using?

# Check the Python Version
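
The code for this cell is not preserved here. One common way to check the running interpreter's version, shown as a sketch under that assumption, uses the standard `sys` module; the variable name `version_info` is chosen for illustration.

```python
# Check the Python Version
import sys

# sys.version is a human-readable string describing the interpreter build.
print(sys.version)

# sys.version_info exposes the same information as separate numeric fields.
version_info = sys.version_info
print(f"Running Python {version_info.major}.{version_info.minor}")
```

The printed text depends on the installed interpreter, so no expected output is shown.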

Writing comments in Python

# Practice on writing comments
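
As a short illustration of the comment syntax this lab section practices (a sketch with an assumed variable name, `greeting`, rather than the lab's own cell), anything after a `#` on a line is ignored by the interpreter:

```python
# Practice on writing comments
# A full-line comment like this one is never executed.
greeting = "Hello, Python!"  # an inline comment can follow a statement

# print("This call is commented out, so it does not run")
print(greeting)  # only this call produces output
```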

# Print string as error message