STAT 451: Intro to Machine Learning
Lecture Notes
Sebastian Raschka
Department of Statistics
University of Wisconsin–Madison
http://stat.wisc.edu/~sraschka/teaching/stat451-fs2020/
Fall 2020
Contents
3 Using Python
3.1 Introduction
3.1.1 About this Lecture
3.1.2 Python
3.2 Setup
3.2.1 Installing Python
3.2.2 Managing Environments
3.2.3 Installing and Updating packages
3.3 Running Python Code
3.3.1 Interpreter/REPL
3.3.2 IPython
3.3.3 Scripts (.py files)
3.3.4 Jupyter Notebooks
3.3.5 JupyterLab
3.3.6 Jupyter Notebooks and Homework Submissions
3.4 Relevant Python Topics
3.4.1 Basic Types
3.4.2 Operators
3.4.3 Strings
3.5 Data Structures
3.5.1 List
3.5.2 Dictionary
3.5.3 Set
3.5.4 Tuple
3.6 Conditionals
3.7 Iteration
3.7.1 Generators
3.7.2 Comprehensions
3.8 Functions
3.9 Classes
3.10 Command Line Arguments via Scripts
3.11 Reading and Writing files
3.12 Importing Libraries
3.13 Standard Library
3.14 GIL and Multiprocessing
3.15 Subprocesses
3.16 Exceptions
3.17 Debugging
3.18 Resources
3 Using Python
3.1 Introduction
3.1.1 About this Lecture
This lecture provides a brief overview of the Python programming language. While Python
is among the programming languages that are easiest to learn – one of the reasons why it has
become so popular across different fields – it is impossible to cover all essential aspects of
the language adequately within an hour, or even during the course of this semester (since
we primarily want to focus on machine learning, not programming). Hence, it is expected
that you already have basic programming knowledge and experience with coding up simple
problems, as listed in the course prerequisites. No worries if you have not done much
programming yet; having worked with R is also useful experience.
As mentioned at the beginning of this course and listed on the course website, students who
have not used Python before will need to spend a few hours becoming more familiar with
it by working through one of the recommended resources listed on the course website [1] or
other books or courses, which I shared about a month ago via the mailing list. However,
if you haven’t had a chance to look at one or two of these recommended resources, it is by
no means too late to learn the basics for this course (you can easily catch up and learn the
basics over a single weekend).
Throughout this lecture, it is important to keep in mind that we are not trying to learn
Python from a computer science, programming, or software engineering perspective. In this
course, our objective is to use Python as a scientist or researcher with scientific computing
needs in mind. People often say that programming is a computer science topic, which I
disagree with. Learning a programming language is similar to learning how to use a pen,
and learning how to write words with a pen does not automatically make us novelists.
That said, many computer science topics are very relevant to machine learning (for
example, complexity theory and big-O notation, which we briefly covered in the last lecture),
so it is highly recommended to read an introductory computer science textbook if you would
like to pursue machine learning further after this course.
1 http://stat.wisc.edu/~sraschka/teaching/stat451-fs2020/
Sebastian Raschka STAT451 FS20. L03: Using Python Page 4
That being said, this lecture and the following lectures will provide a quick overview of the
relevant topics that you will need for several homework exercises and as a foundation for using
the scientific computing libraries for machine learning. Note that this lecture mainly covers
the Python language itself, whereas the next lecture will focus more on scientific computing
libraries for Python, which we will be using to implement and use various machine learning
algorithms covered in this class.
3.1.2 Python
In short, Python [2] is an interpreted, dynamically typed language that does not require
static type declarations. In that sense, while being a multi-purpose programming language,
it is more similar to the R language than to the C programming language [3], which has a
static type system.
For example, consider a simple program that prints the string “a+b=c” and the result of
the integer addition “1+2.” If we were to implement that in its simplest way in C, we would
first have to write a file with the following contents (note the static type declarations):
#include <stdio.h>

int main(void)
{
    int result;              /* statically typed integer */
    char word[6] = "a+b=c";  /* 5 characters plus the terminating '\0' */

    result = 1 + 2;
    printf("%s\n", word);
    printf("%d\n", result);
    return 0;
}
Then, we would need to compile it into a program and run it. For example, if we saved the
file as example.c, we could compile it via the GNU compiler tools, GCC [4]:
$ gcc example.c -o example
Next, we would execute the compiled program, example, from the command line [5]:
$ ./example
a+b=c
3
2 The name “Python,” despite its current logo, is unrelated to the snake but is derived from the Monty Python
comedy group and their show, Monty Python’s Flying Circus, which was popular in the 1970s and 1980s.
3 While there are many different Python interpreters out there, the official Python interpreter is itself
written in C.
4 https://gcc.gnu.org
5 Lines starting with a “$” symbol indicate that we execute a command in a command line terminal.
In Python, the equivalent program requires neither type declarations nor a separate
compilation step:
result = 1 + 2
word = "a+b=c"
print(result)
print(word)
We can execute the previous code in an interactive Python interpreter, or we can copy and
paste it into a text file (for example, example.py) and then run it as a script by entering
the following command in a command line terminal:
$ python example.py
Some more details about executing Python code will follow later.
Here, the main point of this section is that Python is a dynamic, interpreted programming
language. This means that we do not have to specify the types of the variables in Python,
and we do not need to compile anything in order to obtain the results. Also, as illustrated
above, instead of writing code interactively, we can write the code in a .py script file, which
we can execute using a Python interpreter (e.g., python script.py). But again, no separate
compilation step is required. Python, as a dynamic, interpreted language, is very flexible
and convenient to use, which makes it especially attractive for scientific computing.
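As a small illustration of what "dynamic" means here, the sketch below rebinds the same name to values of two different types. The variable names are chosen only for this example.

```python
# A name is not tied to one type; it can be rebound at runtime.
value = 1 + 2
first_type = type(value).__name__    # type of the integer result

value = "a+b=c"
second_type = type(value).__name__   # type after rebinding to a string

print(first_type, second_type)  # int str
```

No declaration such as `int result;` is needed, because the interpreter determines the type from the object that the name currently refers to.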
One downside of dynamically typed, interpreted languages is that they generally perform
computations much slower, often by orders of magnitude, than statically typed, compiled
languages. However, in the next lecture, we will work with libraries that implement the
more “expensive” computations in C or Fortran code and use Python as a so-called “glue”
or “wrapper” language. This way, by calling functions in Python that have been implemented
in lower-level programming languages, we leverage Python’s convenient syntax while using
computationally efficient code.
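The following is a rough sketch of the "glue language" idea using only the standard library. It assumes the reference interpreter (CPython), where the built-in sum function is implemented in C. The helper python_sum and the list size are made up for this illustration, and the measured times will differ from machine to machine.

```python
import timeit


def python_sum(values):
    """Add up the values with an explicit Python-level loop."""
    total = 0
    for v in values:
        total += v
    return total


data = list(range(100_000))

# Both approaches compute the same result ...
assert python_sum(data) == sum(data)

# ... but the built-in delegates the loop to compiled code.
loop_time = timeit.timeit(lambda: python_sum(data), number=20)
builtin_time = timeit.timeit(lambda: sum(data), number=20)
print(f"Python loop: {loop_time:.4f} s, built-in sum: {builtin_time:.4f} s")
```

The scientific computing libraries covered in the next lecture follow the same pattern on a larger scale.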
Another downside of dynamic typing is that errors, except for syntax errors, are only raised
at runtime. In certain scenarios, this can have important implications. For example,
consider the following Python code snippet:
if cond:
    text = "abc" + 123
else:
    text = "abc" + "!!!"
The line nested under the if statement gets executed if a certain condition cond is true;
otherwise, the line that follows the else statement gets executed instead. Now, there is an
illegal expression in this code example, "abc" + 123: we cannot add an integer-type value
to a string-type value. However, if the condition (cond) is always false, the code runs just
fine because the offending line is never executed – no harm done. The worst-case scenario,
though, is that cond is only rarely true: because the error then occurs very rarely, it may
slip through our test suites if we are not very thorough. In contrast, equivalent errors in
statically typed languages such as C are usually caught by the compiler so that we can fix
them before we run, share, “ship,” or deploy our code.
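To make this concrete, here is a minimal sketch that wraps the snippet in a function (build_text is a name invented for this example) and calls it with both outcomes of the condition. The error only surfaces on the call that actually reaches the faulty line.

```python
def build_text(cond):
    if cond:
        text = "abc" + 123      # illegal: str + int, but only checked when executed
    else:
        text = "abc" + "!!!"
    return text


print(build_text(False))  # abc!!!

try:
    build_text(True)
except TypeError as err:
    # The mistake is reported at runtime, not before the program starts.
    print("TypeError:", err)
```

A test suite that never exercises the cond == True branch would not notice the problem.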
3.2 Setup
This section provides an overview of the different ways Python can be installed and set
up. Please do not follow these instructions on your first read-through. Read through the
complete “Setup” section first as it lists several alternative approaches. Once you have read
the section, you can revisit it and set up Python the way you prefer.
Note that many operating systems already come with a default Python installation.
While the default Python version on macOS is relatively outdated (some old version of
Python 2.7), most Linux distributions come with a relatively old version of Python 3.
In general, you can locate the pre-installed Python interpreter by executing
$ which python
and print its version by executing
$ python --version
Python 2.7 vs. Python 3
For those who are curious why there is a debate: about 10 years ago, Python developers
wanted to add substantial improvements to the Python language. However, these improvements
would have been backward incompatible. Hence, people would have been forced to rewrite
certain sections of their “old” code in order to use the latest Python versions, which can be
cumbersome if you have developed large code bases. Hence, the Python community decided
to branch off and develop Python 3 separately, while maintaining Python 2.7 with minimal
updates (bug fixes, security updates, etc.).
Instead of making the switch to Python 3, many people kept using Python 2.7. However,
10 years after the introduction of Python 3, more people use Python 3 than Python 2.7
(according to most surveys you can find on the internet), and almost all major libraries
have been ported to Python 3 [6]. In fact, many major libraries are going to drop (or have
already dropped) Python 2.7 support [7]. Furthermore, Python 2.7 lost official support on
January 1, 2020, which means that it no longer receives any updates (no bug fixes, no
security updates, etc.) [8].
In sum, there is no point in learning Python 2.7 now, and we will be using Python 3 in this
class [9]. In particular, I created and tested all the code for this class in Python 3.8, which
was released in October 2019. There is a beta version of Python 3.9 already, but the final
version has not been released yet while I am writing this. However, in case you are installing
Python 3.9: since Python 3 minor versions are largely backward compatible, the code should
also run fine on Python 3.9. The code may even work on Python 3.7 or 3.6 (however, I have
not tested it explicitly). Thus, I recommend installing Python 3.8 to be on the safe side.
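If you are unsure which version a given interpreter provides, you can also ask Python itself. The sketch below uses the standard sys module; the minimum version (3, 6) is only an assumption taken from the untested lower bound mentioned above, not an official course requirement.

```python
import sys

# Hypothetical lower bound for this illustration.
MIN_VERSION = (3, 6)

current = sys.version_info[:3]          # e.g., (3, 8, 5)
version_ok = sys.version_info >= MIN_VERSION

print("Running Python", ".".join(str(part) for part in current))
if not version_ok:
    print("Warning: this interpreter is older than", MIN_VERSION)
```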
Windows, Linux, and macOS
Unfortunately, I am not very familiar with the Windows operating system; hence, this
tutorial is more geared towards Linux and macOS [10]. In general, the Python interpreter
should work on Windows in the same way as it does on Linux and macOS/Unix. However,
there will be certain differences when working with data files on your computer’s storage
disk. For instance, Linux and macOS/Unix use forward slashes
6 http://py3readiness.org
7 http://python3statement.org
8 https://pythonclock.org
9 In case you are interested, here is an article that covers some of the most relevant differences between
3.2.1 Installing Python
python.org
The “official” way to install Python is to obtain an official distribution from the python.org
website. You can find Python installers for several different operating systems and versions
of Python at https://www.python.org/downloads/ (I recommend using one of the most recent
versions; right now, the most recent version is 3.8.5). If you choose to install Python from
there, please follow the instructions provided on the https://www.python.org website.
3.2.2 Managing Environments
Another important aspect of using Python is managing environments and packages. This is
especially useful if we are working on different projects that each require different Python
packages and different versions thereof. Essentially, a virtual environment is like a
“container” on your computer that contains only those libraries that are relevant for a given
project, and you can only use them if the environment is “active.” If you have multiple
projects (or take multiple Python-related classes), virtual environments are a powerful
organizational tool.
virtualenv
Probably the most widely used tool for creating and managing Python environments is
virtualenv. You can find out more about virtualenv on its website:
https://virtualenv.pypa.io/en/stable/.
conda
If you are using Anaconda or Miniconda, it is recommended to use the conda package
managing tool that comes with it. With conda, we can create and manage virtual
environments similarly to virtualenv. For example, below are the commands for creating and
activating a virtual environment that we call “stat451” just for this class:
$ conda create -n stat451 python=3.8
$ source activate stat451
Note that we have to execute source activate stat451 each time we open a new command
line terminal. Otherwise, the default environment will be used.
You can see that a virtual environment is active based on your command line prompt as
illustrated below.
Before:
After:
(stat451) sebastian@Sebastians-MacBook-Pro:~$
For more information about conda and managing virtual environments, please see
https://conda.io/docs/user-guide/tasks/manage-environments.html.
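Besides looking at the prompt, we can inspect the active environment from within Python. The sketch below is an illustration under two assumptions: that a virtualenv-style environment makes sys.prefix differ from sys.base_prefix, and that conda exports the CONDA_DEFAULT_ENV environment variable when an environment is activated. The helper name describe_environment is invented for this example.

```python
import os
import sys


def describe_environment():
    """Collect a few hints about which Python environment is in use."""
    return {
        "executable": sys.executable,                      # interpreter being run
        "prefix": sys.prefix,                              # environment root directory
        "in_venv": sys.prefix != sys.base_prefix,          # virtualenv/venv hint
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),  # None outside conda
    }


info = describe_environment()
for key, value in info.items():
    print(f"{key}: {value}")
```

If the printed executable does not live inside the environment you expected, the environment is most likely not active in the current terminal.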
3.2.3 Installing and Updating packages
There are two recommended ways of installing Python packages, which will be introduced
in the next subsections.
pip
pip is the official Python package installer. While pip is a Python library or package
itself, you can directly use it from the command line as a standalone program. For example,
to install the NumPy package that we will be using next lecture, you can use the following
command:
$ pip install numpy
Note that since pip is also a Python package, entering pip on the command line terminal
is the same as running python -m pip, that is, running pip as a module via the -m flag.
This may be useful if your command line terminal does not recognize the pip command.
For example, NumPy can be installed this way as follows:
$ python -m pip install numpy
For more information about pip, please see https://pip.pypa.io/en/stable/ and refer to
https://pip.pypa.io/en/stable/installing/ in case you encounter problems with using pip on
your command line.
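After installing a package, it can be handy to confirm from within Python that the active interpreter actually sees it. Below is a minimal sketch using the standard library (importlib.metadata requires Python 3.8 or newer). The helper installed_version is a name made up for this example, and it assumes that the import name and the distribution name are identical, which is not true for every package.

```python
import importlib.util
from importlib import metadata  # available in Python 3.8+


def installed_version(package_name):
    """Return the version string, "unknown" if importable but unversioned,
    or None if the package cannot be found by this interpreter."""
    if importlib.util.find_spec(package_name) is None:
        return None
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        # Importable, but without distribution metadata under this name
        # (for example, standard library modules).
        return "unknown"


for name in ["numpy", "json"]:
    print(name, "->", installed_version(name))
```

If this prints None for a package you just installed, the package was probably installed into a different environment than the one that is currently active.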
conda
If you are using Anaconda or Miniconda, it is highly recommended to use the conda
package manager to install Python packages instead of pip. The Anaconda team provides
pre-compiled versions of Python packages to ensure the best compatibility with your
environment and operating system, and conda handles complex dependencies between
different packages if necessary.
The usage is very similar to pip. To install the NumPy package that we will be using in the
next lecture, you can use the following command:
$ conda install numpy
While most major packages for scientific computing are available via conda, you may find
that you need packages that are not available through the conda installer. Note that if
you type conda install followed by a package name, the package is fetched from the official
Anaconda channel. However, conda also allows us to specify channels to download packages
from other sources. One of these is the community project conda-forge, which provides
additional packages for conda that are not available via the official Anaconda channel. One
such example is the mlxtend package, which we will be using for one of the homework
exercises to visualize 2D decision regions of scikit-learn classifiers. To install mlxtend from
the conda-forge channel, you can use the -c (channel) flag as follows:
$ conda install mlxtend -c conda-forge
Even if you are primarily using conda, you can still install packages via pip. Below, I listed
my recommended order of approaches to try when installing a new package:
3.3 Running Python Code
There are many different ways to execute Python code. The following subsections list some
of these. While you are welcome to use any approach you prefer, the problem sets
(homeworks) will be handed out as Jupyter Notebooks [14]. Also, you are expected to
hand in your homework in the form of Jupyter Notebooks plus an HTML file created from
the Jupyter Notebook. The overall procedure will be discussed in more detail in class when
I hand out the first problem set.
3.3.1 Interpreter/REPL
The simplest way to use Python is via the so-called “Read-eval-print loop” (REPL). The
REPL essentially means that we are executing Python code in an interactive session:
$ python
Python 3.8.5 (default, Sep 4 2020, 02:22:02)
[Clang 10.0.0 ] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.
14 Please see the official documentation for more details: https://jupyter-notebook.readthedocs.io/en/stable/
>>> print(1 + 2)
3
>>> for i in range(5):
...     print(i)
...
0
1
2
3
4
>>>
• Starting a line with a “$” character refers to a command line prompt in a terminal
(for example, a Linux or Unix shell).
• A “>>>” at the start of a line refers to a prompt in a Python interpreter.
• An “...” at the beginning of a line indicates the continuation of the input command
that was initiated by the previous >>> prompt.
The REPL is useful if we want to evaluate a small number of expressions, for example, but it
is not recommended for doing “heavy lifting,” that is, running more extensive code examples
or programs.
3.3.2 IPython
IPython stands for “interactive” Python, and using IPython over the standard Python REPL
has many advantages – for example, the so-called “magics,” which are extra commands
for our convenience [15]. Another of my favorite IPython features is that we can use the Tab
key to autocomplete function and variable names.
IPython can be installed using conda. For example,
$ conda install ipython
For more information and installation instructions, please refer to the official documentation
at https://ipython.org/install.html.
Once installed, we can start an IPython session by invoking the ipython command from the
command line:
$ ipython
Python 3.8.5 (default, Sep 4 2020, 02:22:02)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.18.1 -- An enhanced Interactive Python. Type '?' for help.
While all default Python language and interpreter features also work in IPython, IPython
has a nicer “help” documentation compared to the help() function in Python, which we
can invoke via the ? command. For example, if we want to find out more about Python’s
sorted() function, we can simply type sorted?, as shown below:
15 https://ipython.org/ipython-doc/3/interactive/magics.html
In [1]: sorted?
Signature: sorted(iterable, /, *, key=None, reverse=False)
Docstring:
Return a new list containing all items from the iterable in ascending order.
A custom key function can be supplied to customize the sort order, and the
reverse flag can be set to request the result in descending order.
Type: builtin_function_or_method
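To connect the help text to actual usage, here is a short sketch of the key and reverse arguments described in the docstring above. The example list is made up.

```python
words = ["pear", "fig", "banana"]

print(sorted(words))                          # ['banana', 'fig', 'pear']
print(sorted(words, key=len))                 # ['fig', 'pear', 'banana']
print(sorted(words, key=len, reverse=True))   # ['banana', 'pear', 'fig']

# sorted() returns a new list; the original is left unchanged.
print(words)                                  # ['pear', 'fig', 'banana']
```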
Also, IPython allows us to use Linux/Unix commands directly within the Python session
– if we start a command with “!”, IPython interprets the command after the ! as a shell
(Linux/Unix) command. For example, !ls lists the subdirectories and files in the
current working directory:
In [15]: !ls
Creative Cloud Files OneDrive code
Desktop Pictures custom-settings
Documents Public miniconda3
Downloads ...
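The ! syntax is specific to IPython. In a plain Python script or the standard REPL, a rough equivalent of !ls can be sketched with the standard os module, as shown below; the output naturally depends on the directory you run it in.

```python
import os

# List the entries of the current working directory, similar to `ls`.
entries = sorted(os.listdir("."))
for name in entries:
    kind = "dir " if os.path.isdir(name) else "file"
    print(kind, name)
```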
3.3.3 Scripts (.py files)
If we are developing more extensive analyses or programs, we want to keep most of our code
in some sort of file, which has certain advantages. For example, if we execute a long series of
commands in the REPL or in IPython to produce certain results, it would be cumbersome
to reproduce the results if we want to perform the same or a similar analysis again (for
example, imagine you want to rerun all the previous code after adding a few extra entries
to the dataset). If we kept the code in a file instead, it is easier to
If you are creating Python (.py) files, you can, of course, use any text editor you like.
However, it is highly recommended to use a text editor that at least offers
programming-language-specific syntax highlighting to make your life easier. Common and good choices
for text editors are Visual Studio Code [16], Atom [17], and Sublime Text [18], for example. My
personal preference is Visual Studio Code.
Of course, there are also specific IDEs (Integrated Development Environments) for Python
that provide additional convenience functions. Commonly used Python IDEs are PyCharm [19]
and Spyder [20]. However, for many tasks in scientific computing, IDEs are considered
“overkill” and would be something to consider for more advanced Python-based software
development.
3.3.4 Jupyter Notebooks
Using Jupyter Notebooks for writing code is like using Microsoft Word documents for writing
text: it is very convenient if you want to have and view everything in one file.
Jupyter Notebooks are particularly handy for conducting data analyses (because they allow
you to add notes, figures, and plots).
Originally, Jupyter Notebook [21] was developed as an interactive document on top of IPython
– back then, it was called IPython Notebook [22]. However, over the years, the developers
extended the “Notebook” concept to also support other programming languages, such as
Julia and R [23].
You can think of Jupyter Notebooks as an interactive environment similar to IPython.
However, in addition to having an interactive IPython session, Jupyter Notebooks are also
“documents” that allow us to add text, LaTeX equations, figures, and so forth.
The reason why Jupyter notebooks are so popular within the scientific computing community
is that they make it easy to save, present, and share a data analysis in a single, executable
file.
We discussed how to set up and use notebooks live in class. Since Jupyter notebooks are such
an interactive concept, it is probably most effective if you consider a video tutorial as a
reference rather than text. For example, Corey Schafer shares a good video tutorial for
setting up Jupyter Notebook on YouTube at https://www.youtube.com/watch?v=HW29067qVWk.
3.3.5 JupyterLab
While Jupyter Notebook is the “original” application for working with Jupyter notebooks,
it is not the only one. Analogously, Microsoft Word is a program to open Word (.doc, .docx)
files, but those files can also be opened by other applications (like OpenOffice or LibreOffice).
Recently, a new, officially supported interface for Jupyter notebooks was released called
JupyterLab [24]. JupyterLab is a modernized version of Jupyter Notebook that adds some
more convenience features on top of it. You are welcome to use it since .ipynb files are
compatible with both Jupyter Notebook and JupyterLab.
16 https://code.visualstudio.com
17 https://atom.io
18 https://www.sublimetext.com
19 https://www.jetbrains.com/pycharm/
20 https://www.spyder-ide.org
21 https://jupyter-notebook.readthedocs.io/en/stable/
22 This is the reason why we still use the file ending '.ipynb' for Jupyter notebooks.
23 The term Jupyter is basically something like an acronym of the terms Julia, Python, and R
24 https://jupyterlab.readthedocs.io/en/latest/
3.3.6 Jupyter Notebooks and Homework Submissions
As discussed in the lecture, when it comes to computing, we will be mostly working with
Jupyter notebooks in this course. You will receive the homework assignments, questions,
and starter code as Jupyter notebooks (.ipynb files). You are then expected to hand
the solutions back to us in the form of Jupyter notebooks, that is, the original notebook
files modified with your solutions.
We (the TA and I) will be viewing your answers to the homework assignments
using JupyterLab, too. Also, we will run your code on our computers to make sure that the
code you provide in certain assignments actually works. Thus, please make sure that your
notebooks can be executed sequentially.
For working on the homework and submitting it, please use the following steps:
• Do not modify the cells that contain the original question/assignment text or code
cells that start with the line “# DO NOT MODIFY THIS CELL.”
• After you have finished working on your notebook, make a copy of the notebook, and
click on the “Restart & Run All Cells” button under the “Cells” tab in the menu bar to
check that all your code can be executed in sequential order.
• If everything works as expected, export the notebook as an HTML file (click on File ->
Download As -> HTML (.html)).
• Upload both the .ipynb and the .html file of the notebook with your homework
solutions via the Canvas Assignment page for this homework.
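As an optional sanity check before uploading, the sequential-execution requirement above can be approximated in code. The sketch below is not part of the official submission procedure. It assumes the common notebook JSON layout (format version 4), in which a notebook has a top-level "cells" list and each code cell stores an "execution_count". The function name executed_in_order and the file name in the comment are hypothetical.

```python
import json


def executed_in_order(notebook):
    """Return True if all non-empty code cells are numbered 1, 2, 3, ...
    in document order, as they are after a fresh restart-and-run-all."""
    expected = 1
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = "".join(cell.get("source", []))
        if not source.strip():
            continue  # empty code cells are never executed
        if cell.get("execution_count") != expected:
            return False
        expected += 1
    return True


# For a real file (hypothetical name), one could load it like this:
# with open("hw1.ipynb", encoding="utf-8") as f:
#     print(executed_in_order(json.load(f)))

# Small in-memory example instead of a real file:
example = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Title"]},
        {"cell_type": "code", "source": ["x = 1"], "execution_count": 1},
        {"cell_type": "code", "source": ["print(x)"], "execution_count": 2},
    ]
}
print(executed_in_order(example))  # True
```

A result of False usually means that some cells were run out of order or not at all, in which case restarting and rerunning everything is the safest fix.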
3.4 Relevant Python Topics
There are many really good Python resources out there, as we discussed. Also, since this is
a machine learning course, we cannot spend too much time on learning Python in this course.
Below is a list of Python concepts that I would consider most relevant.
If you are already familiar with Python, my recommendation is to read through the list below
and check for yourself whether the subjects make sense to you. If the majority of these do,
you can read up on individual concepts using the Python documentation, for example.
If most of this is new to you, you should consider spending a few hours working through a
Python learning resource – consider those listed on the course website.
Also, consider the excellent, official, and free Python tutorial as a reference resource, as well
as the official Python documentation for learning about Python:
3.4.2 Operators
Arithmetic Operators
• +, -, *, /, //, **, . . .
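A few of the arithmetic operators listed above in action (a minimal sketch; the numbers are arbitrary):

```python
print(7 / 2)      # 3.5   true division always returns a float
print(7 // 2)     # 3     floor division
print(-7 // 2)    # -4    floor division rounds towards negative infinity
print(7 % 2)      # 1     remainder (modulo)
print(2 ** 10)    # 1024  exponentiation
print(1 + 2 * 3)  # 7     * binds more tightly than +
```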
3.4.3 Strings
Basics
• single quote, double quote, escape characters, strings that span multiple lines
• strings are immutable objects
• string indexing and slicing
• different ways to print a string
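The string basics listed above can be sketched as follows; the example text is made up.

```python
text = "machine learning"

# Indexing and slicing
print(text[0])      # m
print(text[:7])     # machine
print(text[-8:])    # learning

# Strings are immutable: item assignment raises a TypeError.
try:
    text[0] = "M"
except TypeError:
    print("strings cannot be modified in place")

# Single quotes, double quotes, escape characters, and multi-line strings
same = 'machine learning' == "machine learning"   # True
multi = """first line
second line"""
print("tab\tseparated")

# Different ways to print a string
print("Topic:", text)
print("Topic: {}".format(text))
print(f"Topic: {text} ({len(text)} characters)")
```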
3.5 Data Structures
3.5.1 List
• list type
• mutable
• sorting a list
• variable-size
• slicing and indexing
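A brief sketch of the list properties above (mutable, variable-size, sortable, indexable); the values are arbitrary.

```python
numbers = [3, 1, 2]

numbers.append(5)        # variable-size: grows to [3, 1, 2, 5]
numbers[0] = 10          # mutable: now [10, 1, 2, 5]

print(numbers[1:3])      # [1, 2]          slicing
print(numbers[-1])       # 5               indexing from the end
print(sorted(numbers))   # [1, 2, 5, 10]   returns a new sorted list

numbers.sort(reverse=True)   # sorts in place
print(numbers)           # [10, 5, 2, 1]
```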
3.5.2 Dictionary
• dict type
• key-value pairs
• keys must be immutable types
• fast look-up (“hash table”)
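The dictionary properties above can be illustrated with a small sketch; the names and values are invented.

```python
ages = {"alice": 30, "bob": 25}

ages["carol"] = 41                     # add a new key-value pair
print(ages["bob"])                     # 25
print(ages.get("dave", "unknown"))     # unknown (default for a missing key)
print("alice" in ages)                 # True (fast hash-based membership test)

# Keys must be immutable (hashable): a tuple works, a list does not.
grid = {(0, 0): "origin"}
try:
    bad = {[0, 0]: "origin"}
except TypeError:
    print("lists cannot be used as dictionary keys")
```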
3.5.3 Set
3.5.4 Tuple
3.6 Conditionals
• if / elif / else
3.7 Iteration
• while-loop
• for-loop
• useful keywords: continue, break
• useful objects: range, enumerate, zip
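A compact sketch of the loop constructs and helper objects listed above; the data are made up.

```python
names = ["ann", "ben", "cho"]
scores = [90, 72, 85]

# enumerate yields (index, item) pairs
for index, name in enumerate(names):
    print(index, name)

# zip iterates over several sequences in parallel
paired = [(name, score) for name, score in zip(names, scores)]
print(paired)   # [('ann', 90), ('ben', 72), ('cho', 85)]

# range with continue and break
odd_numbers = []
for i in range(10):
    if i % 2 == 0:
        continue        # skip even numbers
    if i > 5:
        break           # stop the loop entirely
    odd_numbers.append(i)
print(odd_numbers)      # [1, 3, 5]

# while-loop
countdown = 3
while countdown > 0:
    countdown -= 1
print(countdown)        # 0
```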
3.7.1 Generators
3.7.2 Comprehensions
• list comprehension
• set comprehension
• dictionary comprehension
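The three comprehension types above, sketched with arbitrary inputs:

```python
# List comprehension, optionally with a filter condition
squares = [x ** 2 for x in range(5)]
even_squares = [x ** 2 for x in range(5) if x % 2 == 0]
print(squares)        # [0, 1, 4, 9, 16]
print(even_squares)   # [0, 4, 16]

# Set comprehension: duplicates collapse into a single element
lengths = {len(word) for word in ["a", "bb", "cc"]}
print(lengths == {1, 2})   # True

# Dictionary comprehension
square_map = {x: x ** 2 for x in range(3)}
print(square_map)     # {0: 0, 1: 1, 2: 4}
```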
3.8 Functions
3.9 Classes
3.10 Command Line Arguments via Scripts
• Using sys
import sys
first_value = sys.argv[1]
second_value = sys.argv[2]
print("A:", first_value)
print("B:", second_value)
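The snippet above fails with an IndexError if fewer than two arguments are passed. A slightly more defensive variant could look like the sketch below; the function name main and the usage message are choices made for this illustration only. Note that command line arguments always arrive as strings, and that the first list entry is the script name.

```python
import sys


def main(argv):
    """Print two command line arguments, or a usage hint if they are missing."""
    if len(argv) != 3:
        print(f"usage: python {argv[0]} FIRST_VALUE SECOND_VALUE")
        return 1
    print("A:", argv[1])
    print("B:", argv[2])
    return 0


# In a script, one would typically call: sys.exit(main(sys.argv))
# Here we pass an explicit list to show the behavior:
exit_code = main(["example.py", "1", "2"])
```

Saved as example.py with the sys.exit(main(sys.argv)) call added, it could be run as python example.py 1 2 from the command line.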
3.11 Reading and Writing files
3.12 Importing Libraries
• import numpy
• import numpy as np
• from numpy import some_function
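The three import styles above work the same way for any module. The sketch below uses the standard library's math module instead of NumPy so that it runs without installing anything.

```python
import math                 # access names via the module: math.sqrt
import math as m            # same module under a shorter alias
from math import sqrt       # import a single name directly

print(math.sqrt(16.0))      # 4.0
print(m.pi == math.pi)      # True (both names refer to the same module)
print(sqrt(9))              # 3.0
```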
3.13 Standard Library
3.14 GIL and Multiprocessing
3.15 Subprocesses
3.16 Exceptions
3.17 Debugging
3.18 Resources