PyTorch Tutorial

Willie Chang
Pranay Manocha
Installing PyTorch
• 💻 On your own computer
• Anaconda/Miniconda: conda install pytorch -c pytorch
• Others via pip: pip3 install torch

• 🌐 On Princeton CS server (ssh cycles.cs.princeton.edu)


• Non-CS students can request a class account.
• Miniconda is highly recommended, because:
• It lets you manage your own Python installation
• It installs locally; no admin privileges required
• It’s lightweight and fits within your disk quota
• Instructions:
• wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
• chmod u+x ./Miniconda3-latest-Linux-x86_64.sh
• ./Miniconda3-latest-Linux-x86_64.sh
• After Miniconda is installed: conda install pytorch -c pytorch
Writing code
• Up to you; feel free to use emacs, vim, PyCharm, etc. if you want.
• Our recommendations:

Jupyter Notebook (also try Jupyter Lab!)
• Install: conda/pip3 install jupyter
• 💻 Run on your computer:
• jupyter notebook
• 🌐 Run on Princeton CS server:
• Pick any 4-digit number, say 1234
• 🌐 hostname -s
• 🌐 jupyter notebook --no-browser --port=1234
• 💻 ssh -N -L 1234:localhost:1234 __@__.cs.princeton.edu
• First blank is username, second is hostname

VS Code
• Install the Python extension.
• 🌐 Install the Remote Development extension.
• Python files can be run like Jupyter notebooks by delimiting cells/sections with #%%
• Debugging PyTorch code is just like debugging any other Python code: see Piazza @108 for info.
Why talk about libraries?
• Advantages of deep learning frameworks

• Quick to develop and test new ideas


• Automatically compute gradients
• Run it all efficiently on GPU to speed up computation
Various Frameworks
• Various Deep Learning Frameworks

• Focus on PyTorch in this session.

Source: CS231n slides


Preview: (and advantages)
• Preview of Numpy & PyTorch & Tensorflow

(Side-by-side comparison of the same computation-graph code in Numpy, Tensorflow, and PyTorch)

Advantages (continued)
• Which one do you think is better?

PyTorch!
• Easy interface − easy-to-use API. Code execution in this framework is straightforward, and it
typically needs fewer lines of code than other frameworks.
• It is easy to debug and understand the code.
• Python usage − the library is considered Pythonic and integrates smoothly with the Python
data science stack.
• It can be considered a NumPy extension to GPUs.
• Computational graphs − PyTorch provides dynamic computational graphs, so a user can
change them during runtime.
• It includes as many layers as Torch.
• It includes a lot of loss functions.
• It allows building networks whose structure depends on the computation itself.
• NLP: account for variable-length sentences. Instead of padding every sentence to a
fixed length, we create graphs with a different number of LSTM cells based on each sentence's
length.
PyTorch
• Fundamental Concepts of PyTorch
• Tensors
• Autograd
• Modular structure
• Models / Layers
• Datasets
• Dataloader
• Visualization Tools like
• TensorboardX (monitor training)
• PyTorchViz (visualise computation graph)
• Various other functions
• loss functions (MSE, CE, etc.)
• optimizers
Typical workflow: Prepare Input Data, Train Model, Evaluate Model
• Prepare input data: load data, iterate over examples
• Train model: train the weights
• Evaluate model: visualise the results
Tensor
• Tensor?
• PyTorch Tensors are just like numpy arrays, but they can run on GPU.
• Operations include: indexing, slicing, reshape, transpose, cross product,
matrix product, element-wise multiplication, etc. (see the sketch below)
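
As a rough illustration of these operations (the values here are arbitrary, not from the original slides):

import torch

a = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
b = torch.randn(2, 2)              # random normal values

first_row = a[0]                   # indexing
second_col = a[:, 1]               # slicing
flat = a.reshape(4)                # reshape
at = a.t()                         # transpose
ew = a * b                         # element-wise multiplication
mm = a @ b                         # matrix product

u = torch.tensor([1.0, 0.0, 0.0])  # cross product needs 3-element vectors
v = torch.tensor([0.0, 1.0, 0.0])
cp = torch.cross(u, v)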
Tensor (continued)
• Attributes of a tensor 't':
• t = torch.randn(1)

• requires_grad – makes it a trainable parameter
• By default False
• Turn on:
• t.requires_grad_() or
• t = torch.randn(1, requires_grad=True)
• Accessing the tensor value:
• t.data
• Accessing the tensor gradient:
• t.grad

• grad_fn – history of operations for autograd
• t.grad_fn
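
A small sketch of these attributes in action (the computation is made up for illustration):

import torch

t = torch.randn(1, requires_grad=True)   # trainable parameter
y = (t * 3).sum()                        # build a tiny computation
y.backward()                             # autograd fills in t.grad

print(t.data)      # tensor value, detached from the graph
print(t.grad)      # gradient of y w.r.t. t (here 3.0)
print(y.grad_fn)   # history of operations, e.g. <SumBackward0>

s = torch.randn(1) # alternatively, turn gradients on in-place
s.requires_grad_()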
Loading Data, Devices and CUDA
• Numpy arrays to PyTorch tensors
• torch.from_numpy(x_train)
• Returns a cpu tensor!
• PyTorch tensor to numpy
• t.numpy()
• Using GPU acceleration
• t.to()
• Sends to whatever device (cuda or cpu)
• Fallback to cpu if gpu is unavailable:
• torch.cuda.is_available()
• Check whether something is a cpu/gpu tensor or a numpy array:
• type(t) or t.type()
• returns
• numpy.ndarray
• torch.Tensor
• CPU - torch.FloatTensor
• GPU - torch.cuda.FloatTensor

*Assume 't' is a tensor
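
Putting the above together, a possible sketch (the numpy data is made up for illustration):

import numpy as np
import torch

x_train = np.random.rand(10, 3).astype(np.float32)

t = torch.from_numpy(x_train)        # numpy -> cpu tensor
x_back = t.numpy()                   # cpu tensor -> numpy

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
t = t.to(device)                     # falls back to cpu if no gpu is available

print(type(t))                       # <class 'torch.Tensor'>
print(t.type())                      # torch.FloatTensor or torch.cuda.FloatTensor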


Autograd
• Autograd
• Automatic Differentiation Package
• Don’t need to worry about partial differentiation, chain rule etc..
• backward() does that
• loss.backward()
• Gradients are accumulated across backward() calls by default:
• Need to zero out gradients after each update
• t.grad.zero_()

*Assume 't' is a tensor


Autograd (continued)
• Manual Weight Update - example
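
The slide's code is not reproduced here; a minimal sketch of a manual weight update for a linear model (the toy data and learning rate are assumptions):

import torch

x = torch.randn(100, 1)                 # toy inputs
y = 2 * x + 1                           # toy targets for y = 2x + 1

a = torch.randn(1, requires_grad=True)  # slope
b = torch.randn(1, requires_grad=True)  # intercept
lr = 0.1

for epoch in range(100):
    yhat = a * x + b
    loss = ((yhat - y) ** 2).mean()     # MSE
    loss.backward()                     # autograd computes a.grad and b.grad

    with torch.no_grad():               # update weights outside the graph
        a -= lr * a.grad
        b -= lr * b.grad

    a.grad.zero_()                      # gradients accumulate, so zero them out
    b.grad.zero_()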
Optimizer
• Optimizers (optim package)
• Adam, Adagrad, Adadelta, SGD etc..
• Manual updates are ok if there is a small number of weights
• Imagine updating 100k parameters!
• An optimizer takes the parameters we want to update, the learning rate we want
to use (and possibly many other hyper-parameters as well!)
and performs the updates
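
A sketch of the same toy regression using the optim package instead of manual updates (data and learning rate are assumptions):

import torch
import torch.optim as optim

x = torch.randn(100, 1)
y = 2 * x + 1
a = torch.randn(1, requires_grad=True)
b = torch.randn(1, requires_grad=True)

optimizer = optim.SGD([a, b], lr=0.1)   # could also be optim.Adam, optim.Adagrad, ...

for epoch in range(100):
    yhat = a * x + b
    loss = ((yhat - y) ** 2).mean()
    optimizer.zero_grad()               # reset accumulated gradients
    loss.backward()
    optimizer.step()                    # updates every parameter given to the optimizer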
Loss
• Loss
• Various predefined loss functions to choose from
• L1, MSE, Cross Entropy, ...
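
For illustration, two of the predefined losses (shapes and values are made up):

import torch
import torch.nn as nn

mse = nn.MSELoss()
loss = mse(torch.randn(4, 1), torch.randn(4, 1))

ce = nn.CrossEntropyLoss()              # expects raw logits and integer class labels
logits = torch.randn(4, 3)              # 4 samples, 3 classes
labels = torch.tensor([0, 2, 1, 0])
loss = ce(logits, labels)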
Model

• In PyTorch, a model is represented by a regular Python class that inherits
from the Module class.
• Two components
• __init__(self): defines the parts that make up the model; in our
case, two parameters, a and b
• forward(self, x): performs the actual computation, that is, it outputs a prediction
given the input x
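
A minimal sketch of such a model with parameters a and b (the exact code on the slide may differ):

import torch
import torch.nn as nn

class ManualLinearRegression(nn.Module):
    def __init__(self):
        super().__init__()
        # Wrapping tensors in nn.Parameter registers them as trainable parameters
        self.a = nn.Parameter(torch.randn(1))
        self.b = nn.Parameter(torch.randn(1))

    def forward(self, x):
        # The actual computation: a prediction given the input x
        return self.a * x + self.b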
Model (example)
• Example:

• Properties:
• model = ManualLinearRegression()
• model.state_dict() - returns a dictionary of trainable parameters with their current
values
• model.parameters() - returns an iterator over all trainable parameters in the model
• model.train() or model.eval()
Putting things together
• Sample Code in practice
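
The slide's code is not reproduced here; a possible end-to-end sketch combining the pieces above (data, learning rate and epoch count are assumptions):

import torch
import torch.nn as nn
import torch.optim as optim

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

x_train = torch.randn(100, 1)                          # toy data
y_train = 2 * x_train + 1 + 0.1 * torch.randn(100, 1)
x_train, y_train = x_train.to(device), y_train.to(device)

model = ManualLinearRegression().to(device)            # class from the Model slide sketch
loss_fn = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.1)

model.train()
for epoch in range(100):
    yhat = model(x_train)
    loss = loss_fn(yhat, y_train)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(model.state_dict())                              # current values of a and b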
Complex Models
• Complex Model Class
• Predefined 'layer' modules

• 'Sequential' layer modules
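
A sketch of both styles (layer sizes and the TwoLayerNet name are made up for illustration):

import torch
import torch.nn as nn

class TwoLayerNet(nn.Module):                 # custom class using predefined layers
    def __init__(self, d_in, d_hidden, d_out):
        super().__init__()
        self.fc1 = nn.Linear(d_in, d_hidden)
        self.fc2 = nn.Linear(d_hidden, d_out)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))

model = nn.Sequential(                        # equivalent network with Sequential
    nn.Linear(10, 32),
    nn.ReLU(),
    nn.Linear(32, 1),
)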


Dataset
• Dataset
• In PyTorch, a dataset is represented by a regular Python class that inherits
from the Dataset class. You can think of it as a kind of a Python list of
tuples, each tuple corresponding to one point (features, label)
• 3 components:
• __init__(self)
• __getitem__(self, index)
• __len__(self)
• Unless the dataset is huge (cannot fit in memory), you don't explicitly need to define
this class. Use TensorDataset instead.
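
A minimal sketch of both options (toy data; the CustomDataset name is an assumption):

import torch
from torch.utils.data import Dataset, TensorDataset

class CustomDataset(Dataset):                 # behaves like a list of (features, label) tuples
    def __init__(self, x_tensor, y_tensor):
        self.x = x_tensor
        self.y = y_tensor

    def __getitem__(self, index):
        return (self.x[index], self.y[index])

    def __len__(self):
        return len(self.x)

x_train = torch.randn(100, 1)
y_train = 2 * x_train + 1
train_data = CustomDataset(x_train, y_train)

train_data = TensorDataset(x_train, y_train)  # same thing when the data fits in memory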
Dataloader
• Dataloader
• What happens if we have a huge dataset? Have to train in 'batches'
• Use PyTorch's DataLoader class!
• We tell it which dataset to use, the desired mini-batch size and if we’d like to shuffle it
or not. That’s it!
• Our loader will behave like an iterator, so we can loop over it and fetch a different
mini-batch every time.
Dataloader (example)
• Sample Code in Practice:
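
The slide's code is not shown here; a sketch assuming the model, loss_fn, optimizer and train_data from the earlier sketches:

from torch.utils.data import DataLoader

train_loader = DataLoader(dataset=train_data, batch_size=16, shuffle=True)

for epoch in range(10):
    for x_batch, y_batch in train_loader:     # a different mini-batch every iteration
        yhat = model(x_batch)
        loss = loss_fn(yhat, y_batch)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()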
Split Data
• Random Split for Train, Val and Test Set
• random_split()
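
For example (the split sizes are arbitrary; they must sum to the dataset length):

import torch
from torch.utils.data import TensorDataset, random_split

x = torch.randn(100, 1)
y = 2 * x + 1
dataset = TensorDataset(x, y)

train_set, val_set, test_set = random_split(dataset, [70, 15, 15])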
Saving / Loading Weights
Method 1
• Only inference/evaluation – save only state_dict
• Save:
• torch.save(model.state_dict(), PATH)
• Load:
• model = TheModelClass(*args, **kwargs)
• model.load_state_dict(torch.load(PATH))
• model.eval()

• Convention is to save models using either a .pt or a .pth extension

https://pytorch.org/tutorials/beginner/saving_loading_models.html
Saving / Loading Weights (continued)
• Method 2
• Checkpoint – resume training / inference
• Save:
• torch.save({
'epoch': epoch,
'model_state_dict': model.state_dict(),
'optimizer_state_dict': optimizer.state_dict(),
'loss': loss,
...
}, PATH)
• Load:
• model = TheModelClass(*args, **kwargs)
optimizer = TheOptimizerClass(*args, **kwargs)
checkpoint = torch.load(PATH)
model.load_state_dict(checkpoint['model_state_dict'])
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
epoch = checkpoint['epoch']
loss = checkpoint['loss']
model.eval()
# - or -
model.train()
Evaluation
• Two important things:
• torch.no_grad()
• Don't store the history of all computations (no autograd graph is built)
• eval()
• Tells the model which mode to run in; layers like dropout and batch norm behave
differently in training vs. evaluation mode.
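
A sketch of an evaluation loop (model, loss_fn and val_loader are assumed from the earlier sketches):

import torch

model.eval()                          # switch layers like dropout / batch norm to eval mode
with torch.no_grad():                 # no autograd graph is built during inference
    for x_batch, y_batch in val_loader:
        yhat = model(x_batch)
        val_loss = loss_fn(yhat, y_batch)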
Visualization
• TensorboardX (visualise training)
• PyTorchViz (visualise computation graph)

https://github.com/lanpa/tensorboardX/
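
A minimal TensorboardX sketch (the log directory name and logged values are made up):

from tensorboardX import SummaryWriter

writer = SummaryWriter('runs/experiment1')    # log directory

for epoch in range(100):
    loss = 1.0 / (epoch + 1)                  # stand-in for the real training loss
    writer.add_scalar('train/loss', loss, epoch)

writer.close()
# view with: tensorboard --logdir runs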
Visualization (continued)
• PyTorchViz

https://github.com/szagoruyko/pytorchviz
References
• Important References:
• For setting up Jupyter notebooks on the Princeton Ionic cluster
• https://oncomputingwell.princeton.edu/2018/05/jupyter-on-the-cluster/
• Best reference is PyTorch Documentation
• https://pytorch.org/ and https://github.com/pytorch/pytorch
• Good Blogs: (with examples and code)
• https://lelon.io/blog/2018/02/08/pytorch-with-baby-steps
• https://www.tutorialspoint.com/pytorch/index.htm
• https://github.com/hunkim/PyTorchZeroToAll
• Free GPU access for short time:
• Google Colab provides free Tesla K80 GPU of about 12GB. You can run the session in
an interactive Colab Notebook for 12 hours.
• https://colab.research.google.com/
Misc
• Dynamic vs. Static Computation Graph
• Dynamic graph (PyTorch): in each epoch the graph is rebuilt node by node as the
operations run. For the linear regression example: a, b and x_train_tensor combine
into yhat, then yhat and y_train_tensor combine into loss.
• The same construction repeats in Epoch 1, Epoch 2, and so on.
• Building the graph and computing the graph happen at the same time.
• Seems inefficient, especially if we are building the same graph over and over again...
Misc
• Alternative: Static Computation Graphs
• Step 1: Build a computational graph describing our computation (including
finding paths for backprop)
• Step 2: Reuse the same graph on every iteration
