
Build and train GPT-2 from scratch with a fastai-flavored training framework


sanbaiw/fast-nanogpt


Build a GPT-2 model from scratch, initialize it properly, then train!

The basic idea is to build a GPT-2 model and train it, exploring along the way how to observe training metrics and keep training running smoothly. To that end we build a fastai-flavored mini training framework that lets us adjust the training process more easily.
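A fastai-flavored training framework typically means a learner that drives the loop and callbacks that hook into named events to log metrics or tweak training. Below is a minimal sketch of that pattern; the class and method names (`Learner`, `Callback`, `on_batch_end`) are illustrative assumptions, not this repo's actual API.

```python
# Minimal sketch of a fastai-style callback system (hypothetical names,
# not necessarily the repo's actual API).

class Callback:
    """Base class: callbacks override only the events they care about."""
    def on_batch_end(self, learner): pass
    def on_epoch_end(self, learner): pass

class MetricLogger(Callback):
    """Records the loss after every batch so training can be observed."""
    def __init__(self):
        self.history = []
    def on_batch_end(self, learner):
        self.history.append(learner.loss)

class Learner:
    """Owns the training loop and fires callback events at each step."""
    def __init__(self, data, step_fn, callbacks=None):
        self.data = data                # iterable of batches
        self.step_fn = step_fn          # computes the loss for one batch
        self.callbacks = callbacks or []
        self.loss = None
    def _event(self, name):
        for cb in self.callbacks:
            getattr(cb, name)(self)
    def fit(self, epochs):
        for _ in range(epochs):
            for batch in self.data:
                self.loss = self.step_fn(batch)
                self._event("on_batch_end")
            self._event("on_epoch_end")

# Usage: "train" on dummy batches and inspect the logged losses.
logger = MetricLogger()
learner = Learner(data=[1.0, 2.0, 3.0],
                  step_fn=lambda b: b * 0.5,
                  callbacks=[logger])
learner.fit(epochs=1)
print(logger.history)  # → [0.5, 1.0, 1.5]
```

Keeping the loop itself minimal and pushing behavior into callbacks is what makes it easy to adjust training (logging, schedules, gradient clipping) without rewriting the loop.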

This project is inspired by Let's reproduce GPT-2 (124M) from Andrej Karpathy and Practical Deep Learning Part 2 from Jeremy Howard.
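The "init it properly" step follows the GPT-2 scheme Karpathy walks through: linear weights are drawn from a normal distribution with std 0.02, and residual-branch projections are scaled down by 1/sqrt(2 * n_layer) so the residual stream's variance stays bounded with depth. A stdlib-only sketch (the function name and the 12-layer depth are assumptions for illustration):

```python
import math
import random

N_LAYER = 12      # GPT-2 (124M) has 12 transformer blocks
BASE_STD = 0.02   # GPT-2's base init standard deviation

def init_weight(n, is_residual_proj=False, seed=0):
    """Sample n weights from N(0, std). Residual projections get the
    1/sqrt(2 * N_LAYER) scaling: each block adds two residual
    contributions, so without it activations grow with depth."""
    std = BASE_STD
    if is_residual_proj:
        std /= math.sqrt(2 * N_LAYER)
    rng = random.Random(seed)
    return [rng.gauss(0.0, std) for _ in range(n)]

# Residual projections end up with std 0.02 / sqrt(24) ~= 0.0041.
w = init_weight(10_000, is_residual_proj=True)
```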
