Build a GPT-2 model from scratch, initialize it properly, then train it!
The basic idea is to create a GPT-2 model and train it. Along the way we will explore how to observe the training metrics and keep the training process smooth. We will also build a fastai-flavored mini training framework that lets us adjust the training process more easily.
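As a rough idea of what "fastai-flavored" means here, below is a minimal sketch (not the project's actual code): a `Learner` that owns the model, data, and loss, and fires `Callback` hooks at fixed points in the loop so behavior can be changed without rewriting the loop itself. The names `Learner`, `Callback`, and `PrintLoss` are assumptions that mirror fastai's design.

```python
# Minimal sketch of a callback-driven training loop (assumed design,
# mirroring fastai's Learner/Callback pattern; not the project's real code).
import torch
from torch import nn

class Callback:
    # Hooks a callback may override; the Learner calls them at fixed points.
    def before_fit(self, learn): pass
    def after_batch(self, learn): pass
    def after_fit(self, learn): pass

class Learner:
    def __init__(self, model, dls, loss_fn, lr=3e-4, cbs=()):
        self.model, self.dls, self.loss_fn = model, dls, loss_fn
        self.lr, self.cbs = lr, list(cbs)

    def fit(self, epochs):
        opt = torch.optim.AdamW(self.model.parameters(), lr=self.lr)
        for cb in self.cbs: cb.before_fit(self)
        for _ in range(epochs):
            for xb, yb in self.dls:
                self.loss = self.loss_fn(self.model(xb), yb)
                self.loss.backward()
                opt.step(); opt.zero_grad()
                for cb in self.cbs: cb.after_batch(self)
        for cb in self.cbs: cb.after_fit(self)

class PrintLoss(Callback):
    # Example callback: observe the metric we care about after every batch.
    def after_batch(self, learn): print(f"loss: {learn.loss.item():.4f}")

# Toy usage: a linear model on random data stands in for GPT-2 here.
data = [(torch.randn(8, 4), torch.randn(8, 1)) for _ in range(5)]
learner = Learner(nn.Linear(4, 1), data, nn.MSELoss(), cbs=[PrintLoss()])
learner.fit(1)
```

The point of this pattern is that things like logging, learning-rate scheduling, or gradient clipping become small callbacks rather than edits to the training loop, which is what makes the training process easy to adjust.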
This project is inspired by Let's reproduce GPT-2 (124M) by Andrej Karpathy and Practical Deep Learning Part 2 by Jeremy Howard.