GRASS: Compute Efficient Low-Memory LLM Training with Structured Sparse Gradients

aashiqmuhamed/GRASS

This repository contains the code to reproduce GRASS: Compute Efficient Low-Memory LLM Training with Structured Sparse Gradients.

GRASS (GRAdient Structured Sparsification) uses sparse projections to transform gradients into structured sparse updates. This significantly reduces the memory used by optimizer states and also lowers the gradient memory footprint, computation, and communication costs. The approach enables half-precision pretraining of a 13B-parameter LLaMA model on a single 40GB A100 GPU and achieves up to a $2\times$ throughput improvement on an 8-GPU system, while maintaining performance comparable to full-rank training and existing projection-based methods.
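To make the idea of a structured sparse update concrete, here is a minimal sketch in plain Python. It is not the code from this repository. It assumes one possible projection (keep the `r` gradient rows with the largest norm), applies an Adam-style update to those rows only, and omits any rescaling of the selected rows. The names `select_rows` and `sparse_adam_step` are hypothetical and chosen for illustration. The point it shows is that the optimizer state has shape `r x n` instead of `m x n`, and only `r` rows of the weight matrix change per step.

```python
import math
import random


def select_rows(grad, r):
    """Return the sorted indices of the r rows with the largest L2 norm.

    Assumption: top-norm selection is just one possible selection rule.
    """
    norms = [math.sqrt(sum(x * x for x in row)) for row in grad]
    top = sorted(range(len(grad)), key=lambda i: norms[i], reverse=True)[:r]
    return sorted(top)


def sparse_adam_step(W, grad, rows, m, v, step,
                     lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """Adam-style update applied in place to the selected rows of W.

    m and v are the optimizer states; they have one row per selected
    row of W, so they hold r * n values rather than m * n.
    """
    for k, i in enumerate(rows):
        for j, g in enumerate(grad[i]):
            m[k][j] = b1 * m[k][j] + (1 - b1) * g
            v[k][j] = b2 * v[k][j] + (1 - b2) * g * g
            m_hat = m[k][j] / (1 - b1 ** step)   # bias-corrected first moment
            v_hat = v[k][j] / (1 - b2 ** step)   # bias-corrected second moment
            W[i][j] -= lr * m_hat / (math.sqrt(v_hat) + eps)


random.seed(0)
n_rows, n_cols, r = 8, 4, 2
W = [[random.gauss(0, 1) for _ in range(n_cols)] for _ in range(n_rows)]
grad = [[random.gauss(0, 1) for _ in range(n_cols)] for _ in range(n_rows)]

selected = select_rows(grad, r)
m = [[0.0] * n_cols for _ in range(r)]   # first-moment state, r x n
v = [[0.0] * n_cols for _ in range(r)]   # second-moment state, r x n

W_before = [row[:] for row in W]
sparse_adam_step(W, grad, selected, m, v, step=1)

# Rows of W that actually changed after the step.
changed = [i for i in range(n_rows) if W[i] != W_before[i]]
```

In this toy setting the optimizer keeps `2 * r * n_cols` state values instead of `2 * n_rows * n_cols`, which is where the memory saving on optimizer states comes from when `r` is much smaller than the number of rows.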
