GRASS: Compute Efficient Low-Memory LLM Training with Structured Sparse Gradients

This repository contains the code to reproduce the results of GRASS: Compute Efficient Low-Memory LLM Training with Structured Sparse Gradients.

GRASS (GRAdient Structured Sparsification) uses sparse projections to transform gradients into structured sparse updates. This significantly reduces the memory used by optimizer states and minimizes the gradient memory footprint, computation, and communication costs. The approach enables half-precision pretraining of a 13B-parameter LLaMA model on a single 40GB A100 GPU and achieves up to a $2\times$ throughput improvement on an 8-GPU system, while maintaining performance comparable to full-rank training and existing projection-based methods.
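The idea of a structured sparse update can be illustrated with a small, dependency-free sketch. It is not the repository's implementation: the function names (`select_rows`, `project`, `sparse_momentum_step`), the top-`r` row-norm selection rule, and the plain momentum optimizer are all assumptions chosen for illustration. The sketch keeps only `r` rows of an `m x n` gradient, stores optimizer state for those rows alone, and writes the update back to the matching rows of the weight matrix.

```python
import math


def select_rows(grad, r):
    """Pick r row indices by largest L2 norm (an assumed selection rule)."""
    norms = [math.sqrt(sum(x * x for x in row)) for row in grad]
    top = sorted(range(len(grad)), key=lambda i: -norms[i])[:r]
    return sorted(top)


def project(grad, idx):
    """Sparse projection: keep only the selected rows, giving an r x n matrix."""
    return [list(grad[i]) for i in idx]


def sparse_momentum_step(weights, grad, idx, state, lr=0.1, beta=0.9):
    """Update momentum in the compact r x n space, then touch only rows in idx."""
    compact = project(grad, idx)
    for k, i in enumerate(idx):
        for j, g in enumerate(compact[k]):
            state[k][j] = beta * state[k][j] + g   # optimizer state is r x n, not m x n
            weights[i][j] -= lr * state[k][j]      # structured sparse weight update
    return weights, state


# Toy example: a 4 x 2 weight matrix, keeping r = 2 rows.
weights = [[0.0, 0.0] for _ in range(4)]
grad = [[0.1, 0.0], [3.0, 4.0], [0.0, 0.2], [1.0, 1.0]]

idx = select_rows(grad, r=2)
state = [[0.0, 0.0] for _ in idx]  # momentum buffer sized by r, not by m
weights, state = sparse_momentum_step(weights, grad, idx, state)
```

In this toy setup the momentum buffer holds `r x n` entries rather than `m x n`, which is the source of the optimizer-state saving; rows outside `idx` are never read or written during the step.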

aashiqmuhamed/GRASS
