FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/

C++ 1,233 520 Updated Jan 6, 2025

karpathy / nano-llama31

nanoGPT style version of Llama 3.1

Python 1,283 67 Updated Aug 8, 2024

InternLM / lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 5,087 455 Updated Jan 3, 2025

Dao-AILab / flash-attention

Fast and memory-efficient exact attention

Python 14,925 1,410 Updated Jan 5, 2025

sgl-project / sglang

SGLang is a fast serving framework for large language models and vision language models.

Python 7,064 655 Updated Jan 5, 2025

alessiodm / drl-zh

Deep Reinforcement Learning: Zero to Hero!

Jupyter Notebook 2,031 74 Updated Aug 18, 2024

karpathy / llm.c

LLM training in simple, raw C/CUDA

Cuda 24,940 2,835 Updated Oct 2, 2024

ggerganov / llama.cpp

LLM inference in C/C++

C++ 70,233 10,135 Updated Jan 4, 2025

dabeaz-course / python-mastery

Advanced Python Mastery (course by @dabeaz)

Python 10,774 1,800 Updated Aug 10, 2024

gpu-mode / lectures

Material for gpu-mode lectures

Jupyter Notebook 3,387 343 Updated Dec 3, 2024

siliconflow / onediff

OneDiff: An out-of-the-box acceleration library for diffusion models.

Jupyter Notebook 1,754 110 Updated Dec 30, 2024

microsoft / ark

A GPU-driven system framework for scalable AI applications

C++ 111 16 Updated Oct 8, 2024

DefTruth / CUDA-Learn-Notes

📚150+ Tensor/CUDA Cores Kernels, ⚡️flash-attn-mma, ⚡️hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS/FA2 🎉🎉).

Cuda 1,867 195 Updated Jan 3, 2025

taihaozesong

Stars

Infini-AI-Lab / MagicPIG

MoonshotAI / moonpalace

v2rockets / sd_optimization

HazyResearch / ThunderKittens

EleutherAI / cookbook

wjakob / nanobind

huaxz1986 / APUE_notes

mit-han-lab / nunchaku

kvcache-ai / Mooncake

flashinfer-ai / flashinfer

stas00 / ml-engineering

unslothai / unsloth

RRZE-HPC / gpu-benches

microsoft / sarathi-serve

microsoft / vattention

microsoft / DeepSpeed

DefTruth / Awesome-LLM-Inference

pytorch / FBGEMM