- UC Berkeley
- Berkeley, CA
- https://yaodongyu.github.io
Stars
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
Repo for the research paper "Aligning LLMs to Be Robust Against Prompt Injection"
Code implementation of synthetic continued pretraining
PyTorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from Meta AI
The official implementation of DiM: Diffusion Mamba for Efficient High-Resolution Image Synthesis
Efficient Triton Kernels for LLM Training
Official repo for consistency models.
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
[ICML 2024] CLLMs: Consistency Large Language Models
Official PyTorch Implementation of the Longhorn Deep State Space Model
Official Implementation of Rectified Flow (ICLR 2023 Spotlight)
Codebase for the ICML 2024 paper "Differentially Private Representation Learning via Image Captioning"
Universal and Transferable Attacks on Aligned Language Models
Code for the paper "Interpreting and Improving Diffusion Models from an Optimization Perspective", appearing in ICML 2024
The official implementation for Pseudo Numerical Methods for Diffusion Models on Manifolds (PNDM, PLMS | ICLR 2022)
Code for CRATE (Coding RAte reduction TransformEr).
PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838
[ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
800,000 step-level correctness labels on LLM solutions to MATH problems
Fast, memory-efficient, scalable optimization of deep learning with differential privacy
Implementation of Denoising Diffusion Probabilistic Models in PyTorch
Fully featured implementation of Routing Transformer
Sinkhorn Transformer - Practical implementation of Sparse Sinkhorn Attention
Transformer based on a variant of attention with linear complexity with respect to sequence length
[ICML 2021 Oral] We show that pure attention suffers rank collapse, and how different mechanisms combat it.
An implementation of Performer, a linear attention-based transformer, in PyTorch
Reformer, the efficient Transformer, in PyTorch
A concise but complete full-attention transformer with a set of promising experimental features from various papers