Stars
🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…
OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
Pretrain, finetune ANY AI model of ANY size on multiple GPUs, TPUs with zero code changes.
DSPy: The framework for programming—not prompting—language models
SWE-agent takes a GitHub issue and tries to automatically fix it, using GPT-4, or your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges. [NeurIPS 2…
RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". So it's combining the best of RN…
Ongoing research training transformer models at scale
Hackable and optimized Transformers building blocks, supporting a composable construction.
Aim 💫 — An easy-to-use & supercharged open-source experiment tracker.
[ICLR'24 spotlight] An open platform for training, serving, and evaluating large language models for tool learning.
TFDS is a collection of datasets ready to use with TensorFlow, Jax, ...
Simple, safe way to store and distribute tensors
FFCV: Fast Forward Computer Vision (and other ML workloads!)
Official codebase for Decision Transformer: Reinforcement Learning via Sequence Modeling.
A Python tool for evaluating the quality of sentence embeddings.
Ongoing research training transformer language models at scale, including: BERT & GPT-2
A Python framework for sequence labeling evaluation (named-entity recognition, POS tagging, etc.)
Ekphrasis is a text processing tool, geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenization, word normalization, word segmentation (for splitting has…
Code for T-Few from "Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning"
EsViT: Efficient self-supervised Vision Transformers