Stars
Flexible and powerful framework for managing multiple AI agents and handling complex conversations
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
Fast inference from large language models via speculative decoding
A curated list for Efficient Large Language Models
NVTabular is a feature engineering and preprocessing library for tabular data designed to quickly and easily manipulate terabyte scale datasets used to train deep learning based recommender systems.
Academic Homepage Template
[ACL'24] Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning
The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
FlashInfer: Kernel Library for LLM Serving
Retrieval-Augmented Generation in 3 Lines of Code!
TimesFM (Time Series Foundation Model) is a pretrained time-series foundation model developed by Google Research for time-series forecasting.
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
This is an implementation of the paper: Searching for Best Practices in Retrieval-Augmented Generation (EMNLP2024)
Discovering Bias in Latent Space: An Unsupervised Debiasing Approach (ICML 2024)
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.
Ongoing research training transformer models at scale
PipeRAG: Fast Retrieval-Augmented Generation via Algorithm-System Co-design (KDD 2025)
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
Structured state space sequence models
LogAI - An open-source library for log analytics and intelligence
CaMML: Context-Aware MultiModal Learner for Large Models (ACL 2024)
NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.
Finetune mistral-7b-instruct for sentence embeddings
Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supporting…
CoMM: Collaborative Multi-Agent, Multi-Reasoning-Path Prompting for Complex Problem Solving (NAACL 2024 Findings)
Get up and running with Llama 3.3, Mistral, Gemma 2, and other large language models.
Generative Representational Instruction Tuning
Code for "SemDeDup", a simple method for identifying and removing semantic duplicates from a dataset (data pairs which are semantically similar, but not exactly identical).