Stars
O1 Replication Journey: A Strategic Progress Report – Part I
SGLang is a fast serving framework for large language models and vision language models.
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
A curated collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed at low training cost, including base models, domain-specific fine-tunes and applications, datasets, and tutorials.
Langchain-Chatchat (formerly langchain-ChatGLM): local knowledge-base RAG and Agent applications built with Langchain and LLMs such as ChatGLM, Qwen, and Llama.
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools so that you can focus on what matters.
Chinese Language Understanding Evaluation Benchmark (CLUE): datasets, baselines, pre-trained models, corpora, and leaderboard.
alibaba / Megatron-LLaMA
Forked from NVIDIA/Megatron-LM. Best practices for training LLaMA models in Megatron-LM.
Example models using DeepSpeed
Large-scale Chinese corpus for natural language processing.
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
The official repo of Pai-Megatron-Patch, Alibaba Cloud's toolkit for large-scale LLM & VLM training.
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.
A minimalistic and high-performance SAT solver
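To illustrate the problem such a solver tackles, here is a toy DPLL procedure in Python; high-performance solvers use CDCL with watched literals instead, so everything below (names included) is a didactic sketch, not this repo's code:

```python
# Toy DPLL SAT solver. A formula is a list of CNF clauses; each clause is a
# list of nonzero ints, where -v means "not v". Returns a satisfying
# assignment {var: bool} or None if unsatisfiable.
def dpll(clauses, assignment=None):
    assignment = dict(assignment or {})
    changed = True
    while changed:  # unit propagation: repeatedly assign forced literals
        changed = False
        simplified = []
        for clause in clauses:
            lits, satisfied = [], False
            for lit in clause:
                val = assignment.get(abs(lit))
                if val is None:
                    lits.append(lit)
                elif (lit > 0) == val:
                    satisfied = True
                    break
            if satisfied:
                continue
            if not lits:
                return None                 # empty clause: conflict
            if len(lits) == 1:              # unit clause: forced assignment
                assignment[abs(lits[0])] = lits[0] > 0
                changed = True
            simplified.append(lits)
        clauses = simplified
    if not clauses:
        return assignment                   # every clause satisfied
    var = abs(clauses[0][0])                # branch on an unassigned variable
    for value in (True, False):
        result = dpll(clauses, {**assignment, var: value})
        if result is not None:
            return result
    return None

# (x1 or x2) and (not x1 or x2) is satisfiable, e.g. with x2 = True.
print(dpll([[1, 2], [-1, 2]]))
```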
中文LLaMA-2 & Alpaca-2大模型二期项目 + 64K超长上下文模型 (Chinese LLaMA-2 & Alpaca-2 LLMs with 64K long context models)
This project shares the technical principles behind large language models and hands-on experience with them (LLM engineering and production deployment of LLM applications).
Official inference library for Mistral models
Running large language models on a single GPU for throughput-oriented scenarios.
Vitis AI is Xilinx’s development stack for AI inference on Xilinx hardware platforms, including both edge devices and Alveo cards.
How to optimize common algorithms in CUDA.
Development repository for the Triton language and compiler
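Triton kernels are Python functions compiled for the GPU; below is a minimal sketch along the lines of the canonical vector-add tutorial (the block size and function names are illustrative):

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one BLOCK_SIZE-wide slice of the vectors.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements            # guard the ragged final block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = x.numel()
    grid = (triton.cdiv(n, 1024),)         # one program per block of 1024
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out

# x = torch.rand(4096, device="cuda"); assert torch.allclose(add(x, x), 2 * x)
```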
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
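As a rough illustration of the activation-aware idea (scale salient weight channels by activation magnitude before low-bit quantization), here is a conceptual sketch; the function names, the symmetric INT4 scheme, and the alpha exponent are assumptions, not the repo's implementation:

```python
import torch

def quantize_int4(w, n_bits=4):
    # Symmetric per-output-channel quantize/dequantize round trip.
    qmax = 2 ** (n_bits - 1) - 1
    scale = w.abs().amax(dim=1, keepdim=True) / qmax
    return (w / scale).round().clamp(-qmax - 1, qmax) * scale

def awq_like_quantize(w, act_mean, alpha=0.5):
    # Input channels with larger average activation magnitude get larger
    # scales, so the weights that matter most survive quantization better.
    s = act_mean.pow(alpha).clamp(min=1e-4)
    w_q = quantize_int4(w * s)   # quantize the scaled weights
    return w_q / s               # fold the scale back (conceptually into x)

w = torch.randn(64, 128)
act = torch.randn(256, 128).abs().mean(dim=0)  # per-input-channel magnitude
w_deq = awq_like_quantize(w, act)
print((w - w_deq).abs().mean())                # reconstruction error
```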
microsoft / Megatron-DeepSpeed
Forked from NVIDIA/Megatron-LM. Ongoing research on training transformer language models at scale, including BERT & GPT-2.
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
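A minimal sketch of the usual entry point, deepspeed.initialize; the toy model and the ZeRO stage-2/fp16 config values are illustrative assumptions, not recommendations:

```python
import torch
import deepspeed

# Toy model; real workloads pass in a full network.
model = torch.nn.Linear(1024, 1024)
ds_config = {
    "train_batch_size": 8,
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2},
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
}

# deepspeed.initialize wraps the model in an engine that handles data
# parallelism, ZeRO partitioning, and mixed precision. Scripts are normally
# launched with the `deepspeed` CLI so distributed env vars are set.
engine, optimizer, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config
)

x = torch.randn(8, 1024, device=engine.device, dtype=torch.half)
loss = engine(x).float().pow(2).mean()
engine.backward(loss)   # replaces loss.backward()
engine.step()           # replaces optimizer.step() + zero_grad()
```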
A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.
中文LLaMA&Alpaca大语言模型+本地CPU/GPU训练部署 (Chinese LLaMA & Alpaca LLMs)