Stars
O1 Replication Journey: A Strategic Progress Report – Part I
SGLang is a fast serving framework for large language models and vision language models.
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
A curated collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed at low training cost, including base models, domain-specific fine-tunes and applications, datasets, and tutorials.
Langchain-Chatchat (formerly langchain-ChatGLM): local knowledge-base RAG and Agent applications built with Langchain and LLMs such as ChatGLM, Qwen, and Llama.
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools so that you can focus on what matters.
Chinese Language Understanding Evaluation Benchmark (CLUE): datasets, baselines, pre-trained models, corpora, and leaderboard.
alibaba / Megatron-LLaMA
Forked from NVIDIA/Megatron-LM. Best practices for training LLaMA models in Megatron-LM.
Example models using DeepSpeed
Large-scale Chinese corpus for natural language processing.
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
The official repo of Pai-Megatron-Patch, Alibaba Cloud's toolkit for large-scale LLM & VLM training.
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.
A minimalistic and high-performance SAT solver
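To illustrate the problem such a solver tackles, here is a toy DPLL procedure in Python; high-performance solvers use CDCL with watched literals instead, so everything below (names included) is a didactic sketch, not this repo's code:

```python
# Toy DPLL SAT solver. A formula is a list of CNF clauses; each clause is a
# list of nonzero ints, where -v means "not v". Returns a satisfying
# assignment {var: bool} or None if unsatisfiable.
def dpll(clauses, assignment=None):
    assignment = dict(assignment or {})
    changed = True
    while changed:  # unit propagation: repeatedly assign forced literals
        changed = False
        simplified = []
        for clause in clauses:
            lits, satisfied = [], False
            for lit in clause:
                val = assignment.get(abs(lit))
                if val is None:
                    lits.append(lit)
                elif (lit > 0) == val:
                    satisfied = True
                    break
            if satisfied:
                continue
            if not lits:
                return None                 # empty clause: conflict
            if len(lits) == 1:              # unit clause: forced assignment
                assignment[abs(lits[0])] = lits[0] > 0
                changed = True
            simplified.append(lits)
        clauses = simplified
    if not clauses:
        return assignment                   # every clause satisfied
    var = abs(clauses[0][0])                # branch on an unassigned variable
    for value in (True, False):
        result = dpll(clauses, {**assignment, var: value})
        if result is not None:
            return result
    return None

# (x1 or x2) and (not x1 or x2) is satisfiable, e.g. with x2 = True.
print(dpll([[1, 2], [-1, 2]]))
```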
中文LLaMA-2 & Alpaca-2大模型二期项目 + 64K超长上下文模型 (Chinese LLaMA-2 & Alpaca-2 LLMs with 64K long context models)
This project shares the technical principles behind large language models and hands-on experience with them (LLM engineering and production deployment of LLM applications).
Official inference library for Mistral models
Running large language models on a single GPU for throughput-oriented scenarios.
Vitis AI is Xilinx’s development stack for AI inference on Xilinx hardware platforms, including both edge devices and Alveo cards.
How to optimize common algorithms in CUDA.
Development repository for the Triton language and compiler
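Triton kernels are Python functions compiled for the GPU; below is a minimal sketch along the lines of the canonical vector-add tutorial (the block size and function names are illustrative):

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one BLOCK_SIZE-wide slice of the vectors.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements            # guard the ragged final block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = x.numel()
    grid = (triton.cdiv(n, 1024),)         # one program per block of 1024
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out

# x = torch.rand(4096, device="cuda"); assert torch.allclose(add(x, x), 2 * x)
```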
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
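As a rough illustration of the activation-aware idea (scale salient weight channels by activation magnitude before low-bit quantization), here is a conceptual sketch; the function names, the symmetric INT4 scheme, and the alpha exponent are assumptions, not the repo's implementation:

```python
import torch

def quantize_int4(w, n_bits=4):
    # Symmetric per-output-channel quantize/dequantize round trip.
    qmax = 2 ** (n_bits - 1) - 1
    scale = w.abs().amax(dim=1, keepdim=True) / qmax
    return (w / scale).round().clamp(-qmax - 1, qmax) * scale

def awq_like_quantize(w, act_mean, alpha=0.5):
    # Input channels with larger average activation magnitude get larger
    # scales, so the weights that matter most survive quantization better.
    s = act_mean.pow(alpha).clamp(min=1e-4)
    w_q = quantize_int4(w * s)   # quantize the scaled weights
    return w_q / s               # fold the scale back (conceptually into x)

w = torch.randn(64, 128)
act = torch.randn(256, 128).abs().mean(dim=0)  # per-input-channel magnitude
w_deq = awq_like_quantize(w, act)
print((w - w_deq).abs().mean())                # reconstruction error
```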
microsoft / Megatron-DeepSpeed
Forked from NVIDIA/Megatron-LM. Ongoing research on training transformer language models at scale, including BERT & GPT-2.
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
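A minimal sketch of the usual entry point, deepspeed.initialize; the toy model and the ZeRO stage-2/fp16 config values are illustrative assumptions, not recommendations:

```python
import torch
import deepspeed

# Toy model; real workloads pass in a full network.
model = torch.nn.Linear(1024, 1024)
ds_config = {
    "train_batch_size": 8,
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2},
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
}

# deepspeed.initialize wraps the model in an engine that handles data
# parallelism, ZeRO partitioning, and mixed precision. Scripts are normally
# launched with the `deepspeed` CLI so distributed env vars are set.
engine, optimizer, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config
)

x = torch.randn(8, 1024, device=engine.device, dtype=torch.half)
loss = engine(x).float().pow(2).mean()
engine.backward(loss)   # replaces loss.backward()
engine.step()           # replaces optimizer.step() + zero_grad()
```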
A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.
中文LLaMA&Alpaca大语言模型+本地CPU/GPU训练部署 (Chinese LLaMA & Alpaca LLMs)