zhoujian-z
  • Beijing Institute of Technology
  • Beijing


FlashInfer: Kernel Library for LLM Serving

Cuda 2,439 254 Updated Mar 19, 2025

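This project shares the technical principles behind large models and hands-on experience with them (large-model engineering and deploying large-model applications).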

HTML 15,504 1,793 Updated Mar 2, 2025

Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings)

Python 241 27 Updated Mar 20, 2025

Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3.

Python 1,058 116 Updated Mar 20, 2025

ModelScope: bring the notion of Model-as-a-Service to life.

Python 7,586 781 Updated Mar 21, 2025

Fast inference from large language models via speculative decoding

Python 691 68 Updated Aug 22, 2024

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Jupyter Notebook 2,466 172 Updated Jun 25, 2024

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 5,894 510 Updated Mar 21, 2025

Development repository for the Triton language and compiler

MLIR 14,938 1,874 Updated Mar 21, 2025

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 21,907 2,403 Updated Aug 12, 2024

Universal LLM Deployment Engine with ML Compilation

Python 20,228 1,693 Updated Mar 20, 2025

EVA Series: Visual Representation Fantasies from BAAI

Python 2,452 181 Updated Aug 1, 2024

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…

Python 33,560 4,868 Updated Feb 23, 2025

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ 9,759 1,176 Updated Mar 21, 2025
Python 602 55 Updated Jul 31, 2024

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support

Python 8,509 1,049 Updated Mar 20, 2025

Sample codes for my CUDA programming book

Cuda 1,670 340 Updated Feb 15, 2025

[CVPR 2023 Best Paper Award] Planning-oriented Autonomous Driving

Python 3,850 434 Updated Mar 9, 2025

This is a Chinese translation of the CUDA programming guide

1,463 227 Updated Nov 13, 2024

The Triton Inference Server provides an optimized cloud and edge inferencing solution.

Python 8,936 1,543 Updated Mar 21, 2025

Learn CUDA Programming, published by Packt

Cuda 1,120 251 Updated Dec 30, 2023

AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.

Python 2,253 397 Updated Mar 20, 2025

How to optimize some algorithms in CUDA.

Cuda 2,028 182 Updated Mar 19, 2025

A collection of compiler learning resources.

Python 2,317 342 Updated Mar 19, 2025

A simple tool that can generate TensorRT plugin code quickly.

Python 228 35 Updated Jul 11, 2023

Simple samples for TensorRT programming

Python 1,587 344 Updated Mar 12, 2025

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

C++ 11,359 2,166 Updated Mar 11, 2025

A tool to modify ONNX models in a visualization fashion, based on Netron and Flask.

JavaScript 1,447 176 Updated Feb 25, 2025

Collection of various algorithms in mathematics, machine learning, computer science and physics implemented in C++ for educational purposes.

C++ 31,467 7,331 Updated Nov 24, 2024

All about C++ (C++那些事)

C++ 40,573 8,649 Updated Jun 14, 2024