Beijing Institute of Technology, Beijing
Stars
FlashInfer: Kernel Library for LLM Serving
This project shares the technical principles behind large language models along with hands-on experience (LLM engineering and real-world LLM application deployment).
Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings)
Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3.
ModelScope: bring the notion of Model-as-a-Service to life.
Fast inference from large language models via speculative decoding
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
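Several of the entries above (Spec-Bench, EAGLE, Medusa, and the speculative-decoding implementation) center on draft-and-verify decoding: a cheap draft model proposes several tokens, and the expensive target model checks them in one forward pass. A toy sketch of the core loop, where `draft_model` and `target_model` are hypothetical callables and the residual-resampling step of the full algorithm is omitted:

```python
import torch

def speculative_step(draft_model, target_model, prefix, k=4):
    """One draft-and-verify step of speculative decoding (toy sketch)."""
    # Draft phase: sample k tokens autoregressively from the small model.
    draft_tokens, draft_probs = [], []
    ctx = prefix
    for _ in range(k):
        p = draft_model(ctx)                       # (vocab,) next-token distribution
        t = torch.multinomial(p, 1).item()
        draft_tokens.append(t)
        draft_probs.append(p[t])
        ctx = torch.cat([ctx, torch.tensor([t])])
    # Verify phase: one target pass scores all k draft positions at once.
    target_probs = target_model(ctx)               # (k, vocab), one row per draft position
    accepted = []
    for i, t in enumerate(draft_tokens):
        # Accept token t with probability min(1, p_target / p_draft).
        if torch.rand(()) < torch.clamp(target_probs[i, t] / draft_probs[i], max=1.0):
            accepted.append(t)
        else:
            break  # first rejection ends the speculative run
            # (the full algorithm then resamples from the residual distribution)
    return accepted

# Toy demo with random "models" just to exercise the control flow.
vocab = 16
draft = lambda ctx: torch.softmax(torch.randn(vocab), dim=0)
target = lambda ctx: torch.softmax(torch.randn(4, vocab), dim=-1)
print(speculative_step(draft, target, torch.zeros(1, dtype=torch.long)))
```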
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
Development repository for the Triton language and compiler
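The Triton entry above refers to the OpenAI Triton language, which lets you write GPU kernels in Python. A minimal vector-add kernel in the style of the official tutorials (assuming `triton` is installed and a CUDA GPU is available):

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one BLOCK_SIZE-wide slice of the vectors.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements            # guard the ragged final block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

x = torch.rand(4096, device="cuda")
y = torch.rand(4096, device="cuda")
out = torch.empty_like(x)
grid = (triton.cdiv(x.numel(), 1024),)     # one program per block of 1024 elements
add_kernel[grid](x, y, out, x.numel(), BLOCK_SIZE=1024)
```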
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Universal LLM Deployment Engine with ML Compilation
EVA Series: Visual Representation Fantasies from BAAI
The largest collection of PyTorch image encoders / backbones, including train, eval, inference, and export scripts, and pretrained weights: ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), and more.
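For the timm entry above, the usual entry point is `timm.create_model`; a small usage sketch (assuming `timm` is installed):

```python
import timm
import torch

# Instantiate a pretrained backbone by name and run a dummy forward pass.
model = timm.create_model("resnet50", pretrained=True)
model.eval()
with torch.no_grad():
    logits = model(torch.randn(1, 3, 224, 224))  # (1, 1000) ImageNet logits
print(logits.shape)
```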
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs.
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support
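The Accelerate entry above describes Hugging Face Accelerate; its core pattern is wrapping model, optimizer, and dataloader with `Accelerator.prepare` so the same script runs on CPU, a single GPU, or a distributed setup. A minimal sketch with a toy model and random data:

```python
import torch
from accelerate import Accelerator

accelerator = Accelerator()                      # picks up device/distributed config
model = torch.nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
data = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(torch.randn(64, 10), torch.randint(0, 2, (64,))),
    batch_size=8,
)
model, optimizer, data = accelerator.prepare(model, optimizer, data)

for x, y in data:
    optimizer.zero_grad()
    loss = torch.nn.functional.cross_entropy(model(x), y)
    accelerator.backward(loss)                   # replaces loss.backward()
    optimizer.step()
```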
Sample codes for my CUDA programming book
[CVPR 2023 Best Paper Award] Planning-oriented Autonomous Driving
This is a Chinese translation of the CUDA programming guide
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
Learn CUDA Programming, published by Packt
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.
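The AIMET entry concerns quantization and compression of trained networks. Without reproducing AIMET's own API, the underlying idea can be illustrated with PyTorch's built-in dynamic quantization (a generic sketch, not AIMET code):

```python
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(128, 64), torch.nn.ReLU(), torch.nn.Linear(64, 10)
)

# Post-training dynamic quantization: weights are stored in int8 and
# activations are quantized on the fly at inference time. This shows the
# concept only; AIMET provides finer-grained schemes (e.g. quantization
# simulation, AdaRound).
qmodel = torch.ao.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
print(qmodel(torch.randn(1, 128)).shape)
```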
How to optimize various algorithms in CUDA.
A simple tool that can generate TensorRT plugin code quickly.
Simple samples for TensorRT programming
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
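For the TensorRT entry above, a typical Python workflow parses an ONNX model and serializes an engine. A sketch assuming the TensorRT 8.x Python API and a hypothetical local `model.onnx`:

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(TRT_LOGGER)
# Explicit-batch networks are required for ONNX parsing.
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)
parser = trt.OnnxParser(network, TRT_LOGGER)

with open("model.onnx", "rb") as f:        # hypothetical input model
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("ONNX parse failed")

config = builder.create_builder_config()
engine_bytes = builder.build_serialized_network(network, config)
with open("model.engine", "wb") as f:
    f.write(engine_bytes)
```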
A tool to modify ONNX models visually, based on Netron and Flask.
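The onnx-modifier entry above edits ONNX graphs through a UI; the same kind of edit can be scripted with the `onnx` Python package. A small sketch (assuming a hypothetical local `model.onnx`) that inspects the graph and renames its Conv nodes:

```python
import onnx

model = onnx.load("model.onnx")            # hypothetical input file

# Inspect the graph: one line per node, showing op type and tensor names.
for node in model.graph.node:
    print(node.op_type, list(node.input), "->", list(node.output))

# Example edit: give every Conv node an explicit name for easier tracking.
for i, node in enumerate(model.graph.node):
    if node.op_type == "Conv":
        node.name = f"Conv_{i}"

onnx.checker.check_model(model)            # validate before saving
onnx.save(model, "model_edited.onnx")
```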
Collection of various algorithms in mathematics, machine learning, computer science and physics implemented in C++ for educational purposes.