
Powerful yet simple to use screenshot software 🖥️ 📸

C++ 25,198 1,609 Updated Dec 3, 2024

🏞 A lightweight, versatile image viewer

C# 7,976 506 Updated Dec 7, 2024

Practical and minimal image viewer

C++ 2,081 122 Updated Oct 19, 2024

Let your Claude be able to think

TypeScript 9,925 1,134 Updated Dec 3, 2024

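An AI model API management and distribution system. It supports converting multiple large models to OpenAI-format calls, supports Midjourney Proxy, Suno, and Rerank, and is compatible with the EPay (易支付) protocol. It can be used by individuals or within enterprises to manage and distribute channels. This project is a secondary development based on One API. 🍥 The next-generation LLM gateway and AI asset management system supports multiple lan…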

Go 3,932 895 Updated Dec 14, 2024

A toolkit to help optimize large ONNX models

Python 148 9 Updated May 16, 2024

Count number of parameters / MACs / FLOPS for ONNX models.

Python 88 21 Updated Oct 26, 2024

Code base of the BEVDet series.

Python 1,476 267 Updated Jul 4, 2024

Related tools and add-ons for Xiaoya Alist (小雅Alist)

Shell 5,628 821 Updated Dec 14, 2024

LLM inference in C/C++

C++ 69,197 9,952 Updated Dec 13, 2024

Deploying LLMs offline on the NVIDIA Jetson platform marks the dawn of a new era in embodied intelligence, where devices can function independently without continuous internet access.

81 6 Updated Mar 23, 2024

Awesome LLMs on Device: A Comprehensive Survey

986 124 Updated Oct 8, 2024

[EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".

Python 349 37 Updated Dec 14, 2024

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 4,827 438 Updated Dec 13, 2024

Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)

Python 36,057 4,444 Updated Dec 12, 2024

EfficientQAT: Efficient Quantization-Aware Training for Large Language Models

Python 232 18 Updated Oct 8, 2024

Universal LLM Deployment Engine with ML Compilation

Python 19,379 1,593 Updated Dec 14, 2024

Grok open release

Python 49,730 8,340 Updated Aug 30, 2024

[ICML 2024] KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache

Python 254 23 Updated Oct 10, 2024

QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving

Python 461 27 Updated Nov 9, 2024

The official implementation of the EMNLP 2023 paper LLM-FP4

Python 168 11 Updated Dec 15, 2023

Code repo for the paper "LLM-QAT Data-Free Quantization Aware Training for Large Language Models"

Python 258 24 Updated Sep 3, 2024

The uncompromising Python code formatter

Python 39,316 2,481 Updated Dec 11, 2024

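Automatically books appointment slots at Civil Affairs Bureau marriage registration offices; limited to the Guangdong Provincial Civil Affairs Bureau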

Python 9 1 Updated Jun 25, 2023

Initial commit

Python 1 Updated May 31, 2023

Actively maintained ONNX Optimizer

C++ 653 90 Updated Mar 5, 2024

[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.

Python 736 56 Updated Oct 8, 2024

A tool to modify ONNX models in a visualization fashion, based on Netron and Flask.

JavaScript 1,370 170 Updated Dec 6, 2024

A high-performance deep learning training platform with task-level time-sharing scheduling of GPU compute

Python 318 40 Updated Oct 24, 2023

pruning vision models in torch

Python 14 3 Updated Dec 5, 2024