abduld's starred repositories

A curated list of Large Language Model resources, covering model training, serving, fine-tuning, and building LLM applications.

1,896 218 Updated Nov 21, 2024

MLIR tools and dialect for GraphBLAS

Python 17 6 Updated Mar 30, 2022

🏋️ A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support of Optimum's hardware optimizations & quantization schemes.

Python 261 48 Updated Nov 22, 2024

A light-weight header-only library for GPU porting between CUDA and HIP

C 5 Updated Sep 12, 2024

Bringup-Bench is a collection of standalone minimal library and system dependence benchmarks useful for bringing up newly designed CPUs, accelerators, compilers and operating systems. You probably …

C 136 17 Updated Nov 20, 2024

Making Long-Context LLM Inference 10x Faster and 10x Cheaper

Python 240 25 Updated Nov 22, 2024

A stand-alone implementation of several NumPy dtype extensions used in machine learning.

C++ 219 30 Updated Nov 20, 2024

Go Wrappers for OpenXLA PJRT

Go 16 2 Updated Nov 23, 2024

Blazingly fast LLM inference.

Rust 4,490 314 Updated Nov 23, 2024

Structured Text Generation

Python 9,587 493 Updated Nov 22, 2024

A language for constraint-guided and efficient LLM programming.

Python 3,705 200 Updated Jun 3, 2024

A guidance language for controlling large language models.

Jupyter Notebook 19,143 1,045 Updated Nov 24, 2024

UTPX (Userspace Transparent Paging Extension) is a proof-of-concept LD_PRELOAD library that accelerates HIP managed allocations on systems without XNACK or with XNACK disabled.

C++ 7 Updated Jan 8, 2024

An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.

Python 13,474 1,235 Updated Oct 30, 2024

CLI tool for text to image generation using the FLUX.1 model.

Swift 45 4 Updated Oct 20, 2024

MTEB: Massive Text Embedding Benchmark

Jupyter Notebook 1,971 275 Updated Nov 23, 2024

A simple screen parsing tool for building purely vision-based GUI agents

Jupyter Notebook 4,871 369 Updated Nov 5, 2024

📋 A list of open LLMs available for commercial use.

11,224 735 Updated Jul 5, 2024

An MLX port of FLUX based on the Hugging Face Diffusers implementation.

Python 1,010 60 Updated Nov 23, 2024

Multi-platform high-performance compute language extension for Rust.

Rust 683 33 Updated Nov 23, 2024

MLX-VLM is a package for running Vision LLMs locally on your Mac using MLX.

Python 510 45 Updated Nov 23, 2024

A modern model graph visualizer and debugger

JavaScript 1,060 85 Updated Nov 19, 2024

Uses tensor cores to compute back-to-back HGEMM (half-precision general matrix multiplication) with MMA PTX instructions.

Cuda 11 2 Updated Nov 3, 2023

Decoding Attention is specially optimized for multi-head attention (MHA), using CUDA cores for the decoding stage of LLM inference.

C++ 26 1 Updated Nov 5, 2024