zafstojano
8 stars written in C++
LLM inference in C/C++

C++ · 70,234 stars · 10,134 forks · Updated Jan 4, 2025

Distribute and run LLMs with a single file.

C++ · 21,119 stars · 1,088 forks · Updated Jan 5, 2025

MLX: An array framework for Apple silicon

C++ · 18,159 stars · 1,045 forks · Updated Jan 5, 2025

Development repository for the Triton language and compiler

C++ · 13,897 stars · 1,693 forks · Updated Jan 6, 2025

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ · 9,073 stars · 1,045 forks · Updated Jan 3, 2025

Acoustic keyboard eavesdropping

C++ · 8,582 stars · 590 forks · Updated Jan 15, 2023

Enabling PyTorch on XLA Devices (e.g. Google TPU)

C++ · 2,507 stars · 487 forks · Updated Jan 4, 2025

10x faster matrix and vector operations

C++ · 2,478 stars · 170 forks · Updated Oct 12, 2022