View nwnk's full-sized avatar


A translator from NVPTX to SPIR-V, modified from the LLVM-SPIR-V Translator.

LLVM 36 8 Updated Oct 25, 2021

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, DeepSeek, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discr…

Python 7,490 1,329 Updated Mar 14, 2025

This repository targets performance optimization of the OpenCL gemm function. It compares several libraries (clBLAS, clBLAST, MIOpenGemm, Intel MKL (CPU), and cuBLAS (CUDA)) on different matrix sizes/vendor…

C 16 6 Updated Mar 28, 2019

A simple guide to compile Llama.cpp and llama-cpp-python using CLBLAST for older generation AMD GPUs.

C 6 1 Updated Sep 3, 2023
C++ 44 18 Updated Feb 26, 2025

A software library containing BLAS functions written in OpenCL

C++ 852 237 Updated Aug 2, 2024

Intel® GPU Compute Samples

C++ 104 18 Updated Mar 7, 2025

Vulkan mipmap generation with 3 strategies: blit chain, compute with per-level barriers, compute with Subgroup shuffle.

C++ 4 Updated May 23, 2024

Customizable compute shader for fast cache-aware mipmap generation

GLSL 48 3 Updated Sep 7, 2024

An Open Framework for Federated Learning.

Python 756 220 Updated Mar 13, 2025

Pretends to export C++ functions, but with a C ABI

Python 1 Updated Sep 15, 2024

MLX: An array framework for Apple silicon

C++ 19,585 1,114 Updated Mar 14, 2025

Build NVIDIA® CUDA™ code for OpenCL™ 1.2 devices

LLVM 855 88 Updated Jun 21, 2024

Examples for building and running LLM services and applications locally with Podman

Python 2 Updated Aug 19, 2024

chipStar is a tool for compiling and running HIP/CUDA on SPIR-V via OpenCL or Level Zero APIs.

C++ 257 36 Updated Mar 14, 2025

Community maintained container images to use with toolbx and distrobox

Dockerfile 357 34 Updated Feb 1, 2025

On-device AI across mobile, embedded and edge for PyTorch

C++ 2,591 477 Updated Mar 14, 2025

glmark2 is an OpenGL 2.0 and ES 2.0 benchmark

C 454 187 Updated Feb 21, 2025

CUDA on non-NVIDIA GPUs

Rust 10,922 699 Updated Mar 13, 2025

Mali G610 & 710 GPU Driver for Termux

C 62 24 Updated Sep 22, 2024

Raspberry Pi 4 UEFI Firmware Images

1,253 152 Updated Feb 21, 2025

Reverse engineered Linux driver for the Apple Neural Engine (ANE).

C 399 18 Updated Mar 12, 2024

MIOpenGEMM is now deprecated

C++ 62 11 Updated Jul 17, 2023

Open source version of RV, the Sci-Tech award-winning media review and playback software.

C++ 620 156 Updated Mar 13, 2025

Implementation of OpenCL 3.0 on Vulkan

C++ 380 43 Updated Mar 4, 2025

An Xlib compatibility layer implemented on top of the Haiku API, in order to run X11 applications on Haiku without an X server.

C 93 3 Updated Aug 29, 2024

Documentation of NVIDIA chip/hardware interfaces

C 1,267 93 Updated Sep 10, 2024

Words of the same length with related meanings.

Python 345 20 Updated Mar 3, 2025

OpenCL implementation running on the VideoCore IV GPU of the Raspberry Pi models

C++ 736 81 Updated Sep 14, 2022