akihironitta

Akihiro Nitta akihironitta

144 followers · 326 following

@kumo-ai @pyg-team
Mountain View, California
17:23 (UTC -08:00)
https://www.akihironitta.com

Achievements

x4 x4 x4

Highlights

1 security advisory credit

Lists (7)

Stars

seemethere / buildkite-knowledge-base

Python 1 Updated Dec 23, 2025

NVIDIA / cuda-tile

CUDA Tile IR is an MLIR-based intermediate representation and compiler infrastructure for CUDA kernel optimization, focusing on tile-based computation patterns and optimizations targeting NVIDIA te…

MLIR 333 24 Updated Dec 20, 2025

halide / Halide

a language for fast, portable data-parallel computation

C++ 6,483 1,096 Updated Dec 24, 2025

Noumena-Network / nmoe

MoE training for Me and You and maybe other people

Python 293 25 Updated Dec 17, 2025

NVIDIA / cutile-python

cuTile is a programming model for writing parallel kernels for NVIDIA GPUs

Python 1,666 86 Updated Dec 20, 2025

meta-pytorch / autoparallel

An experimental implementation of compiler-driven automatic sharding of models across a given device mesh.

Python 48 13 Updated Dec 23, 2025

meta-pytorch / MSLK

MSLK (Meta Superintelligence Labs Kernels) is a collection of PyTorch GPU operator libraries that are designed and optimized for GenAI training and inference, such as FP8 row-wise quantization and …

Python 15 9 Updated Dec 25, 2025

triton-inference-server / vllm_backend

Python 322 34 Updated Dec 24, 2025

triton-inference-server / pytorch_backend

The Triton backend for the PyTorch TorchScript models.

C++ 168 64 Updated Dec 22, 2025

gaogaotiantian / viztracer

A debugging and profiling tool that can trace and visualize python code execution

Python 7,464 467 Updated Dec 24, 2025

meta-pytorch / torchcomms

torchcomms: a modern PyTorch communications API

C++ 313 49 Updated Dec 24, 2025

InferenceMAX / InferenceMAX

Open Source Continuous Inference Benchmarking - GB200 NVL72 vs MI355X vs B200 vs H200 vs MI325X & soon™ TPUv6e/v7/Trainium2/3/GB300 NVL72 - DeepSeek 670B MoE, GPTOSS

Python 403 67 Updated Dec 25, 2025