Starred repositories
An extensible framework that instruments Python programs at runtime
The book "Performance Analysis and Tuning on Modern CPU"
Kata Containers is an open source project and community working to build a standard implementation of lightweight Virtual Machines (VMs) that feel and perform like containers, but provide the workl…
Fast and memory-efficient exact attention
SGLang is a fast serving framework for large language models and vision language models.
A high-throughput and memory-efficient inference and serving engine for LLMs
A streamlined and customizable framework for efficient large model evaluation and performance benchmarking
NVIDIA driver packaging for RHEL
Profiling Tools Interfaces for GPU (PTI for GPU) is a set of getting-started documentation and a tools library that make it easy to begin performance analysis on Intel(R) Processor Graphics
Effing package management! Build packages for multiple platforms (deb, rpm, etc) with great ease and sanity.
⚡ Energy consumption metrology agent. Let "scaph" dive and bring back the metrics that will help you make your systems and applications more sustainable!
Funding rate arbitrage on cryptocurrency.
Source Code for 'Foundations of Libvirt Development' by W. David Ashley
Dynolog is a telemetry daemon for performance monitoring and tracing. It exports metrics from different components in the system, such as the Linux kernel, CPU, disks, Intel PT, GPUs, etc. Dynolog also …
A CPU+GPU Profiling library that provides access to timeline traces and hardware performance counters.
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
Ascend PyTorch adapter (torch_npu). Mirror of https://gitee.com/ascend/pytorch
A playbook for systematically maximizing the performance of deep learning models.
📚 A curated list of Awesome LLM/VLM Inference Papers with code: WINT8/4, FlashAttention, PagedAttention, MLA, Parallelism, etc.
How to optimize some algorithms in CUDA.
A collection of metrics to profile a single deep learning model or compare two different deep learning models
Deep Learning Framework Performance Profiling Toolkit
A Python-level JIT compiler designed to make unmodified PyTorch programs faster.
Open-source implementation of Google Vizier for hyperparameter tuning