Huang Songlin huangs0

I'm Songlin, an MPhil student at HKU AIoT Lab (@aiot-lab) working on Federated Learning. My long-term vision is to build applicable, fast, and trustworthy AI.

7 followers · 17 following

@aiot-lab
Hong Kong
huangs0.github.io
in/huang-songlin
https://huangs0.notion.site

Stars

sakjain92 / Fractional-GPUs

Splits a single NVIDIA GPU into multiple partitions with complete compute and memory isolation (with respect to performance) between the partitions

C 157 39 Updated Apr 21, 2019

microsoft / NPKit

NCCL Profiling Kit

Python 127 12 Updated Jul 1, 2024

xmithd / aoc2024

Rust 1 Updated Jan 15, 2025

Cpp-Club / Cxx_HOPL4_zh

Chinese translation of Bjarne Stroustrup's HOPL4 paper

2,241 398 Updated Dec 10, 2024

flashinfer-ai / flashinfer

FlashInfer: Kernel Library for LLM Serving

Cuda 1,886 188 Updated Jan 29, 2025

pytorch-labs / applied-ai

Applied AI experiments and examples for PyTorch

Python 216 22 Updated Jan 21, 2025

JimZeyuYang / GPU_Power_Benchmark

Microbenchmark that unveils the mechanisms behind power readings reported by nvidia-smi/NVML on your NVIDIA GPU.

C++ 11 Updated Dec 12, 2024

gpoore / minted

minted is a LaTeX package that provides syntax highlighting using the Pygments library. Highlighted source code can be customized using fancyvrb.

TeX 1,775 128 Updated Nov 25, 2024

xdit-project / xDiT

xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism

Python 1,195 97 Updated Jan 24, 2025

vbpf / ebpf-verifier

eBPF verifier based on abstract interpretation

C++ 400 45 Updated Jan 25, 2025

iovisor / ubpf

Forked from rlane/ubpf

Userspace eBPF VM

C 852 141 Updated Jan 29, 2025

NVIDIA / cuda-gdb

CUDA GDB

C 192 56 Updated Aug 23, 2024

louishhy / min-server

Learning through minimalistic server implementations.

Python 10 Updated Oct 20, 2024

DefTruth / CUDA-Learn-Notes

📚200+ Tensor/CUDA Cores Kernels, ⚡️flash-attn-mma, ⚡️hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS/FA2 🎉🎉).

Cuda 2,159 229 Updated Jan 27, 2025

test-jitcomp / Artemis

A JIT Compiler Fuzzer for JVMs via CSE/JoNM in "Validating JIT Compilers via Compilation Space Exploration" (SOSP'23)

Java 52 2 Updated Dec 19, 2024

runtimeverification / k

K Framework Tools 7.0

Python 465 152 Updated Jan 28, 2025

WebAssembly / wabt

The WebAssembly Binary Toolkit

C++ 7,037 722 Updated Jan 18, 2025

B-Con / crypto-algorithms

Basic implementations of standard cryptography algorithms, like AES and SHA-1.

C 1,877 698 Updated Dec 28, 2020

WAVM / WAVM

WebAssembly Virtual Machine

C++ 2,668 226 Updated Feb 14, 2024

wasm3 / wasm3

🚀 A fast WebAssembly interpreter and the most universal WASM runtime

C 7,417 473 Updated Sep 10, 2024

swiftlang / swift-corelibs-libdispatch

The libdispatch Project (a.k.a. Grand Central Dispatch), for concurrency on multicore hardware

C 2,489 465 Updated Jan 29, 2025

cloudflare / workers-rs

Write Cloudflare Workers in 100% Rust via WebAssembly

Rust 2,704 300 Updated Jan 29, 2025

containerd / containerd

An open and reliable container runtime

Go 17,884 3,521 Updated Jan 27, 2025

Conqueror712 / CUDA-Simulator

A self-developed version of the user-mode CUDA emulator project and a learning repository for Rust

Rust 4 2 Updated Sep 22, 2023

troydhanson / uthash

C macros for hash tables and more

C 4,253 936 Updated Oct 15, 2024

microsoft / proxy

Proxy: Next Generation Polymorphism in C++

C++ 2,430 159 Updated Jan 27, 2025

inhocho89 / llvm14-ldb

Latency Debug compatible LLVM compiler based on LLVM 14

14 3 Updated Apr 15, 2024

aiot-lab / TADAR

Thermal Array-based Detection and Ranging for Privacy-Preserving Human Sensing

Jupyter Notebook 7 Updated Oct 22, 2024

Ghamry0x2 / Page-Replacement-Algorithms

A simple console application implementing the FIFO, LRU, LFU, Second-Chance, Enhanced Second-Chance, and Optimal page replacement algorithms, built in Java.

Java 1 Updated Dec 24, 2018

SunsetQuest / CudaPAD

CudaPAD is a PTX/SASS viewer for NVIDIA CUDA kernels that provides an on-the-fly view of the assembly.

C# 110 16 Updated Jan 17, 2023