Stars
BAML is a language that helps you get structured data from LLMs, with the best DX possible. Works with all languages. Check out the promptfiddle.com playground.
Easy, fast and very cheap training and inference on AWS Trainium and Inferentia chips.
Powering AWS purpose-built machine learning chips. Blazing fast and cost effective, natively integrated into PyTorch and TensorFlow and integrated with your favorite AWS services
High-performance In-browser LLM Inference Engine
Efficient, Flexible and Portable Structured Generation
How to optimize algorithms in CUDA.
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
This project aims to collect the latest "call for reviewers" links from various top CS/ML/AI conferences/journals
Official code repo for the O'Reilly Book - "Hands-On Large Language Models"
Distributed stream processing engine in Rust
A Vector Database Tutorial (over CMU-DB's BusTub system)
Chinese translation of The Rust Programming Language (Book)
Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton
SGLang is a fast serving framework for large language models and vision language models.
Semantic cache for LLMs. Fully integrated with LangChain and llama_index.
FlashInfer: Kernel Library for LLM Serving
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism
Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking multi-modal AI agents.
Sotopia: an Open-ended Social Learning Environment (ICLR 2024 spotlight)
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high performance.
Fast and memory-efficient exact attention
[TMLR 2024] Efficient Large Language Models: A Survey
Ongoing research training transformer models at scale