Stars
Python tool for converting files and office documents to Markdown.
Muon optimizer for neural networks: >30% extra sample efficiency, <3% wallclock overhead
SGLang is a fast serving framework for large language models and vision language models.
MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts
A reading list on LLM-based Synthetic Data Generation 🔥
Data and tools for generating and inspecting OLMo pre-training data.
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
Video+code lecture on building nanoGPT from scratch
GLM-4 series: Open Multilingual Multimodal Chat LMs
An efficient video loader for deep learning with smart shuffling that's super easy to digest
Official GitHub repo for the paper "Compression Represents Intelligence Linearly" [COLM 2024]
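🚀 KIMI AI long-context LLM reverse-engineered API [specialty: long-text interpretation and summarization]. Supports high-speed streaming output, agent conversations, web search, the exploration edition, the K1 thinking model, long-document interpretation, image parsing, and multi-turn dialogue, with zero-configuration deployment, multi-token support, and automatic cleanup of session traces. For testing only; for commercial use, please go to the official open platform.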
The hub for EleutherAI's work on interpretability and learning dynamics
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
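"Hello Algo": a data structures and algorithms tutorial with animated illustrations and one-click runnable code. Supports Python, Java, C++, C, C#, JS, Go, Swift, Rust, Ruby, Kotlin, TS, and Dart. Simplified and Traditional Chinese versions are updated in sync; English version ongoing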
A library for advanced large language model reasoning
A minimal GPU design in Verilog to learn how GPUs work from the ground up