Stars
Platform to experiment with the AI Software Engineer. Terminal-based. NOTE: Very different from https://gptengineer.app
Up-to-date (2023) collection of technical interview questions from Alibaba, Tencent, Baidu, Meituan, ByteDance, and other companies, with answers and analysis from expert interviewers.
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
A toolkit for developing and comparing reinforcement learning algorithms.
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
A high-throughput and memory-efficient inference and serving engine for LLMs
Open-Sora: Democratizing Efficient Video Production for All
Chinese translation of "Designing Data-Intensive Applications" (DDIA)
Fast and memory-efficient exact attention
Ongoing research training transformer models at scale
Semantic cache for LLMs. Fully integrated with LangChain and llama_index.
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
SGLang is a fast serving framework for large language models and vision language models.
Open-source observability for your LLM application, based on OpenTelemetry
Google DeepMind's software stack for physics-based simulation and Reinforcement Learning environments, using MuJoCo.
A book about Pythonic application architecture patterns for managing complexity. Cosmos is the opposite of chaos, you see. O'Reilly wouldn't actually let us call it "Cosmic Python", though.
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
A best practices guide for day 2 operations, including operational excellence, security, reliability, performance efficiency, and cost optimization.
Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism
Just Code! Algorithm problems for interview practice, currently covering ByteDance interview questions, LeetCode, and 剑指 Offer, with more continuously being added ⭐
Official Repo for ICML 2024 paper "Executable Code Actions Elicit Better LLM Agents" by Xingyao Wang, Yangyi Chen, Lifan Yuan, Yizhe Zhang, Yunzhu Li, Hao Peng, Heng Ji.
Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking of multi-modal AI agents.
Powering AWS purpose-built machine learning chips. Blazing fast and cost-effective, natively integrated into PyTorch and TensorFlow and integrated with your favorite AWS services
AIFoundation covers what happens when AI systems meet large models: the full-stack core technologies for system-level support of large-model training and inference, from the bottom layer up.