Stars
Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, Du…
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…
A launch point for your personal nvim configuration
Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models.
[TMLR] A curated list of language modeling research for code (and other software engineering activities), plus related datasets.
Redset is a dataset containing three months' worth of user query metadata that ran on a selected sample of instances in the Amazon Redshift fleet. We provide query metadata for 200 provisioned and s…
Neural Networks: Zero to Hero
Open, Multi-modal Catalog for Data & AI
llama3 implementation one matrix multiplication at a time
Pseudonymization with Cryptography
Curated list of project-based tutorials
PRQL is a modern language for transforming data — a simple, powerful, pipelined SQL replacement
A high-throughput and memory-efficient inference and serving engine for LLMs
Nessie: Transactional Catalog for Data Lakes with Git-like semantics
Developer-friendly, serverless vector database for AI applications. Easily add long-term memory to your LLM apps!
The repo for SOSP23 paper: FIFO queues are all you need for cache evictions
BtrBlocks: Efficient Columnar Compression for Data Lakes (SIGMOD 2023 Paper)
Fast Static Symbol Table (FSST): efficient random-access string compression
Static reflection for enums (to string, from string, iteration) for modern C++; works with any enum type without any macro or boilerplate code
Self-Driving Database Management System from Carnegie Mellon University
Apply a coding style with clang-format only to new code added to an existing code base.
🦜🔗 Build context-aware reasoning applications
Stable Diffusion web UI
Code and documentation to train Stanford's Alpaca models, and generate the data.
eBPF-based Networking, Security, and Observability