Stars
🦜🔗 Build context-aware reasoning applications
A latent text-to-image diffusion model
🔊 Text-Prompted Generative Audio Model
Google Research
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supporting…
A multi-voice TTS system trained with an emphasis on quality
High-Resolution Image Synthesis with Latent Diffusion Models
Neural Networks: Zero to Hero
Foundational Models for State-of-the-Art Speech and Text Translation
PyTorch code and models for the DINOv2 self-supervised learning method.
Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) with Stable Diffusion
A collection of notebooks/recipes showcasing some fun and effective ways of using Claude.
Using Low-rank adaptation to quickly fine-tune diffusion models.
CoreNet: A library for training deep neural networks
Flax is a neural network library for JAX that is designed for flexibility.
[SIGGRAPH Asia 2022] VToonify: Controllable High-Resolution Portrait Video Style Transfer
Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) by way of Textual Inversion (https://arxiv.org/abs/2208.01618) for Stable Diffusion (https://arxiv.org/abs/2112.10752). Tweaks focuse…
A simple notebook demonstrating prompt-based music generation via Mubert API
Repository of Jupyter notebook tutorials for teaching the Deep Learning Course at the University of Amsterdam (MSc AI), Fall 2023
Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.
Metric depth estimation from a single image
Training LLMs with QLoRA + FSDP
Simple image captioning model
[ICCV 2023 Oral] "FateZero: Fusing Attentions for Zero-shot Text-based Video Editing"