Stars
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
🔊 Text-Prompted Generative Audio Model
Google Research
stable diffusion webui colab
This repository contains the source code for the paper First Order Motion Model for Image Animation
FinGPT: Open-Source Financial Large Language Models! Revolutionize 🔥 We release the trained model on HuggingFace.
Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
High-Resolution Image Synthesis with Latent Diffusion Models
Foundational Models for State-of-the-Art Speech and Text Translation
🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
PyTorch code and models for the DINOv2 self-supervised learning method.
FaceChain is a deep-learning toolchain for generating your Digital-Twin.
🦙 LaMa Image Inpainting, Resolution-robust Large Mask Inpainting with Fourier Convolutions, WACV 2022
Image restoration with neural networks but without learning.
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…
A series of large language models trained from scratch by developers @01-ai
Code samples used on cloud.google.com
A real-time approach for mapping all human pixels of 2D RGB images to a 3D surface-based model of the body
A unified framework for 3D content generation.
Accepted as [NeurIPS 2024] Spotlight Presentation Paper
serp-ai / bark-with-voice-clone
Forked from suno-ai/bark. 🔊 Text-prompted Generative Audio Model - With the ability to clone voices
A course on aligning smol models.
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.
PyTorch implementation of OpenPose, including hand and body pose estimation.
MTEB: Massive Text Embedding Benchmark
[ACL 2024] An Easy-to-use Knowledge Editing Framework for LLMs.