Stars
A latent text-to-image diffusion model
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
LAVIS - A One-stop Library for Language-Vision Intelligence
Official inference library for Mistral models
Image restoration with neural networks but without learning.
Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) with Stable Diffusion
Efficient Image Captioning code in Torch, runs on GPU
Reference models and tools for Cloud TPUs.
Code repo for realtime multi-person pose estimation in CVPR'17 (Oral)
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
Text recognition (optical character recognition) with deep learning methods, ICCV 2019
Face Depixelizer based on "PULSE: Self-Supervised Photo Upsampling via Latent Space Exploration of Generative Models" repository.
Efficient few-shot learning with Sentence Transformers
Python API client for AUTOMATIC1111/stable-diffusion-webui
Detecting Text in Natural Image with Connectionist Text Proposal Network (ECCV'16)
Resources for semantic segmentation based on deep learning models
A Python module for computing the Structural Similarity Image Metric (SSIM)
This is an MXNet version of Realtime_Multi-Person_Pose_Estimation; the original code is at https://github.com/ZheC/Realtime_Multi-Person_Pose_Estimation