Lists (24)
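Model training and deployment
GPT (paper polishing)
Video editing and compositing
AIGC resources
Face swapping (DeepFake, FaceSwap)
Virtual humans (Talking Head)
AIGC image and video generation projects
Useful projects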
Face
CNN
Algorithm and model learning resources
CV
audio
for work
gan
img2img
dataset
AIGC projects
3D
CLIP-related
AI
VLM (vision-text multimodal)
Agent
video
Stars
[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-sim…
Document to Markdown OCR library with Llama 3.2 vision
[NeurIPS 2024🔥] DreamClear: High-Capacity Real-World Image Restoration with Privacy-Safe Dataset Curation
Official implementation of the paper "TANGO: Co-Speech Gesture Video Reenactment with Hierarchical Audio-Motion Embedding and Diffusion Interpolation"
A suite of image and video neural tokenizers
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
This repository gives the official implementation of Realistic and Efficient Face Swapping: A Unified Approach with Diffusion Models (WACV 2025)
[IJCAI'24] Beyond Alignment: Blind Video Face Restoration via Parsing-Guided Temporal-Coherent Transformer
The official CLIP training codebase of Inf-CL: "Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss". A highly memory-efficient CLIP training scheme.
Janus-Series: Unified Multimodal Understanding and Generation Models
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
AI-based tool for removing hard-coded subtitles and text-like watermarks from images and videos, producing output files at the original resolution. Runs locally, with no third-party API required.
NIST_FRVT Top 1🏆 Face Recognition, Liveness Detection(Face Anti-Spoof), Face Attribute Analysis Linux Server SDK Demo ☑️ Face Recognition ☑️ Face Matching ☑️ Face Liveness Detection ☑️ Face Identif…
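A HeadSwap project that swaps the whole head, not only the face
This project shares the technical principles and hands-on experience behind large models (large-model engineering and deploying large-model applications)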
[3DV'25] 3D Reconstruction with Spatial Memory
EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
1000+ DevOps Bash Scripts - AWS, GCP, Kubernetes, Docker, CI/CD, APIs, SQL, PostgreSQL, MySQL, Hive, Impala, Kafka, Hadoop, Jenkins, GitHub, GitLab, BitBucket, Azure DevOps, TeamCity, Spotify, MP3,…
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
This React component is used to render Markdown into a beautiful poster image, with support for copying as an image. Md to Poster/Image/Quote/Card/Instagram/Twitter/Facebook...
Real time interactive streaming digital human
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
⚡️HivisionIDPhotos: a lightweight and efficient AI ID photo tool.
Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.