
[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-sim…

Python 6,158 418 Updated Dec 6, 2024

A digital human that everyone can use

Python 790 163 Updated Dec 9, 2024

Document to Markdown OCR library with Llama 3.2 vision

TypeScript 1,707 139 Updated Nov 12, 2024

[NeurIPS 2024🔥] DreamClear: High-Capacity Real-World Image Restoration with Privacy-Safe Dataset Curation

Python 810 45 Updated Dec 12, 2024

Official implementation of the paper "TANGO: Co-Speech Gesture Video Reenactment with Hierarchical Audio-Motion Embedding and Diffusion Interpolation"

Python 513 89 Updated Oct 29, 2024

A suite of image and video neural tokenizers

Python 966 23 Updated Nov 13, 2024

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 20,685 2,281 Updated Aug 12, 2024

This repository gives the official implementation of Realistic and Efficient Face Swapping: A Unified Approach with Diffusion Models (WACV 2025)

Python 58 8 Updated Oct 28, 2024

[IJCAI'24] Beyond Alignment: Blind Video Face Restoration via Parsing-Guided Temporal-Coherent Transformer

Python 189 24 Updated Sep 10, 2024

The official CLIP training codebase of Inf-CL: "Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss". A highly memory-efficient CLIP training scheme.

Python 211 9 Updated Oct 30, 2024

Janus-Series: Unified Multimodal Understanding and Generation Models

Python 1,228 57 Updated Nov 13, 2024

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 35,207 4,346 Updated Nov 18, 2024

AI-based tool for removing hard-coded subtitles and text-like watermarks from images and videos. It outputs subtitle-free, watermark-free files at the original resolution and runs locally, with no third-party API required.

Python 4,763 622 Updated Oct 23, 2024

NIST_FRVT Top 1🏆 Face Recognition, Liveness Detection(Face Anti-Spoof), Face Attribute Analysis Linux Server SDK Demo ☑️ Face Recognition ☑️ Face Matching ☑️ Face Liveness Detection ☑️ Face Identif…

C 761 290 Updated Dec 4, 2024

An ultra-lightweight digital human model that can run in real time on mobile devices

Python 1,247 191 Updated Nov 13, 2024

A HeadSwap project that swaps the whole head, not only the face

Jupyter Notebook 35 9 Updated Dec 28, 2022

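This project aims to share the technical principles behind large models and hands-on experience with them (large-model engineering and deploying large-model applications)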

HTML 11,742 1,230 Updated Dec 11, 2024

Structured Text Generation

Python 10,038 520 Updated Dec 13, 2024

[3DV'25] 3D Reconstruction with Spatial Memory

Python 798 36 Updated Nov 29, 2024

EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning

Python 3,243 376 Updated Dec 10, 2024

OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340

Jupyter Notebook 3,110 249 Updated Dec 6, 2024

Research reports and industry research reports, updated daily on a schedule; follow the official account to view them

Python 71 9 Updated Dec 14, 2024
JavaScript 9 Updated Nov 20, 2024

1000+ DevOps Bash Scripts - AWS, GCP, Kubernetes, Docker, CI/CD, APIs, SQL, PostgreSQL, MySQL, Hive, Impala, Kafka, Hadoop, Jenkins, GitHub, GitLab, BitBucket, Azure DevOps, TeamCity, Spotify, MP3,…

Shell 5,883 1,103 Updated Dec 12, 2024

Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation

Python 9,595 1,313 Updated Sep 14, 2024

This React component is used to render Markdown into a beautiful poster image, with support for copying as an image. Md to Poster/Image/Quote/Card/Instagram/Twitter/Facebook...

TypeScript 748 63 Updated Oct 10, 2024

Real time interactive streaming digital human

Python 4,105 599 Updated Dec 8, 2024

Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Python 6,306 548 Updated Dec 8, 2024

⚡️HivisionIDPhotos: a lightweight and efficient AI tool and algorithm for making ID photos.

Python 13,485 1,409 Updated Nov 20, 2024

Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Python 3,573 223 Updated Dec 4, 2024