Stars
High-quality, streaming Speech-to-Speech interactive agent in a single file. A streaming, full-duplex voice interaction prototype agent implemented in just one file!
Speech To Speech: an effort for an open-sourced and modular GPT4-o
This repo is meant to serve as a guide for Machine Learning/AI technical interviews.
Node.js Production Process Manager with a built-in Load Balancer.
SRS is a simple, high-efficiency, real-time media server supporting RTMP, WebRTC, HLS, HTTP-FLV, HTTP-TS, SRT, MPEG-DASH, and GB28181.
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
A geometry-shader-based, global CUDA sorted high-performance 3D Gaussian Splatting rasterizer. Can achieve a 5-10x speedup in rendering compared to the vanilla diff-gaussian-rasterization.
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…
Real-time interactive streaming digital human
Digital Avatar Conversational System - Linly-Talker. 😄✨ Linly-Talker is an intelligent AI system that combines large language models (LLMs) with visual models to create a novel human-AI interaction…
Curated list of data science interview questions and answers
Official implementation of “GaussianTalker: Real-Time High-Fidelity Talking Head Synthesis with Audio-Driven 3D Gaussian Splatting” by Kyusun Cho, Joungbin Lee, Heeji Yoon, Yeobin Hong, Jaehoon Ko,…
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
Streamlit — A faster way to build and share data apps.
The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.
The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.
OpenFace – a state-of-the-art tool intended for facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation.
Official implementation for "Generating Diverse and Natural 3D Human Motions from Texts (CVPR2022)."
[CVPR 2024] Official implementation of the paper "Towards Versatile Human-Human Interaction Analysis"
MusePose: a Pose-Driven Image-to-Video Framework for Virtual Human Generation
4DHumans: Reconstructing and Tracking Humans with Transformers
Systems design is the process of defining the architecture, modules, interfaces, and data for a system to satisfy specified requirements. Systems design could be seen as the application of systems …
ECCV2020 paper "Whole-Body Human Pose Estimation in the Wild"
High-resolution models for human tasks.