Ke Fang mazzzystar

☂️

Focusing

Computer Vision & Generative AI. "We create the world we live in."

725 followers · 441 following

Stars

Stability-AI / stable-codec

A family of state-of-the-art Transformer-based audio codecs for low-bitrate high-quality audio coding.

116 1 Updated Dec 3, 2024

naver-ai / usdm

Official PyTorch implementation of "Paralinguistics-Aware Speech-Empowered LLMs for Natural Conversation" (NeurIPS 2024)

Python 49 1 Updated Dec 3, 2024

continuedev / continue

⏩ Continue is the leading open-source AI code assistant. You can connect any models and any context to build custom autocomplete and chat experiences inside VS Code and JetBrains

TypeScript 19,752 1,743 Updated Dec 5, 2024

Yuanshi9815 / OminiControl

A minimal and universal controller for FLUX.1.

Python 738 38 Updated Nov 28, 2024

xxyQwQ / ComfyBench

Implementation for the paper "ComfyBench: Benchmarking LLM-based Agents in ComfyUI for Autonomously Designing Collaborative AI Systems".

Python 129 10 Updated Nov 23, 2024

richards199999 / Thinking-Claude

Let your Claude be able to think

TypeScript 9,253 1,065 Updated Dec 3, 2024

ddlBoJack / Awesome-Speech-Language-Model

Paper, Code and Resources for Speech Language Model and End2End Speech Dialogue System.

124 10 Updated Nov 10, 2024

jishengpeng / WavChat

A Survey of Spoken Dialogue Models (60 pages)

181 7 Updated Nov 28, 2024

neuralmagic / guidellm

Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs

Python 172 12 Updated Dec 4, 2024

gregpr07 / browser-use

Make websites accessible for AI agents

Python 2,830 222 Updated Dec 4, 2024

Mebius1916 / NextTalk_web

The web version of the NextTalk project

TypeScript 35 1 Updated Nov 12, 2024

edwko / OuteTTS

Interface for OuteTTS models.

Python 724 50 Updated Dec 4, 2024

instantX-research / InstantIR

InstantIR: Blind Image Restoration with Instant Generative Reference 🔥

Python 388 27 Updated Nov 14, 2024

ali-vilab / In-Context-LoRA

Official repository of In-Context LoRA for Diffusion Transformers

1,243 61 Updated Nov 17, 2024

Haiyang-W / TokenFormer

Official Implementation of TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters

Python 420 30 Updated Nov 12, 2024

lifeiteng / NotebookTTS

Text-To-Speech for NotebookLM

23 Updated Nov 27, 2024

etched-ai / open-oasis

Inference script for Oasis 500M

Python 1,592 133 Updated Nov 8, 2024

google / sequence-layers

Python 24 Updated Oct 30, 2024

GAIR-NLP / O1-Journey

O1 Replication Journey: A Strategic Progress Report – Part I

1,591 46 Updated Nov 30, 2024

shallowdream204 / DreamClear

[NeurIPS 2024🔥] DreamClear: High-Capacity Real-World Image Restoration with Privacy-Safe Dataset Curation

Python 797 42 Updated Dec 3, 2024

bytedance / Hybrid-SD

Python 16 Updated Oct 30, 2024

Sakshi113 / MMAU

Python 26 1 Updated Nov 9, 2024

bghira / SimpleTuner

A general fine-tuning kit geared toward diffusion models.

Python 1,863 176 Updated Dec 4, 2024

XLabs-AI / x-flux-comfyui

Python 1,165 74 Updated Oct 30, 2024

OpenPipe / best-hn

Jupyter Notebook 7 Updated Nov 12, 2024

anliyuan / Ultralight-Digital-Human

An ultra-lightweight digital human model that can run in real time on mobile devices

Python 1,169 179 Updated Nov 13, 2024

om-ai-lab / OmAgent

A Hub for the State-of-the-art Language and Multimodal Agents

Python 1,353 105 Updated Dec 4, 2024

JishengBai / AudioSetCaps

A 6-million Audio-Caption Paired Dataset Built with an LLM- and ALM-based Automatic Pipeline

Python 85 3 Updated Dec 3, 2024

KoljaB / RealtimeTTS

Converts text to speech in realtime

Python 2,084 210 Updated Nov 30, 2024

haoheliu / audioldm_eval

This toolbox aims to unify audio generation model evaluation for easier comparison.

Python 307 31 Updated Sep 29, 2024
