zhaoshitian
🎯
Focusing

Organizations

@Alpha-VLLM


A conversational Q&A agent configuration system, self-hosted deployment solutions, and a convenient all-in-one application SDK, allowing you to create intelligent Q&A bots for your GitHub repositories

TypeScript 1,083 59 Updated Jan 17, 2025

Testing baseline LLM performance across various models

Python 201 18 Updated Dec 27, 2024

GlyphDraw2: Automatic Generation of Complex Glyph Posters with Diffusion Models and Large Language Models

Python 57 4 Updated Jul 11, 2024

LeanUniverse: A Library for Consistent and Scalable Lean4 Dataset Management

Python 56 1 Updated Jan 15, 2025

💻 A curated list of papers and resources for multi-modal Graphical User Interface (GUI) agents.

431 28 Updated Jan 17, 2025

Seamlessly integrate state-of-the-art transformer models into robotics stacks

Python 180 21 Updated Jan 10, 2025

A flexible and efficient codebase for training visually-conditioned language models (VLMs)

Python 544 297 Updated Jul 4, 2024

OpenVLA: An open-source vision-language-action model for robotic manipulation.

Python 1,736 223 Updated Dec 11, 2024

Building Open-Ended Embodied Agents with Internet-Scale Knowledge

Java 1,862 167 Updated Mar 18, 2024

Inference script for Oasis 500M

Python 1,703 146 Updated Nov 8, 2024

Implementation of a JEPA Image World Model, trained on OpenAI's VPT Minecraft contractor dataset.

Python 6 Updated Apr 19, 2024

WorldModel is a MaskGIT model trained on 8x8x8 Minecraft voxel volumes. Beyond generating blocks from scratch, it excels in filling spaces based on neighboring blocks, ensuring seamless integration…

Python 6 Updated Sep 12, 2023

CGL-Dataset v2 for huggingface datasets

Python 4 Updated Sep 20, 2024

Repo of paper "Free Process Rewards without Process Labels"

Python 101 2 Updated Jan 16, 2025

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

6,259 346 Updated Jan 14, 2025

Build AI-powered Agents for Twitter🐣

Python 146 26 Updated Aug 7, 2024

Recipes to train reward model for RLHF.

Python 1,094 76 Updated Dec 12, 2024

An MIT-licensed, deployable starter kit for building and customizing your own version of AI town - a virtual town where AI characters live, chat and socialize.

TypeScript 7,974 747 Updated Jan 2, 2025

Scalable RL solution for advanced reasoning of language models

Python 899 56 Updated Jan 17, 2025

Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets

Python 4,190 392 Updated Jan 17, 2025

③[ICML2024] [IQA, IAA, VQA] All-in-one Foundation Model for visual scoring. Can efficiently fine-tune to downstream datasets.

Python 336 24 Updated Aug 12, 2024

[ECCV 2024] Official PyTorch Implementation of A Comprehensive Study of Multimodal Large Language Models for Image Quality Assessment

Python 64 1 Updated Jul 20, 2024

[LMM + AIGC] What do we expect from LMMs as AIGI evaluators and how do they perform?

143 3 Updated Sep 27, 2024

Python 7,722 505 Updated Apr 14, 2024

[ECCV2024] This is an official inference code of the paper "Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering" and "Glyph-ByT5-v2: A Strong Aesthetic Baseline for Accurate Mu…

Jupyter Notebook 536 23 Updated Jul 13, 2024

Large Concept Models: Language modeling in a sentence representation space

Python 1,739 140 Updated Jan 16, 2025

VideoVAE+: Large Motion Video Autoencoding with Cross-modal Video VAE

Python 258 7 Updated Jan 19, 2025

Official repo and evaluation implementation of VSI-Bench

Python 332 21 Updated Jan 17, 2025

[NeurIPS 2024] OmniTokenizer: one model and one weight for image-video joint tokenization.

Python 270 7 Updated Jul 9, 2024