View zhaohengyuan1's full-sized avatar
316 results for source starred repositories

Eko (Eko Keeps Operating) - Build Production-ready Agentic Workflow with Natural Language - eko.fellou.ai

TypeScript 818 54 Updated Jan 17, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 34,088 5,242 Updated Jan 21, 2025

AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.

Python 5,399 594 Updated Aug 8, 2024

The Cradle framework is a first attempt at General Computer Control (GCC). Cradle supports agents to ace any computer task by enabling strong reasoning abilities, self-improvement, and skill curatio…

Python 1,966 170 Updated Nov 7, 2024

🤠 Agent-as-a-Judge and DevAI dataset

Python 310 40 Updated Jan 20, 2025

ChatArena (or Chat Arena) provides Multi-Agent Language Game Environments for LLMs. The goal is to develop communication and collaboration capabilities of AIs.

Python 1,400 135 Updated May 27, 2024

VisualWebArena is a benchmark for multimodal agents.

Python 274 52 Updated Nov 9, 2024

O1 Replication Journey

1,897 58 Updated Jan 14, 2025

FQGAN: Factorized Visual Tokenization and Generation

Python 39 Updated Jan 5, 2025

Code for ROICtrl: Boosting Instance Control for Visual Generation

Python 100 Updated Dec 10, 2024

Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use.

Jupyter Notebook 847 47 Updated Jan 20, 2025

[NeurIPS 2024] One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos

Python 95 2 Updated Dec 26, 2024

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.

Python 28,796 2,741 Updated Jan 21, 2025

Out-of-the-box (OOTB) GUI Agent for Windows and macOS

Python 1,172 104 Updated Jan 1, 2025

[ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large multimodal models (LMMs) such as GPT-4V(ision).

Python 691 90 Updated Jan 15, 2025

MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone

Python 17,051 1,219 Updated Jan 20, 2025

(ECCV 2024) Empowering Multimodal Large Language Model as a Powerful Data Generator

Python 103 Updated Oct 17, 2024

connecting humans and agents

Python 67 7 Updated Dec 6, 2024

(NeurIPS 2024) Learning to Visual Question Answering, Asking and Assessment

Python 67 2 Updated Nov 7, 2024

[NeurIPS 2024] EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models.

Python 45 Updated Oct 14, 2024

[ICML 2024 Oral] Official code repository for MLLM-as-a-Judge.

Python 62 4 Updated Nov 27, 2024

[NeurIPS2023] Official implementation and model release of the paper "What Makes Good Examples for Visual In-Context Learning?"

Python 170 7 Updated Mar 4, 2024

CVPR and NeurIPS poster examples and templates. May we have in-person poster session soon!

1,559 147 Updated May 9, 2023

Official implementation of the paper "MMInA: Benchmarking Multihop Multimodal Internet Agents"

Python 40 3 Updated Apr 18, 2024

A programming framework for agentic AI 🤖 PyPi: autogen-agentchat Discord: https://aka.ms/autogen-discord Office Hour: https://aka.ms/autogen-officehour

Python 37,979 5,530 Updated Jan 21, 2025

Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference

Python 4,430 232 Updated Jun 14, 2024

Set-of-Mark Prompting for GPT-4V and LMMs

Python 1,246 101 Updated Aug 19, 2024

[ICLR 2024] SWE-bench: Can Language Models Resolve Real-world Github Issues?

Python 2,304 398 Updated Jan 17, 2025

The paper list of the 86-page paper "The Rise and Potential of Large Language Model Based Agents: A Survey" by Zhiheng Xi et al.

7,109 423 Updated Jul 28, 2024