Starred repositories


[TMLR 2022] High-Modality Multimodal Transformer

Python 110 7 Updated Nov 2, 2024
Python 3 Updated Jan 4, 2025

[ICLR'24] Matcher: Segment Anything with One Shot Using All-Purpose Feature Matching

Python 462 26 Updated Dec 16, 2024

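Tianjin University PhD/Master's thesis LaTeX template, revised to meet the 2021 requirements; runs directly on Overleaf. :star: A thesis written with it was successfully submitted to the Tianjin University Library for archiving! (2021.12.24)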

TeX 345 62 Updated Aug 26, 2022

Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use.

Jupyter Notebook 782 40 Updated Jan 4, 2025

Out-of-the-box (OOTB) GUI Agent for Windows and macOS

Python 1,103 98 Updated Jan 1, 2025

A Python library for performing calculations in the Dempster-Shafer theory of evidence.

Python 136 51 Updated May 16, 2021

💻 A curated list of papers and resources for multi-modal Graphical User Interface (GUI) agents.

390 23 Updated Jan 5, 2025

[MICCAI 2023 Early Accept & MedIA submission] EyeMost: "Reliable Multimodality Eye Disease Screening via Mixture of Student's t Distributions"

Python 19 Updated Dec 11, 2024

The project page of paper: Trusted Multi-View Classification [ICLR'2021 paper]

Python 241 45 Updated Sep 24, 2024

What do we learn from inverting CLIP models?

Python 46 3 Updated Mar 6, 2024

Create Customized Software using Natural Language Idea (through LLM-powered Multi-Agent Collaboration)

Python 26,087 3,266 Updated Dec 30, 2024

Environments, tools, and benchmarks for general computer agents

Python 187 17 Updated Oct 23, 2024
JavaScript 24 1 Updated Apr 16, 2024

Sokoban environment for OpenAI Gym

Python 334 79 Updated Nov 8, 2023
Python 11 Updated Jun 9, 2020

[ECCV 2024 Oral] Code for paper: An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models

Python 320 12 Updated Jan 4, 2025

MedRG: Medical Report Grounding with Multi-modal Large Language Model

Python 4 Updated Dec 11, 2024

[ICML 2024] Official implementation for "HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding"

Python 75 1 Updated Dec 4, 2024

ID-like Prompt Learning for Few-Shot Out-of-Distribution Detection

Python 22 2 Updated May 8, 2024

PyTorch implementation of "Test-time Adaption against Multi-modal Reliability Bias".

Python 32 Updated Dec 24, 2024

This is a summary of research on noisy correspondence. There may be omissions. If anything is missing please get in touch with us. Our emails: [email protected] [email protected] qinyang.gm…

49 8 Updated Dec 23, 2024

Quality-aware multimodal fusion (ICML 2023)

Python 82 6 Updated Dec 2, 2024