Jiang Kai xdjiangkai

27 followers · 27 following

Xidian University
Shaanxi China

Lists (1)

✨ Inspiration

1 repository

Stars

DAMO-NLP-SG / Video-LLaMA

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

Python 2,860 265 Updated Jun 4, 2024

mbzuai-oryx / VideoGPT-plus

Official Repository of paper VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding

Python 237 15 Updated Aug 11, 2024

PKU-YuanGroup / Video-LLaVA

[EMNLP 2024] Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

Python 3,099 220 Updated Dec 3, 2024

meta-llama / llama

Inference code for Llama models

Python 57,080 9,641 Updated Aug 18, 2024

OpenRobotLab / EmbodiedScan

[CVPR 2024 & NeurIPS 2024] EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI

Python 520 38 Updated Dec 26, 2024

RUCBM / GUICourse

GUICourse: From General Vision Language Models to Versatile GUI Agents

Python 95 6 Updated Jul 17, 2024

uni-medical / GMAI-MMBench

GMAI-MMBench: A Comprehensive Multimodal Evaluation Benchmark Towards General Medical AI.

34 1 Updated Dec 17, 2024

open-compass / VLMEvalKit

Open-source evaluation toolkit for large vision-language models (LVLMs), supporting 160+ VLMs and 50+ benchmarks

Python 1,612 230 Updated Jan 3, 2025

lucidrains / AMIE-pytorch

Implementation of the general framework for AMIE, from the paper "Towards Conversational Diagnostic AI", out of Google DeepMind

Python 54 Updated Sep 16, 2024

uni-medical / GMAI-VL

GMAI-VL & GMAI-VL-5.5M: A Large Vision-Language Model and A Comprehensive Multimodal Dataset Towards General Medical AI.

58 Updated Nov 27, 2024

microsoft / LLaVA-Med

Large Language-and-Vision Assistant for Biomedicine, built towards multimodal GPT-4 level capabilities.

Python 1,654 202 Updated Aug 13, 2024

epfLLM / meditron

Meditron is a suite of open-source medical Large Language Models (LLMs).

Python 1,925 174 Updated Apr 10, 2024

epfLLM / Megatron-LLM

distributed trainer for LLMs

Python 554 79 Updated May 20, 2024

njucckevin / SeeClick

The model, data and code for the visual GUI Agent SeeClick

HTML 274 13 Updated Nov 22, 2024

kyegomez / Med-PaLM

Towards Generalist Biomedical AI

Python 338 51 Updated Feb 17, 2024

OS-Copilot / OS-Atlas

OS-ATLAS: A Foundation Action Model For Generalist GUI Agents

221 8 Updated Nov 19, 2024

ant-8 / GUI-Grounding-via-Iterative-Narrowing

Code for paper: Improved GUI Grounding via Iterative Narrowing

Jupyter Notebook 6 Updated Dec 20, 2024

sarashs / FPGA_AGI

Jupyter Notebook 8 1 Updated Aug 1, 2024

QwenLM / Qwen-VL

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Python 5,271 403 Updated Aug 7, 2024

OSU-NLP-Group / GUI-Agents-Paper-List

Building a comprehensive and handy list of papers for GUI agents

Python 147 7 Updated Jan 4, 2025

mnotgod96 / AppAgent

AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.

Python 5,312 578 Updated Aug 8, 2024

Jingtao-Li-CVer / UniADRS

This is an official implementation of "Learning a Cross-Modality Anomaly Detector for Remote Sensing Imagery" (TIP 2024)

11 Updated Dec 21, 2024

baaivision / tokenize-anything

[ECCV 2024] Tokenize Anything via Prompting

Jupyter Notebook 554 24 Updated Dec 11, 2024

academicpages / academicpages.github.io

GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes

JavaScript 12,880 44,815 Updated Dec 30, 2024

TianxingChen / Embodied-AI-Guide

A beginner's guide to embodied AI (Embodied-AI-Guide)

1,054 50 Updated Jan 5, 2025

silverling / xdwlan-login

Login helper for the Xidian University campus network, with support for automatic login and launch at startup.

Rust 10 1 Updated Oct 22, 2024

utkuozbulak / pytorch-cnn-visualizations

PyTorch implementation of convolutional neural network visualization techniques

Python 7,916 1,490 Updated Jan 1, 2025

jingyi0000 / VLM_survey

Collection of AWESOME vision-language models for vision tasks

2,678 227 Updated Dec 3, 2024

maxin-cn / Awesome-Autoregressive-Visual-Generation-Models

a collection of awesome autoregressive visual generation models

57 Updated Dec 29, 2024

ChaofanTao / Autoregressive-Models-in-Vision-Survey

The paper collections for the autoregressive models in vision.

343 12 Updated Dec 27, 2024
