- [2023.09.22] The previous model links are deprecated; please download the checkpoints from here.
- [2023.03.14] We build X-GPT, a multi-modal conversational demo built on X-Decoder using GPT-3 and LangChain!
- [2023.03.14] Feel free to explore the full-resolution demo video!
This project started with our previous two demos, all-in-one and instruct x-decoder. We were then inspired by visual-chatgpt, developed by our MSRA colleagues, to use LangChain to power a conversational X-Decoder that encompasses all the capabilities of our single X-Decoder model (a sketch of this tool-based design follows the feature list below).
Our X-GPT has several unique and new features:
- It uses a SINGLE X-Decoder model to support a wide range of vision and vision-language tasks. As such, you do not need separate models for individual tasks!
- It delivers state-of-the-art segmentation performance, much better than CLIPSeg and other existing open-vocabulary segmentation systems!
- It also supports text-to-image retrieval. You can choose to find a real image from your own pool or ask to generate a new one!
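Under the hood, X-GPT follows the same pattern as visual-chatgpt: each X-Decoder capability is registered as a LangChain tool, and GPT-3 decides which tool to invoke at each conversational turn. Below is a minimal sketch of that pattern, not the repo's actual code: the tool names and the `run_segmentation` / `retrieve_image` helpers are hypothetical placeholders (the real wiring lives in `xgpt.py`), and it assumes an early-2023 LangChain API plus an Azure deployment named `text-davinci-003`.

```python
from langchain.agents import initialize_agent, Tool
from langchain.chains.conversation.memory import ConversationBufferMemory
from langchain.llms import AzureOpenAI

# Hypothetical wrappers around the single X-Decoder checkpoint; both tools
# would route to the SAME model, which is the point of the design.
def run_segmentation(query: str) -> str:
    """Would run open-vocabulary segmentation and return an output image path."""
    return "image/segmented.png"  # placeholder result

def retrieve_image(query: str) -> str:
    """Would rank images in image_pool/ by text-image similarity."""
    return "image_pool/best_match.png"  # placeholder result

tools = [
    Tool(name="Segment In Image", func=run_segmentation,
         description="Segments the objects named in a text phrase within an image."),
    Tool(name="Retrieve Image", func=retrieve_image,
         description="Finds the image in the local pool that best matches a text query."),
]

# "text-davinci-003" is an assumed Azure deployment name, not prescribed by the repo.
llm = AzureOpenAI(deployment_name="text-davinci-003", temperature=0)
memory = ConversationBufferMemory(memory_key="chat_history")
agent = initialize_agent(tools, llm, agent="conversational-react-description",
                         memory=memory, verbose=True)
agent.run("Find me a photo of a dog, then segment the dog.")
```

Because every tool routes to one checkpoint, adding a new capability is a matter of registering one more `Tool`, not loading another model.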
Next, we will:
- Integrate the more grounded InstructPix2Pix we developed with the support of X-Decoder into X-GPT!
# set up environment
conda create -n xgpt python=3.8
conda activate xgpt
# install dependencies
pip3 install torch==1.13.1 torchvision==0.14.1 --extra-index-url https://download.pytorch.org/whl/cu113
python -m pip install 'git+https://github.com/MaureenZOU/detectron2-xyz.git'
pip install git+https://github.com/cocodataset/panopticapi.git
python -m pip install -r requirements.txt
sh install_cococapeval.sh
# download x-decoder tiny model
wget https://projects4jw.blob.core.windows.net/x-decoder/release/xdecoder_focalt_last_novg.pt
wget https://projects4jw.blob.core.windows.net/x-decoder/release/xdecoder_focalt_vqa.pt
# create folders for image uploading and image retrieval; image_pool is for retrieval, image is the upload cache
mkdir image && mkdir image_pool
# export openai key, please follow https://langchain.readthedocs.io/en/latest/modules/llms/integrations/azure_openai_example.html
export OPENAI_API_TYPE=azure
export OPENAI_API_VERSION=2022-12-01
export OPENAI_API_BASE=https://your-resource-name.openai.azure.com
export OPENAI_API_KEY=<your Azure OpenAI API key>
# Simply run this single line and enjoy it!!!
python xgpt.py
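If `xgpt.py` cannot reach Azure OpenAI, a quick way to verify that the exported variables are picked up is to instantiate LangChain's `AzureOpenAI` wrapper directly. This is only a sanity-check sketch under assumptions: `deployment_name` must match a completion deployment in your own Azure resource, and `text-davinci-003` is an assumed name rather than something the repo prescribes.

```python
# Sanity check for the Azure OpenAI setup above; assumes the OPENAI_API_*
# variables are exported in the current shell.
from langchain.llms import AzureOpenAI

# NOTE: replace "text-davinci-003" with the name of a deployment that
# actually exists in your Azure resource.
llm = AzureOpenAI(deployment_name="text-davinci-003")
print(llm("Reply with the single word: ready"))
```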
We are highly inspired by visual-chatgpt in the usage of LangChain, and we build on top of many great open-source projects: HuggingFace Transformers, LangChain, Stable Diffusion, and InstructPix2Pix.
@article{zou2022xdecoder,
author = {Zou, Xueyan and Dou, Zi-Yi and Yang, Jianwei and Gan, Zhe and Li, Linjie and Li, Chunyuan and Dai, Xiyang and Wang, Jianfeng and Yuan, Lu and Peng, Nanyun and Wang, Lijuan and Lee, Yong Jae and Gao, Jianfeng},
title = {Generalized Decoding for Pixel, Image and Language},
publisher = {arXiv},
year = {2022},
}
For issues using X-GPT, please open a GitHub issue or contact Jianwei Yang ([email protected]).