Official code for "What Makes for Good Visual Tokenizers for Large Language Models?".

TencentARC/GVT

Latest commit: fead7e3 by gzhiwang, Jun 27, 2023

GVT: Good Visual Tokenizer for LLMs

This repo contains the assets for our paper, "What Makes for Good Visual Tokenizers for Large Language Models?".

Model

We provide model-related details in gvt.

GVTBench

We provide the Object Counting (OC) and Multi-Class Identification (MCI) tasks, built on the MS-COCO and VCR datasets, in GVTBench.
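
As a rough illustration of how predictions on tasks like OC and MCI could be scored, here is a minimal sketch. It assumes each sample reduces to a short predicted answer compared against a reference answer. The `accuracy` helper, the exact-match metric, and the sample answers are all hypothetical and are not taken from GVTBench; check the GVTBench directory for the actual data format and evaluation protocol.

```python
from typing import Iterable


def accuracy(predictions: Iterable[str], answers: Iterable[str]) -> float:
    """Fraction of predictions that exactly match the reference answer.

    Hypothetical helper: answers are compared after trimming whitespace
    and lowercasing. This is an assumed metric, not GVTBench's own.
    """
    preds = [p.strip().lower() for p in predictions]
    refs = [a.strip().lower() for a in answers]
    if len(preds) != len(refs):
        raise ValueError("predictions and answers must have the same length")
    if not refs:
        return 0.0
    correct = sum(p == r for p, r in zip(preds, refs))
    return correct / len(refs)


# Made-up OC-style answers (object counts as strings): 2 of 3 match.
oc_score = accuracy(["2", "3", "1"], ["2", "4", "1"])

# Made-up MCI-style answers (yes/no): both match after normalization.
mci_score = accuracy(["Yes", "no "], ["yes", "no"])
```

Normalizing before comparison keeps casing or stray whitespace in generated text from counting as an error; a real evaluation may need stricter answer parsing.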

Acknowledgement

Our work is built on VLMo, LAVIS, EVA, and Vicuna.

Thanks for their great work!

Citation

If you find this work useful, please cite:

@misc{wang2023gvt,
      title={What Makes for Good Visual Tokenizers for Large Language Models?}, 
      author={Guangzhi Wang and Yixiao Ge and Xiaohan Ding and Mohan Kankanhalli and Ying Shan},
      year={2023},
      eprint={2305.12223},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
