GVT: Good Visual Tokenizer for LLMs

This repository contains the assets for our paper, "What Makes for Good Visual Tokenizers for Large Language Models?"

Model

Model details are provided in the gvt directory.

GVTBench

We provide the Object Counting (OC) and Multi-Class Identification (MCI) tasks, built on the MS-COCO and VCR datasets, in the GVTBench directory.

Acknowledgement

Our work is built on VLMo, LAVIS, EVA, and Vicuna.

Thanks for their great work!

Citation

If you find this work useful, please cite:

@misc{wang2023gvt,
      title={What Makes for Good Visual Tokenizers for Large Language Models?}, 
      author={Guangzhi Wang and Yixiao Ge and Xiaohan Ding and Mohan Kankanhalli and Ying Shan},
      year={2023},
      eprint={2305.12223},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
