This is a repo to track the latest autoregressive visual generation papers.

lxa9867/Awesome-Autoregressive-Visual-Generation

Awesome-Autoregressive-Visual-Generation

Image Tokenizers

  1. Neural Discrete Representation Learning Paper, NeurIPS 2017
  2. Generating Diverse High-Fidelity Images with VQ-VAE-2 Paper, NeurIPS 2019
  3. Taming Transformers for High-Resolution Image Synthesis Paper, CVPR 2021
  4. Autoregressive Image Generation using Residual Quantization Paper, CVPR 2022
  5. * BEIT V2: Masked Image Modeling with Vector-Quantized Visual Tokenizers (for understanding) Paper, Arxiv 2022
  6. Vector-quantized Image Modeling with Improved VQGAN Paper, ICLR 2022
  7. MoVQ: Modulating Quantized Vectors for High-Fidelity Image Generation Paper, NeurIPS 2022
  8. * PeCo: Perceptual Codebook for BERT Pre-training of Vision Transformers (for understanding) Paper, AAAI 2023
  9. * All in Tokens: Unifying Output Space of Visual Tasks via Soft Token (for understanding) Paper, CVPR 2023
  10. Regularized Vector Quantization for Tokenized Image Synthesis Paper, CVPR 2023
  11. Towards Accurate Image Coding: Improved Autoregressive Image Generation with Dynamic Vector Quantization Paper, CVPR 2023
  12. Not All Image Regions Matter: Masked Vector Quantization for Autoregressive Image Generation Paper, CVPR 2023
  13. SPAE: Semantic Pyramid AutoEncoder for Multimodal Generation with Frozen LLMs Paper, NeurIPS 2023
  14. HQ-VAE: Hierarchical Discrete Representation Learning with Variational Bayes Paper, TMLR 2024
  15. Finite Scalar Quantization: VQ-VAE Made Simple Paper, ICLR 2024
  16. Planting a SEED of Vision in Large Language Model Paper, ICLR 2024
  17. Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation Paper, ICLR 2024
  18. Rethinking the Objectives of Vector-Quantized Tokenizers for Image Synthesis Paper, CVPR 2024
  19. Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction Paper, NeurIPS 2024
  20. An Image is Worth 32 Tokens for Reconstruction and Generation Paper, NeurIPS 2024
  21. Image Understanding Makes for A Good Tokenizer for Image Generation Paper, NeurIPS 2024
  22. Scaling the Codebook Size of VQGAN to 100,000 with a Utilization Rate of 99% Paper, Arxiv 2024
  23. Quantised Global Autoencoder: A Holistic Approach to Representing Visual Data Paper, Arxiv 2024
  24. VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation Paper, Arxiv 2024
  25. Open-MAGVIT2: An Open-Source Project Toward Democratizing Auto-Regressive Visual Generation Paper, Arxiv 2024
  26. MaskBit: Embedding-free Image Generation via Bit Tokens Paper, Arxiv 2024
  27. Image and Video Tokenization with Binary Spherical Quantization Paper, Arxiv 2024
  28. Cosmos Tokenizer: A suite of image and video neural tokenizers Website
  29. Adaptive Length Image Tokenization via Recurrent Allocation Paper, Arxiv 2024
  30. RandAR: Decoder-only Autoregressive Visual Generation in Random Orders Paper, Arxiv 2024
  31. Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective Paper, Arxiv 2024
  32. MUSE-VL: Modeling Unified VLM through Semantic Discrete Encoding Paper, Arxiv 2024
  33. TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation Paper, Arxiv 2024
  34. XQ-GAN: An Open-source Image Tokenization Framework for Autoregressive Generation Paper, Arxiv 2024
  35. ImageFolder: Autoregressive Image Generation with Folded Tokens 🚀 Paper, Arxiv 2024
  36. Taming Scalable Visual Tokenizer for Autoregressive Image Generation Paper, Arxiv 2024
  37. Language-Guided Image Tokenization for Generation Paper, Arxiv 2024
  38. Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation Paper, Arxiv 2024
  39. Scaling Image Tokenizers with Grouped Spherical Quantization Paper, Arxiv 2024
  40. Spectral Image Tokenizer Paper, Arxiv 2024

AutoRegressive Image Generation

  1. Conditional Image Generation with PixelCNN Decoders Paper, NeurIPS 2016
  2. DiVAE: Photorealistic Images Synthesis with Denoising Diffusion Decoder Paper
  3. Vector Quantized Diffusion Model for Text-to-Image Synthesis Paper
  4. MaskGIT: Masked Generative Image Transformer Paper
  5. BEIT: BERT Pre-Training of Image Transformers Paper
  6. BEIT V2: Masked Image Modeling with Vector-Quantized Visual Tokenizers Paper
  7. MAGE: MAsked Generative Encoder to Unify Representation Learning and Image Synthesis Paper
  8. Sequential Modeling Enables Scalable Learning for Large Vision Models Paper, Arxiv 2023
  9. 4M: Massively Multimodal Masked Modeling Paper, NeurIPS 2023
  10. Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation Paper, Arxiv 2024
  11. ControlVAR: Exploring Controllable Visual Autoregressive Modeling Paper, Arxiv 2024
  12. Autoregressive Image Generation without Vector Quantization Paper, Arxiv 2024
  13. MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis Paper, Arxiv 2024
  14. ANOLE: An Open, Autoregressive, Native Large Multimodal Models for Interleaved Image-Text Generation Paper, Arxiv 2024
  15. VAR-CLIP: Text-to-Image Generator with Visual Auto-Regressive Modeling Paper, Arxiv 2024
  16. Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining Paper, Arxiv 2024
  17. Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model Paper, Arxiv 2024
  18. Scalable Autoregressive Image Generation with Mamba Paper, Arxiv 2024
  19. Show-o: One Single Transformer to Unify Multimodal Understanding and Generation Paper, Arxiv 2024
  20. DART: Denoising Autoregressive Transformer for Scalable Text-to-Image Generation Paper, Arxiv 2024
  21. Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens Paper, Arxiv 2024
  22. Customize Your Visual Autoregressive Recipe with Set Autoregressive Modeling Paper, Arxiv 2024
  23. M-VAR: Decoupled Scale-wise Autoregressive Modeling for High-Quality Image Generation Paper, Arxiv 2024
  24. MMAR: Towards Lossless Multi-Modal Auto-Regressive Probabilistic Modeling Paper, Arxiv 2024
  25. Randomized Autoregressive Visual Generation Paper, Arxiv 2024
  26. Elucidating the Design Space of Language Models for Image Generation Paper, Arxiv 2024
  27. Toward Guidance-Free AR Visual Generation via Condition Contrastive Alignment Paper, Arxiv 2024
  28. CART: Compositional Auto-Regressive Transformer for Image Generation Paper, Arxiv 2024
  29. CoDe: Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient Paper, Arxiv 2024
  30. X-Prompt: Towards Universal In-Context Image Generation in Auto-Regressive Vision Language Foundation Models Paper, Arxiv 2024
  31. JetFormer: An Autoregressive Generative Model of Raw Images and Text Paper, Arxiv 2024
  32. Infinity: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis Paper, Arxiv 2024
  33. Liquid: Language Models are Scalable Multi-modal Generators Paper, Arxiv 2024
  34. FlowAR: Scale-wise Autoregressive Image Generation Meets Flow Matching Paper, Arxiv 2024
