AI content generated on a daily basis by Diego Marinho.
A page for recording good AI material to read later.
- The Llama 3 Herd of Models (92 pages)
- mcdse-2b: multilingual embedding model (visual document retrieval)
- Qwen2-VL 7B & 2B
- Microsoft Open Source Phi 3.5 Mini, Phi 3.5 MoE, Phi 3.5 Vision
- RAG and RAU: A Survey on Retrieval-Augmented LM in NLP (arXiv PDF)
- LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs (github.io, University of Waterloo)
- Llama 3 8-bit quantization with bitsandbytes
- Mistral 7B with DPO (notebook example)
- Google Research Guidelines for Finetuning
- LLMs Datasets (GitHub list)
- The Tome (GitHub)
- Fine Tome 100k (GitHub)
- Unsloth library (Llama 3.1 and others): easy, free fine-tuning.
- 2023-10-23 AutoGen: a framework for simplifying the orchestration, optimisation and automation of LLM workflows.
- 2024-07-24 LAMBDA: Multi-agent Data Analysis System (Code, Announcement, Paper)
- ShieldGemma: Generative AI Content Moderation Based on Gemma (Google LLC, 31-07-2024)
- 2024-08-21 Fine-tune your GPT-4o
- 2024-08-02 Stable Fast 3D Released
- OpenVLM Leaderboard (Hugging Face board)
- VLM Architectures (GitHub)
- Exploring the Potential of Vision-Language Models (VLMs): A Comprehensive Guide
- An Introduction to VLMs (arXiv PDF)
- LivePortrait
- Flux HF, Announcement, Flux Online
- Imagen 3 (Paper): best score on text-to-image alignment and second in visual appeal, behind Midjourney v6.
- Stable Video Diffusion img2vid-xt-1-1
- Moondream (GitHub)
- 2024-08-08 Parler-TTS, Online: A lightweight TTS that can generate high-quality, natural-sounding speech in the style of a given speaker (gender, pitch, speaking style, etc.).
- Giving a Voice to Your Graph: Representing Structured Data for LLMs (Google Research, ICML'24)