doing random stuff with neural networks. This is my journey so far:
- A failed experiment with LISA: "Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning", code, paper
- 🛠️ Memory-efficient LLM training with GaLore, which projects gradients into a low-rank subspace instead of adding adapter weights, code
- ⚖️ Evaluating LLMs with semantic similarity (contrasted with BLEU in a sketch after this list), code
- 🛠️ Finetune TinyLlama and StableLM 2, code
- 🛠️ Finetune Microsoft's Phi-2, code
- 🛠️ Finetune Mamba, code
- 🛠️ Finetune Llama2 and Mistral using QLoRA (a minimal setup is sketched after this list), code
- ⚖️ Evaluate LLM language capabilities with Meta's Belebele benchmark, code
- ⚖️ Evaluate LLM language capabilities with BLEU, code
- ⚖️ Llama2-70B as a judge of LLMs performs almost as well as GPT-4, code
- ⚖️ Validation loss is not a good metric for chatbot quality
- ⚖️ Use GPT-3.5 as a judge of open-source LLMs (see the judge sketch after this list), code
- 🛠️ Finetune Llama on podcast transcripts with QLoRA, code
- 🎨 Use Stable Diffusion for sketch-guided image generation, code
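
Several of the finetuning entries above use QLoRA: load the base model in 4-bit and train only small LoRA adapters on top of it. Here is a minimal sketch of that setup, assuming the Hugging Face `transformers`, `peft`, and `bitsandbytes` packages; the model id and hyperparameters are illustrative, not the exact values from the linked notebooks.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# Load the base model with 4-bit NF4 quantization (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",  # illustrative; any causal LM works
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach small trainable LoRA adapters; the quantized base stays frozen.
lora_config = LoraConfig(
    r=16,             # adapter rank (illustrative)
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of weights train
```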
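
The two evaluation entries contrast n-gram overlap with embedding similarity. A minimal sketch, assuming `sacrebleu` and `sentence-transformers`; the example sentences are made up to show why BLEU can punish a correct answer that semantic similarity accepts.

```python
import sacrebleu
from sentence_transformers import SentenceTransformer, util

reference = "The capital of France is Paris."
prediction = "Paris is France's capital city."

# BLEU: n-gram overlap with the reference, low here despite a correct answer.
bleu = sacrebleu.corpus_bleu([prediction], [[reference]])
print(f"BLEU: {bleu.score:.1f}")

# Semantic similarity: cosine similarity of sentence embeddings,
# high because the two sentences mean the same thing.
model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode([reference, prediction], convert_to_tensor=True)
print(f"cosine similarity: {util.cos_sim(embeddings[0], embeddings[1]).item():.2f}")
```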
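
The judge experiments (GPT-3.5, GPT-4, Llama2-70B) all follow the same pattern: prompt a strong model to grade another model's answer. A minimal sketch using the `openai` client; the rubric and the 1-10 scale are illustrative, not the exact prompts from the linked notebooks.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

JUDGE_PROMPT = """\
You are grading a chatbot's answer. Rate it from 1 (useless) to 10 (perfect)
for correctness and helpfulness. Reply with the number only.

Question: {question}
Answer: {answer}"""

def judge(question: str, answer: str, model: str = "gpt-3.5-turbo") -> int:
    # One grading call per question/answer pair; temperature 0 for stable scores.
    response = client.chat.completions.create(
        model=model,
        temperature=0,
        messages=[
            {"role": "user",
             "content": JUDGE_PROMPT.format(question=question, answer=answer)}
        ],
    )
    return int(response.choices[0].message.content.strip())

print(judge("What is 2 + 2?", "4"))
```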