This repository contains the code, data, and experiments for fine-tuning the Llama2 7B language model using LoRA (Low-Rank Adaptation) and QLoRA (Quantized LoRA) techniques. The experiments cover several NLP evaluation tasks, including TruthfulQA MC1, TruthfulQA MC2, Arithmetic 2ds, Arithmetic 4ds, BLiMP Causative, and MMLU Global Facts.
The main objectives of this project are:
- Evaluate the performance of the Llama2 7B model when fine-tuned using LoRA and QLoRA techniques.
- Explore the impact of different configurations, such as rank sizes, alphas, optimization algorithms, and quantization formats, on the model's performance across various NLP tasks.
- Investigate the trade-offs between computational efficiency, memory usage, and performance when employing LoRA and QLoRA techniques.
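The rank and alpha settings mentioned above control the size and scaling of the low-rank update that LoRA learns on top of each frozen weight matrix. A minimal, self-contained sketch of that update (illustrative numbers only; real adapters are trained, not random):

```python
import torch

d, r, alpha = 4096, 8, 16        # hidden size, LoRA rank, LoRA alpha
W = torch.randn(d, d)            # frozen pretrained weight: never updated
A = torch.randn(r, d) * 0.01     # trainable down-projection
B = torch.zeros(d, r)            # trainable up-projection, zero-initialized so the adapter starts as a no-op

def lora_forward(x: torch.Tensor) -> torch.Tensor:
    # Effective weight is W + (alpha / r) * B @ A; only A and B receive gradients,
    # so the adapter adds 2*d*r trainable parameters instead of d*d.
    return x @ W.T + (alpha / r) * ((x @ A.T) @ B.T)

x = torch.randn(1, d)
print(lora_forward(x).shape)     # torch.Size([1, 4096])
print(f"adapter params: {2 * d * r:,} vs full matrix: {d * d:,}")
```

QLoRA applies the same low-rank update while keeping the frozen base weights in a 4-bit quantized format, which is where the additional memory savings come from.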
The experiments are conducted using the Hugging Face Transformers library and the Alpaca 52k cleaned dataset. The results of the experiments are available in the /outputs directory.
This project can serve as a reference for fine-tuning large language models with LoRA and QLoRA on different datasets and for evaluating their performance across various NLP tasks.
- Python 3.10+
- PyTorch 1.12.1
- Hugging Face Transformers 4.26.0
- NVIDIA GPU (RTX A6000, L40, A100 80GB PCIe, or RTX 6000 Ada Generation)
- Clone the repository:
git clone https://github.com/akkky02/LLM_Instruction_Tuning.git
cd LLM_Instruction_Tuning
- Create a virtual environment and install the required packages:
python3 -m venv myenv
source myenv/bin/activate
pip install -r requirements.txt
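Before launching a run, it can help to confirm that the GPU and the key libraries are visible from the virtual environment. The snippet below is a small sanity check (it assumes torch and transformers are pulled in by requirements.txt). Note that the Llama 2 weights on the Hub are gated, so you will also need to accept Meta's license and authenticate (for example with `huggingface-cli login`) before the model can be downloaded.

```python
import torch
import transformers

# Print the versions and GPU that the training and evaluation scripts will see.
print("transformers:", transformers.__version__)
print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```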
To train the Llama2 7B model using LoRA or QLoRA, run the following command:
python3 finetune.py \
--model_name_or_path "meta-llama/Llama-2-7b-hf" \
--dataset "yahma/alpaca-cleaned" \
--num_train_epochs 1 \
--lora_r 8 \
--lora_alpha 16 \
--lora_dropout 0.05 \
--target_modules 'q_proj' 'v_proj' \
--report_to "wandb" \
--run_name "llama2_7b_base_adapter" \
--output_dir "./experiments" \
--per_device_train_batch_size 1 \
--gradient_accumulation_steps 128 \
--learning_rate 3e-4 \
--weight_decay 0.01 \
--do_train \
--warmup_steps 100 \
--optimizer "AdamW" \
--logging_steps 1 \
--save_strategy "steps" \
--save_steps 25 \
--save_total_limit 3 \
--push_to_hub \
--hub_model_id "MAdAiLab/llama2_7b_base_adapter" \
--hub_strategy "checkpoint"
Replace the command-line arguments with the settings for your experiment, such as the dataset, base model, LoRA rank, alpha, quantization format, and optimization algorithm.
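Under the hood, a command like the one above corresponds to the standard Hugging Face peft recipe: load the base model (optionally 4-bit quantized for QLoRA), attach a LoRA adapter to the chosen projection modules, and train only the adapter weights. The following is a minimal sketch of that setup using the peft and bitsandbytes APIs; it is illustrative, not a copy of finetune.py, and the flag-to-argument mapping and defaults shown here are assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "meta-llama/Llama-2-7b-hf"

# 4-bit NF4 quantization of the frozen base weights (the QLoRA variant);
# drop quantization_config below for plain LoRA in 16-bit.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# LoRA adapter mirroring the flags above: r=8, alpha=16, dropout=0.05 on q_proj/v_proj.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```

From here the adapted model can be passed to a transformers Trainer (or trl's SFTTrainer) together with the tokenized Alpaca data and the hyperparameters shown above.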
After training, use the following command to evaluate the model's performance on various NLP tasks:
MODEL_NAME="meta-llama/Llama-2-7b-hf"
ADAPTER="MAdAiLab/llama2_7b_base_adapter"
python3 merge_eval.py --model_name "$MODEL_NAME" --adapter "$ADAPTER"
lm_eval --model hf \
--model_args "pretrained=${ADAPTER}_merged_final" \
--tasks truthfulqa_mc1,truthfulqa_mc2,arithmetic_2ds,arithmetic_4ds,blimp_causative,mmlu_global_facts \
--device cuda:0 \
--batch_size auto:4 \
--output_path "./outputs/${ADAPTER}_merged_final" \
--log_samples
Adjust the --tasks argument based on the specific tasks you want to evaluate.
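merge_eval.py produces the ${ADAPTER}_merged_final model evaluated above. If you need to perform that merge manually, the peft API supports it directly; the sketch below is illustrative (the output path mirrors the naming in the commands above, but the script's exact behavior may differ).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = "meta-llama/Llama-2-7b-hf"
adapter = "MAdAiLab/llama2_7b_base_adapter"

model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(model, adapter)  # attach the trained LoRA adapter
merged = model.merge_and_unload()                  # fold the low-rank update into the base weights

# Save the merged model and tokenizer under the same "_merged_final" naming used by lm_eval above.
merged.save_pretrained(f"{adapter}_merged_final")
AutoTokenizer.from_pretrained(base_model).save_pretrained(f"{adapter}_merged_final")
```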
To run these experiments on the Runpod platform, follow these steps:
- Sign up for a Runpod account at https://runpod.io.
- Create a new Runpod instance and select the desired GPU configuration (e.g., NVIDIA RTX A6000, NVIDIA L40, NVIDIA A100 80GB PCIe, or NVIDIA RTX 6000 Ada Generation).
- Once your Runpod instance is ready, follow the steps in the "Installation" section to set up the environment.
- Clone this repository into your Runpod instance and follow the "Usage" section to perform training and evaluation.
- Remember to terminate the Runpod instance after completing your experiments to stop incurring costs.
The results of the experiments are available in the results/ directory. Each subdirectory contains the evaluation metrics for a specific LoRA or QLoRA configuration applied to the Llama2 7B model.
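Each lm_eval run writes its metrics as JSON. A small helper like the one below can be used to skim per-task accuracies across configurations; the exact file names and metric keys depend on the lm-evaluation-harness version, so treat the glob pattern and keys as assumptions to adjust.

```python
import glob
import json

# Walk the results directory and print the per-task accuracy from each harness output file.
for path in sorted(glob.glob("results/**/*.json", recursive=True)):
    with open(path) as f:
        data = json.load(f)
    if "results" not in data:
        continue  # skip sample-level logs and other JSON files
    print(path)
    for task, metrics in data["results"].items():
        acc = metrics.get("acc,none", metrics.get("acc"))
        print(f"  {task}: acc={acc}")
```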
For a detailed report and analysis, please refer to our Evaluation of Llama 2 7b model with LoRA and QLoRA using Huggingface ecosystem.
Contributions are welcome! If you find any issues or have suggestions for improvements, please open an issue or submit a pull request.
This project is licensed under the Apache License 2.0.
- Llama2 7B model - The language model used in these experiments.
- Alpaca 52k cleaned dataset - The dataset used for fine-tuning the model.