# Gemma-2-2B-IT Fine-Tuning with LoRA
This project fine-tunes the Gemma-2-2B-IT model using LoRA (Low-Rank Adaptation) for Question Answering tasks, leveraging the Wikitext-2 dataset. The fine-tuning process is optimized for efficient training on limited GPU memory by freezing most model parameters and applying LoRA to specific layers.
## Project Overview
- Model: Gemma-2-2B-IT, a causal language model.
- Dataset: Wikitext-2, used for text generation and causal language modeling.
- Training Strategy: LoRA adaptation for low-resource fine-tuning.
- Frameworks: Hugging Face `transformers`, `peft`, and `datasets` (see the loading sketch below).
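
As a reference point, a minimal loading sketch with the Hugging Face stack is shown below; the exact checkpoint ID and Wikitext-2 configuration name are assumptions and are not taken from the training script.

```python
# Minimal sketch (assumed checkpoint ID and dataset config, not from the training script).
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "google/gemma-2-2b-it"  # assumed Hub ID for Gemma-2-2B-IT
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Wikitext-2 for causal language modeling (raw variant assumed)
dataset = load_dataset("wikitext", "wikitext-2-raw-v1")
```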
## Key Features
- LoRA Configuration (see the configuration sketch below):
  - LoRA is applied to the following projection layers: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, and `down_proj`.
  - LoRA hyperparameters:
    - Rank (`r`): 4
    - LoRA Alpha: 8
    - Dropout: 0.1
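
A minimal `peft` configuration matching the hyperparameters above might look like the following sketch; variable names are illustrative.

```python
# Sketch of a LoraConfig with the rank, alpha, dropout, and target layers listed above.
from peft import LoraConfig, TaskType, get_peft_model

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=4,              # LoRA rank
    lora_alpha=8,     # scaling factor
    lora_dropout=0.1,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)

# model = get_peft_model(model, lora_config)   # wraps the frozen base model
# model.print_trainable_parameters()           # only the LoRA weights remain trainable
```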
- Training Configuration (see the `TrainingArguments` sketch below):
  - Mixed precision (`fp16`) enabled for faster and more memory-efficient training.
  - Gradient accumulation over 32 steps to manage the large model on small GPUs.
  - Batch size of 1 due to GPU memory constraints.
  - Learning rate: `5e-5`, with weight decay: `0.01`.
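
In `transformers`, these settings roughly correspond to the following sketch; the output directory is a placeholder.

```python
# Sketch of TrainingArguments mirroring the settings above; output_dir is a placeholder.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./gemma2-2b-it-lora",   # placeholder path
    per_device_train_batch_size=1,      # batch size of 1 for limited GPU memory
    gradient_accumulation_steps=32,     # effective batch size of 32
    learning_rate=5e-5,
    weight_decay=0.01,
    fp16=True,                          # mixed-precision training
)
```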
## System Requirements

- GPU: Required for efficient training. This script was tested with CUDA-enabled GPUs.
- Python Packages: Install dependencies with:

```bash
pip install -r requirements.txt
```
## Notes

- This fine-tuned model leverages LoRA to adapt the large Gemma-2-2B-IT model with minimal trainable parameters, allowing fine-tuning even on hardware with limited memory.
- The fine-tuned model can be further used for tasks like Question Answering and is optimized for resource-efficient deployment (see the inference sketch below).
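
For example, a hedged inference sketch (the adapter path and prompt are placeholders, not part of this repository):

```python
# Sketch: load the base model plus a saved LoRA adapter and generate an answer.
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b-it")
model = PeftModel.from_pretrained(base_model, "./gemma2-2b-it-lora")  # placeholder adapter path
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b-it")

prompt = "Question: What is Low-Rank Adaptation?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```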
## Memory Usage

- The training script prints CUDA memory summaries before and after training to monitor GPU memory consumption (see the sketch below).
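
The pattern is roughly the following sketch; the commented trainer call stands in for the actual training loop.

```python
# Sketch of the memory-monitoring pattern: print CUDA summaries around training.
import torch

if torch.cuda.is_available():
    print(torch.cuda.memory_summary())  # memory state before training

# trainer.train()  # the actual fine-tuning step in the training script

if torch.cuda.is_available():
    print(torch.cuda.memory_summary())  # memory state after training
```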