This repository contains the code and instructions for fine-tuning the Phi-2 language model using QLoRA (Quantized Low-Rank Adaptation) for generating journal entries based on personal notes.
Phi-2 Journal Fine-tuning with QLoRA applies parameter-efficient fine-tuning to the task of turning personal notes into journal entries. By fine-tuning the Phi-2 language model with QLoRA, the goal is a model that understands the context of personal notes and generates meaningful journal entries.
The LoRA method proposed by Hu et al. decomposes the weight update, ΔW, into a low-rank representation. To be precise, LoRA never explicitly computes ΔW; instead, it learns the decomposed representation of ΔW directly during training, which is where the savings come from, as shown in the figure below.
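As a minimal sketch of the idea: instead of storing a full `d × k` update ΔW, LoRA trains two small matrices `B` (`d × r`) and `A` (`r × k`) whose product approximates ΔW. The dimensions below are illustrative, and the initialization (A random, B zero) follows the convention in the LoRA paper.

```python
import torch

d, k, r = 768, 768, 8           # layer dimensions and LoRA rank (illustrative)

W = torch.randn(d, k)           # frozen pretrained weight, never updated
A = torch.randn(r, k) * 0.01    # trainable low-rank factor
B = torch.zeros(d, r)           # trainable low-rank factor, zero-initialized

x = torch.randn(k)              # an input activation

# A full update would store delta_W = B @ A with d*k entries, but LoRA only
# ever stores and trains A and B, i.e. r*(d+k) parameters.
h = W @ x + (B @ A) @ x         # forward pass with the low-rank correction
```

With `r = 8` and `d = k = 768`, the trainable parameters drop from 589,824 to 12,288, which is where the memory and compute savings come from.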
- Preparing Data
- Setting Up GPU Environment
- Loading Dataset
- Loading Base Model
- Tokenization
- Setting Up LoRA
- Running Training
- Trying the Trained Model
Before starting the fine-tuning process, ensure your data is formatted correctly. The dataset should consist of JSONL files containing input-output pairs or structured data suitable for training. Use the provided script to preprocess your data into the required format.
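For illustration, a JSONL dataset of input-output pairs might be produced like the sketch below. The field names `note` and `entry` are placeholders, since the actual schema depends on your preprocessing script.

```python
import json

# Hypothetical examples; replace with your own notes and journal entries.
examples = [
    {"note": "Coffee with Sam, talked about the move.",
     "entry": "This morning I met Sam for coffee and we talked through the move..."},
    {"note": "Long run in the rain, felt great.",
     "entry": "I ran ten kilometers in the rain today and, surprisingly, loved it..."},
]

with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")   # one JSON object per line (JSONL)
```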
Utilize a GPU environment for training the model efficiently. Instructions are provided for setting up the environment using Brev.dev, which offers GPU instances suitable for deep learning tasks.
Load the training and evaluation datasets using the `datasets` library. Ensure the data is formatted properly and define a formatting function that structures each training example as a prompt.
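A minimal sketch, assuming the JSONL files and `note`/`entry` fields from the data-preparation step; adjust the file names and prompt template to your own data.

```python
from datasets import load_dataset

# File names here are assumptions; point these at your preprocessed JSONL files.
train_dataset = load_dataset("json", data_files="train.jsonl", split="train")
eval_dataset = load_dataset("json", data_files="eval.jsonl", split="train")

def formatting_func(example):
    # Turn a note/entry pair into a single training prompt.
    return (f"### Note:\n{example['note']}\n"
            f"### Journal Entry:\n{example['entry']}")
```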
Load the Phi-2 base model with 8-bit quantization enabled for memory-efficient training. This step initializes the model for fine-tuning.
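A sketch of loading the model with `transformers` and `bitsandbytes` 8-bit quantization; `microsoft/phi-2` is the public Phi-2 checkpoint on the Hugging Face Hub.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 8-bit quantization keeps the frozen base weights small in GPU memory.
bnb_config = BitsAndBytesConfig(load_in_8bit=True)

model = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-2",              # public Phi-2 checkpoint on the HF Hub
    quantization_config=bnb_config,
    device_map="auto",              # place layers on the available GPU(s)
    torch_dtype=torch.float16,
)
```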
Set up the tokenizer for the input data. Choose the maximum input length based on the distribution of example lengths in the dataset, then tokenize with padding and truncation as necessary.
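A sketch reusing the `formatting_func` and datasets from above; the `MAX_LENGTH` value is an assumption you should derive from your own length distribution.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-2")
tokenizer.pad_token = tokenizer.eos_token   # Phi-2 has no dedicated pad token

MAX_LENGTH = 512   # illustrative; pick from your dataset's length distribution

def tokenize(example):
    result = tokenizer(
        formatting_func(example),
        truncation=True,
        max_length=MAX_LENGTH,
        padding="max_length",
    )
    result["labels"] = result["input_ids"].copy()   # causal-LM targets
    return result

tokenized_train = train_dataset.map(tokenize, remove_columns=train_dataset.column_names)
tokenized_eval = eval_dataset.map(tokenize, remove_columns=eval_dataset.column_names)
```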
Prepare the model for fine-tuning with Low-Rank Adaptation (LoRA). Configure the LoRA settings, such as the rank (`r`) and scaling factor (`alpha`), to control the number of trainable parameters and how strongly the fine-tuned update is weighted, as in the sketch below.
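A sketch using the `peft` library; `r` and `lora_alpha` are illustrative starting points rather than tuned values, and the `target_modules` list is an assumption based on the names of Phi-2's attention projection layers.

```python
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model = prepare_model_for_kbit_training(model)  # cast norms, enable input grads

lora_config = LoraConfig(
    r=16,                       # rank of the low-rank update matrices
    lora_alpha=32,              # scaling factor applied to the LoRA update
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    # Assumed module names for Phi-2's attention layers; verify against the model.
    target_modules=["q_proj", "k_proj", "v_proj", "dense"],
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()   # sanity check: only a small fraction trains
```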
Initiate the training process using the configured model, datasets, and training arguments. Monitor the training progress, evaluate the model, and save checkpoints at specified intervals.
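A training sketch with the `transformers` `Trainer`; all hyperparameters below are illustrative and should be tuned for your dataset and GPU. (`eval_strategy` was named `evaluation_strategy` in older `transformers` releases.)

```python
import transformers

trainer = transformers.Trainer(
    model=model,
    train_dataset=tokenized_train,
    eval_dataset=tokenized_eval,
    args=transformers.TrainingArguments(
        output_dir="./phi2-journal-finetune",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        learning_rate=2.5e-5,
        max_steps=500,
        logging_steps=25,
        eval_strategy="steps",          # evaluate on a fixed step cadence
        eval_steps=50,
        save_steps=50,                  # checkpoint at the same cadence
        fp16=True,
    ),
    data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False),
)

model.config.use_cache = False   # silences a warning; re-enable for inference
trainer.train()
```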
Load the trained model from the best-performing checkpoint directory. Test it by providing sample prompts and generating journal entries from the fine-tuned model's output.
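An inference sketch using `peft` to load the adapter on top of the base model; the checkpoint path is hypothetical, so point it at your best checkpoint.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-2", device_map="auto", torch_dtype=torch.float16
)
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-2")

# Hypothetical checkpoint directory; replace with your best-performing one.
model = PeftModel.from_pretrained(base_model, "./phi2-journal-finetune/checkpoint-500")
model.eval()

prompt = "### Note:\nQuiet Sunday, baked bread, called Mom.\n### Journal Entry:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=200, repetition_penalty=1.1)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```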