
Hands-on Large(?) Language Model from scratch

Tutorial: LLM basics from scratch provides a step-by-step explanation.


how to run

Download Dataset

cd into the data folder:

cd data

Initialize Git LFS for Large Files

git lfs install

Clone the dataset:

git clone https://huggingface.co/datasets/Skylion007/openwebtext

Unzip the dataset:

bash unzip.sh

Convert Data

Go back to the root folder and run the following command:

python convert_data.py

It converts all the .xz files in data/openwebtext/subsets and puts the converted .txt files in the data/extracted folder.
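For a sense of what this step involves, here is a minimal sketch, assuming each .xz file decompresses to plain UTF-8 text; the actual convert_data.py may handle the archives differently and also reports its progress to neetbox:

```python
# Minimal sketch of the .xz -> .txt conversion, NOT the repository's convert_data.py.
# Assumption: each .xz file is a plain LZMA-compressed text stream.
import lzma
from pathlib import Path

src = Path("data/openwebtext/subsets")
dst = Path("data/extracted")
dst.mkdir(parents=True, exist_ok=True)

for xz_file in sorted(src.glob("*.xz")):
    out_file = dst / (xz_file.stem + ".txt")
    with lzma.open(xz_file, "rt", encoding="utf-8", errors="ignore") as fin, \
            out_file.open("w", encoding="utf-8") as fout:
        for line in fin:
            fout.write(line)
```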

We are using neetbox for monitoring. Open localhost:20202 (neetbox's default port) in your browser to check the progress. If you are working on a remote server, you can run ssh -L 20202:localhost:20202 user@remotehost to forward the port to your local machine, or access the server's IP address directly with that port number, and you will see all the running processes:

(Screenshot: neetbox web dashboard showing the conversion progress)

The script will also ask whether you'd like to delete the original .xz files to save disk space. If you want to keep them, type n and press Enter.
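The deletion prompt could look something like this hypothetical sketch (the real script may word the question and handle the answer differently):

```python
# Hypothetical sketch of the optional cleanup step; not the repository's convert_data.py.
from pathlib import Path

answer = input("Delete original .xz files to save disk space? [y/n] ").strip().lower()
if answer == "y":
    for xz_file in Path("data/openwebtext/subsets").glob("*.xz"):
        xz_file.unlink()
```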

train

python train.py --config config/gptv1_s.toml
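The --config flag points train.py at a TOML file holding the run's hyperparameters. As a rough illustration of how such a flag can be wired up (the actual argument parsing in train.py and the keys inside config/gptv1_s.toml are assumptions here):

```python
# Hypothetical sketch of loading a TOML training config; not the repository's train.py.
import argparse
import tomllib  # Python 3.11+; use the third-party "tomli" package on older versions

parser = argparse.ArgumentParser()
parser.add_argument("--config", required=True, help="path to a TOML config file")
args = parser.parse_args()

with open(args.config, "rb") as f:
    config = tomllib.load(f)

print(config)  # e.g. model width, number of layers, context length, learning rate, ...
```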

Since we are using neetbox for monitoring, open localhost:20202 (neetbox's default port) in your browser to check the training progress:

(Screenshot: neetbox web dashboard showing training metrics)

predict

python inference.py --config config/gptv1_s.toml

Open localhost:20202 (neetbox's default port) in your browser and feed text to your model via the action button.

(Screenshot: neetbox action panel used to send a prompt to the model)
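Under the hood, prediction with a GPT-style model is an autoregressive loop: the prompt is encoded into token ids, the model produces logits for the next token, one token is sampled and appended, and the loop repeats. A minimal PyTorch sketch of that loop follows; model, encode, and decode are placeholders here, not the actual objects built by inference.py:

```python
# Minimal sketch of autoregressive sampling for a GPT-style model.
# `model` is assumed to map a (batch, time) tensor of token ids to
# (batch, time, vocab_size) logits; `encode`/`decode` are placeholder tokenizer functions.
import torch

@torch.no_grad()
def generate(model, encode, decode, prompt, max_new_tokens=100, temperature=1.0):
    model.eval()
    ids = torch.tensor([encode(prompt)], dtype=torch.long)      # shape (1, T)
    for _ in range(max_new_tokens):
        logits = model(ids)[:, -1, :] / temperature             # last position's logits
        probs = torch.softmax(logits, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)       # sample one token
        ids = torch.cat([ids, next_id], dim=1)                  # append and continue
    return decode(ids[0].tolist())
```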


further

For more information, see LLM basics from scratch.