LLM Train Tutorial for me.

Set the following environment variables in the .venv file:

  • WANDB_API_KEY: your Weights & Biases (wandb) API key
  • DISCORD_WEBHOOK_URL: URL of the Discord webhook
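
Before submitting a job, it can help to confirm that both variables are actually visible in the shell. The snippet below is a minimal sketch, not part of this repository: `check_env` is a hypothetical helper name, and it assumes the variables have already been exported into the current environment (how the .venv file gets loaded is not specified here).

```shell
# Hypothetical helper: list any required variable that is unset or empty.
# Returns 0 when both are set, 1 otherwise.
check_env() {
  missing=0
  for name in WANDB_API_KEY DISCORD_WEBHOOK_URL; do
    # Indirect lookup through eval keeps this POSIX sh compatible.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name"
      missing=1
    fi
  done
  return "$missing"
}
```

Calling `check_env` prints one line per missing variable, so it can be used as a guard such as `check_env && qsub ...`.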

First, build the Singularity image:

module load singularitypro
singularity build --fakeroot llm-train.sif llm-train.def

You can then run either single-node or multi-node training.

  1. Single-Node
qsub -g <your_ABCI_group> train.sh
  2. Multi-Node
qsub -g <your_ABCI_group> multinode_train.sh
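
The two submission commands differ only in the job script, so the choice can be sketched as a small wrapper. This is an illustration under assumptions: `submit_cmd` and its `single`/`multi` mode names are hypothetical and not defined in this repository. It only prints the `qsub` command so you can inspect it before running it.

```shell
# Hypothetical wrapper: pick the job script by mode and print the qsub command.
# Usage: submit_cmd <your_ABCI_group> [single|multi]
submit_cmd() {
  group=$1
  mode=${2:-single}   # defaults to single-node training
  case "$mode" in
    single) script=train.sh ;;
    multi)  script=multinode_train.sh ;;
    *) echo "unknown mode: $mode" >&2; return 2 ;;
  esac
  echo "qsub -g $group $script"
}
```

For example, `submit_cmd my_group multi` prints `qsub -g my_group multinode_train.sh`, where `my_group` stands in for your ABCI group.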