In this project, we would like to check some common properties of the loss function during training on deep learning tasks. In theory, we usually assume the loss to be convex and smooth, but that might not be the case for deep neural networks. For simplicity, let us denote the model parameters at iteration $t$ by $x_t$. We track the following quantities:
- Convexity gap: we compute the additive convexity gap at every iterate as $f(x_t) - f(y) - \langle \nabla f(x_t), x_t - y \rangle$, where $x_t$ is the current iterate and $y$ is some reference point. We then report the average of this quantity over each epoch (convexity implies this gap is non-positive).
- Smoothness: we compute the smoothness constant $L = \|\nabla f(x_t) - \nabla f(y)\| / \|x_t - y\|$, where $x_t$ is the current iterate and $y$ is some reference point. We then report the maximum $L$ of each epoch.
- Ratio: we also compute the multiplicative convexity gap $\langle \nabla f(x_t), x_t - y \rangle / (f(x_t) - f(y))$. We then report the sum of the numerators divided by the sum of the denominators in each epoch (the function is "well-behaved" if this ratio stays a positive constant). A minimal sketch of how these quantities can be computed is given right after this list.
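As a concrete illustration, here is a minimal PyTorch sketch of how the three quantities could be accumulated over one epoch. The toy loss `f`, the helper `loss_and_grad`, and the choice of reference point `y` are placeholders and are not taken from the project code; the actual implementation in new_clm.py may differ.

```python
import torch

def f(x):
    # Toy non-convex loss standing in for the real training loss.
    return torch.sum(x ** 2) + 0.1 * torch.sum(torch.sin(3 * x))

def loss_and_grad(x):
    # Return f(x) and its gradient at x.
    x = x.detach().requires_grad_(True)
    loss = f(x)
    (grad,) = torch.autograd.grad(loss, x)
    return loss.detach(), grad

# Reference point y (e.g., the iterate at the start of the epoch).
y = torch.zeros(10)
f_y, g_y = loss_and_grad(y)

gaps, L_max = [], 0.0
num_sum, den_sum = 0.0, 0.0

# Pretend these are the iterates x_t visited during one epoch.
for t in range(100):
    x_t = torch.randn(10) * (1.0 - t / 100)
    f_x, g_x = loss_and_grad(x_t)

    inner = torch.dot(g_x, x_t - y)            # <grad f(x_t), x_t - y>
    gaps.append((f_x - f_y - inner).item())    # additive convexity gap
    L = (torch.norm(g_x - g_y) / torch.norm(x_t - y)).item()
    L_max = max(L_max, L)                      # running smoothness estimate
    num_sum += inner.item()                    # ratio numerator
    den_sum += (f_x - f_y).item()              # ratio denominator

print("avg convexity gap:", sum(gaps) / len(gaps))
print("max smoothness L :", L_max)
print("ratio            :", num_sum / den_sum)
```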
- For BU SCC (Boston University Shared Computing Cluster):
Before installing additional packages, we need to set up a virtual environment: create it with `python3 -m venv <env_name>`, then activate it with `source <env_name>/bin/activate`. Next, load the existing modules:
module load python3 pytorch cuda
To install the rest of the packages, go to the appropriate project directory and run `pip install -r requirements.txt`. If any packages are still missing, keep running `pip install <package_name>` until there are no errors left.
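As a quick sanity check that the environment and modules are set up correctly (this snippet is just a suggestion and not part of the project code), you can verify that PyTorch is importable and sees the GPU:

```python
import torch

# Should print the installed PyTorch version and True if CUDA is visible.
print(torch.__version__)
print(torch.cuda.is_available())
```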
To run new_clm.py on the SCC, we can use the following commands:
module load python3 cuda pytorch
source /projectnb/aclab/tranhp/venv/mynewenv/bin/activate
python /projectnb/aclab/tranhp/test_properties/transformers/new_clm.py --dataset_name the_pile --model_name_or_path gpt2 --streaming True --output_dir /projectnb/aclab/tranhp/test_properties/transformers/examples/pytorch/language-modeling/pile_1e-5/ --num_train_epochs 50 --checkpointing_steps epoch --name pile8_1e-5_900000 --weight_decay 0.01 --learning_rate 1e-5 --max_train_steps 1000000 --max_step 1000001
The results are logged to new_transformer_project in the optimizedlearning wandb workspace. To resume from a checkpoint, add the following arguments (change the last three arguments according to the current step):
python /projectnb/aclab/tranhp/test_properties/transformers/new_clm.py --dataset_name the_pile --model_name_or_path gpt2 --streaming True --output_dir /projectnb/aclab/tranhp/test_properties/transformers/examples/pytorch/language-modeling/pile_1e-5/ --num_train_epochs 50 --checkpointing_steps epoch --name pile8_1e-5_900000 --weight_decay 0.01 --learning_rate 1e-5 --max_train_steps 1000000 --max_step 1000001 --resume_from_checkpoint /projectnb/aclab/tranhp/test_properties/transformers/examples/pytorch/language-modeling/pile_1e-5/0_900000 --resume_from_checkpoint_torch /projectnb/aclab/tranhp/test_properties/transformers/examples/pytorch/language-modeling/pile_1e-5/0_900000.pth.tar --starting_step 900000
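For reference, resuming in this style typically amounts to reloading the model and optimizer state that was serialized at the checkpointing step and continuing from the recorded step. The sketch below uses made-up names (`save_checkpoint`, `load_checkpoint`, `ckpt.pth.tar`) and is only an illustration; the exact contents of the checkpoints written by new_clm.py may differ.

```python
import torch

def save_checkpoint(model, optimizer, step, ckpt_path):
    # Serialize everything needed to continue training from `step`.
    torch.save(
        {
            "step": step,
            "model_state": model.state_dict(),
            "optimizer_state": optimizer.state_dict(),
        },
        ckpt_path,
    )

def load_checkpoint(model, optimizer, ckpt_path):
    # Restore the saved states and return the step to resume from.
    ckpt = torch.load(ckpt_path, map_location="cpu")
    model.load_state_dict(ckpt["model_state"])
    optimizer.load_state_dict(ckpt["optimizer_state"])
    return ckpt["step"]

# Example usage with a toy model:
model = torch.nn.Linear(4, 4)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
save_checkpoint(model, optimizer, step=900000, ckpt_path="ckpt.pth.tar")
start_step = load_checkpoint(model, optimizer, "ckpt.pth.tar")
```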