Re-evaluate breaking-IID experiments from checkpoints #23

Closed
mcw92 opened this issue Jan 7, 2025 · 0 comments · Fixed by #20
Labels: experiments (Tasks related to experiments to run)

mcw92 commented Jan 7, 2025

Additional metrics for breaking-IID series

As we decided to analyze a variety of performance metrics in addition to plain accuracy, we need to re-evaluate all 16-node breaking-IID experiments from their respective model checkpoints to obtain the local and global confusion matrices.

Relevant scripts are:

  • Actual Python script to run: ./scripts/examples/evaluate_from_checkpoint_breaking_iid.py
  • SLURM utility function to obtain the correct checkpoint path for each parameter combination, starting from a given base path: find_checkpoint_dir_and_uuid in ./specialcouscous/utils/slurm.py
  • Script to generate the job scripts: ./scripts/experiments/generate_parallel_evaluation_from_breaking_iid_ckpt_job_scripts.py
  • Path to the actual results on HoreKa: ${BASEDIR}/results/ (BASEDIR corresponds to our workspace)
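
For illustration, here is a minimal sketch of the kind of per-rank re-evaluation this involves, assuming each rank has a pickled (sub-)forest checkpoint and its own test split. The checkpoint layout, data loading, and the global aggregation below are illustrative assumptions, not the logic of evaluate_from_checkpoint_breaking_iid.py (which also relies on find_checkpoint_dir_and_uuid to resolve the real checkpoint paths):

```python
# Illustrative sketch only: paths, checkpoint format, and data loading are
# hypothetical; the actual evaluation lives in
# ./scripts/examples/evaluate_from_checkpoint_breaking_iid.py.
import pathlib
import pickle

import numpy as np
from mpi4py import MPI
from sklearn.metrics import confusion_matrix

comm = MPI.COMM_WORLD
rank = comm.rank

# Assumption: one pickled (sub-)forest checkpoint and one test split per rank.
checkpoint_path = pathlib.Path(f"checkpoints/rank_{rank}_classifier.pickle")
with open(checkpoint_path, "rb") as handle:
    model = pickle.load(handle)

x_test = np.load(f"data/rank_{rank}_x_test.npy")  # hypothetical test features
y_test = np.load(f"data/rank_{rank}_y_test.npy")  # hypothetical test labels
n_classes = comm.allreduce(int(y_test.max()) + 1, op=MPI.MAX)

# Local confusion matrix from the rank-local model's predictions.
y_pred = model.predict(x_test)
local_cm = confusion_matrix(y_test, y_pred, labels=np.arange(n_classes))

# "Global" matrix here is simply the element-wise sum of the local matrices;
# the actual script may define the global confusion matrix differently,
# e.g., from the shared global model's predictions.
global_cm = np.empty_like(local_cm)
comm.Allreduce(local_cm, global_cm, op=MPI.SUM)

if rank == 0:
    print("Global confusion matrix:\n", global_cm)
```

Run across the 16 ranks (e.g., via srun in a SLURM job script), this would yield one local confusion matrix per rank plus the aggregated global matrix on rank 0.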
mcw92 added the experiments label on Jan 7, 2025
mcw92 self-assigned this on Jan 7, 2025
mcw92 linked a pull request on Jan 7, 2025 that will close this issue
mcw92 closed this as completed in #20 on Jan 8, 2025